Performance modeling of dynamically optimized scale out AI inference

Internship/thesis - Leuven

Perform at the edge of the AI industry, bridging the gap between chips and software

With AI finding its place in our daily lives, AI inference is expected to dominate future workloads. Its adoption and success depend not only on model accuracy, but also on the performance of inference calculations on the available hardware, which determines whether AI is economically viable, accessible, and useful.
Recently, it has been shown that AI inference costs have dropped roughly 10x (e.g. "LLM inference prices have fallen rapidly but unequally across tasks", Epoch AI). The exact factor depends heavily on the use case, but it is clear that optimizations are effective. One such domain of optimization is dynamic inference serving: optimizing the workload across multiple inference steps and across concurrent inferences from many users.

This internship focuses on introducing, exploring, and improving dynamic inference optimizations within the context of a broader performance-modeling tool for AI datacenters.
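As a minimal illustration of the kind of first-order performance modeling involved, the sketch below estimates how batching requests from many users improves decode throughput for a dense LLM. It is a toy roofline-style model under stated assumptions; all hardware and model numbers are illustrative placeholders, not figures from imec's tool or any real system.

```python
# Toy first-order model of batched LLM decode steps (roofline-style).
# Assumptions (illustrative, not measured): a 7B-parameter dense model in
# 2-byte weights, 1 TB/s memory bandwidth, 300 TFLOP/s compute.

def decode_step_latency(batch_size, params=7e9, bytes_per_param=2,
                        mem_bw=1e12, flops=300e12):
    """Latency (s) of one decode step for the whole batch.

    Each decode step streams all weights once, shared across the batch,
    and performs ~2 FLOPs per parameter per sequence in the batch.
    """
    mem_time = params * bytes_per_param / mem_bw       # weight traffic
    compute_time = 2 * params * batch_size / flops     # matmul work
    return max(mem_time, compute_time)                 # whichever dominates

def throughput(batch_size, **kw):
    """Aggregate generated tokens per second across the batch."""
    return batch_size / decode_step_latency(batch_size, **kw)
```

Under these assumptions a single-user decode step is memory-bound (weight streaming dominates), so serving many users in one batch raises aggregate tokens/s almost linearly until the step becomes compute-bound. Real dynamic serving systems extend this idea with scheduling across inference steps, KV-cache traffic, and heterogeneous request lengths.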

We look forward to meeting interested candidates, ideally with demonstrable experience or insight in performance modeling and analysis in the context of computer systems and AI.

 

Master's degree: Master of Engineering Science, Master of Engineering Technology, Master of Science

Required educational background: Computer Science, Electrotechnics/Electrical Engineering

Duration: 3-12 months

For more information or application, please contact the supervising scientists Timon Evenblij (timon.evenblij@imec.be) and Wenzhe Guo (wenzhe.guo@imec.be). 

 

Imec allowance will be provided for students studying at a non-Belgian university.


Who we are
imec's cleanroom
