Internship/thesis - Leuven
Perform at the edge of the AI industry, bridging the gap between chips and software
With AI finding its place in our daily lives, AI inference is expected to dominate workloads in the future. Its adoption and success depend not only on accuracy, but also on the performance of inference calculations on the available hardware, which is what makes AI economically viable, accessible, and more useful.
Recently, it has been shown that AI inference costs have dropped 10x (see, e.g., "LLM inference prices have fallen rapidly but unequally across tasks", Epoch AI). This of course depends heavily on the use case, but it is clear that optimizations are effective. One such domain is dynamic AI inference serving: optimizing the workload across multiple inference steps and across inferences from many users.
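To give a flavor of what "optimizing across multiple inference steps and many users" means, here is a toy sketch of continuous batching, one well-known dynamic serving technique. Everything in it (the function name, the request lengths, the batch size) is illustrative and not taken from the imec tool described in this posting.

```python
from collections import deque

def serve_dynamic(requests, max_batch=4):
    """Toy continuous-batching scheduler (illustrative only).

    Each request needs several decode steps. At every step, up to
    max_batch active requests are batched into one hardware pass,
    and finished requests are immediately replaced by queued ones,
    so short and long requests share the same batches.
    """
    queue = deque(requests)   # entries: (request_id, steps_remaining)
    active = []
    batched_steps = 0
    while queue or active:
        # Refill the batch from the queue as slots free up.
        while queue and len(active) < max_batch:
            rid, steps = queue.popleft()
            active.append([rid, steps])
        # One batched decode step advances all active requests at once.
        for req in active:
            req[1] -= 1
        active = [req for req in active if req[1] > 0]
        batched_steps += 1
    return batched_steps

# Four requests of lengths 4+2+3+1 would take 10 steps served one
# after another; batched together they finish in 4 steps.
print(serve_dynamic([("a", 4), ("b", 2), ("c", 3), ("d", 1)]))  # → 4
```

The point of the sketch is the scheduling idea, not the arithmetic: by interleaving requests of different lengths into shared batches, the hardware stays busy and the total number of passes shrinks.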
This internship focuses on introducing, exploring and improving dynamic inference optimizations within the context of a broader performance modeling tool for AI datacenters.
We look forward to meeting interested candidates, ideally ones who can demonstrate experience or insight in performance modeling and analysis in the context of computer systems and AI.
Master's degree: Master of Engineering Science, Master of Engineering Technology, Master of Science
Required educational background: Computer Science, Electrotechnics/Electrical Engineering
Duration: 3-12 months
For more information or application, please contact the supervising scientists Timon Evenblij (timon.evenblij@imec.be) and Wenzhe Guo (wenzhe.guo@imec.be).
Imec allowance will be provided for students studying at a non-Belgian university.