Evaluating advanced 3D integration for SRAM-centric inference accelerators

Master internship - Leuven

Explore how advanced 3D integration could reshape AI inference systems.

Supervisors: Erwan Lenormand, Debjyoti Bhattacharjee


Context
The increasing deployment of generative-AI workloads across edge and cloud environments is intensifying the need for higher inference throughput under stringent power and cost constraints. For LLMs, GPU-based inference performance is commonly bounded by memory bandwidth, as repeated transfers of large weight tensors from DRAM create a von Neumann bottleneck. This has motivated the development of SRAM-centric inference accelerators, such as IBM’s NorthPole, Groq’s LPU, and Cerebras’ WSE, that store model parameters on chip to reduce data movement. However, the limited SRAM capacity available per chip requires partitioning model parameters across multiple devices, resulting in expensive systems. 3D integration could mitigate this limitation by enabling substantially higher SRAM capacity per chip, thereby improving performance, energy efficiency, and cost-effectiveness for SRAM-centric architectures.
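As a rough illustration of the two effects described above, the sketch below estimates (a) an upper bound on decode rate when every weight must be streamed from memory once per generated token, and (b) how many chips are needed to hold a model entirely in on-chip SRAM. All numbers (model size, bytes per parameter, bandwidth, SRAM capacity) are hypothetical placeholders chosen for illustration, not figures for any real device or for this project, and the model deliberately ignores KV cache, activations, batching, and inter-chip communication.

```python
import math


def memory_bound_tokens_per_s(param_count, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on single-stream decode rate, assuming each weight is
    read from memory exactly once per generated token."""
    model_bytes = param_count * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


def chips_needed(param_count, bytes_per_param, sram_bytes_per_chip):
    """Minimum number of chips required to keep all weights in on-chip SRAM."""
    model_bytes = param_count * bytes_per_param
    return math.ceil(model_bytes / sram_bytes_per_chip)


# Hypothetical parameters, for illustration only.
PARAMS = 8_000_000_000            # 8 billion parameters
BYTES_PER_PARAM = 1               # e.g. 8-bit weights
DRAM_BANDWIDTH = 2_000_000_000_000  # 2 TB/s off-chip bandwidth
SRAM_2D = 256 * 2**20             # 256 MiB of SRAM per chip
SRAM_3D = 4 * SRAM_2D             # assumed 4x capacity from 3D stacking

if __name__ == "__main__":
    rate = memory_bound_tokens_per_s(PARAMS, BYTES_PER_PARAM, DRAM_BANDWIDTH)
    print(f"DRAM-bound decode rate: at most {rate:.0f} tokens/s")
    print(f"Chips needed (2D SRAM): {chips_needed(PARAMS, BYTES_PER_PARAM, SRAM_2D)}")
    print(f"Chips needed (3D SRAM): {chips_needed(PARAMS, BYTES_PER_PARAM, SRAM_3D)}")
```

Under these assumed values, increasing SRAM capacity per chip directly reduces the number of devices the model must be partitioned across, which is the system-cost lever the project aims to quantify.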


imec is developing advanced 3D integration technologies. This project aims to establish a reference compute platform to assess how such 3D technologies could affect the performance, power, and area (PPA) of SRAM-centric inference accelerators. The platform will be assembled by leveraging open-source hardware and software components. In the first phase, open-source hardware IP will be modified to explore alternative architectural configurations. In the second phase, representative inference workloads will be deployed and benchmarked on the platform. In the final phase, a PPA analysis will be conducted across selected workloads to quantify the impact and identify the most promising design trade-offs.


Objectives
• Model the instruction memory and control flow of the accelerator
• Develop an interface for integrating the AI inference accelerator using the SST simulation framework
• Initialize and execute workloads on the accelerator using a RISC-V core (modelled using gem5)
What skills do you need to apply?
• Experience with RTL design. Prior experience with Chisel is a plus.
• Familiarity with Python.
• Enthusiasm for artificial intelligence and compute architecture.
What skills will you acquire?
• Familiarity with technology/hardware co-optimization.
• Experience with the design of inference accelerators.
• Practice in following a scientific approach to tackle research problems.



The project will be supervised by researchers at imec in the Compute System Architecture department. imec is a world-renowned research center for nano-electronics and digital technologies, based in Leuven, Belgium. The student is expected to work full time from the imec campus in Leuven, Belgium, for a period of 6 months, beginning as soon as possible.


Contact: Erwan.Lenormand@imec.be, debjyoti.bhattacharjee@imec.be

Type of internship: Master internship

Duration: 6 months

Required educational background: Computer Science

Supervising scientist(s): For further information or to apply, please contact Debjyoti Bhattacharjee (Debjyoti.Bhattacharjee@imec.be) and Erwan Lenormand (Erwan.Lenormand@imec.be)

The reference code for this position is 2026-INT-015. Mention this reference code in your application.

An imec allowance will be provided for students studying at a non-Belgian university.


Applications should include the following information:

  • resume
  • motivation
  • current study

Incomplete applications will not be considered.