Evaluating advanced 3D integration for SRAM-centric inference accelerators
Supervisors: Erwan Lenormand, Debjyoti Bhattacharjee
The increasing deployment of generative-AI workloads across edge and cloud environments is intensifying the need for higher inference throughput under stringent power and cost constraints. For LLMs, GPU-based inference performance is commonly bounded by memory bandwidth, because repeatedly transferring large weight tensors from DRAM creates a von Neumann bottleneck. This has motivated the development of SRAM-centric inference accelerators, such as IBM’s NorthPole, Groq’s LPU, and Cerebras’ WSE, which store model parameters on chip to reduce data movement. However, the limited SRAM capacity available per chip means that model parameters must be partitioned across multiple devices, resulting in expensive systems. 3D integration could mitigate this limitation by enabling substantially higher SRAM capacity per chip, thereby improving performance, energy efficiency, and cost effectiveness for SRAM-centric architectures.
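The two constraints above can be illustrated with a back-of-the-envelope sketch. It assumes a simplified model in which every weight is read once per generated token, so the decode rate is bounded by memory bandwidth divided by model size, and the number of chips is the model size divided by the SRAM per chip, rounded up. The function names and all numeric values are hypothetical and chosen only for illustration; they do not describe any specific device.

```python
import math

def decode_tokens_per_second(param_count, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on single-stream decode rate, assuming every weight
    is read from memory exactly once per generated token."""
    model_bytes = param_count * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes

def chips_needed(param_count, bytes_per_param, sram_bytes_per_chip):
    """Minimum number of chips whose combined SRAM can hold all weights."""
    model_bytes = param_count * bytes_per_param
    return math.ceil(model_bytes / sram_bytes_per_chip)

# Illustrative assumptions (not measurements of any real system):
PARAMS = 8 * 10**9           # 8 billion parameters
BYTES_PER_PARAM = 1          # 8-bit weights
DRAM_BANDWIDTH = 10**12      # 1 TB/s off-chip memory bandwidth
SRAM_PER_CHIP = 200 * 10**6  # 200 MB of on-chip SRAM

print(decode_tokens_per_second(PARAMS, BYTES_PER_PARAM, DRAM_BANDWIDTH))  # 125.0
print(chips_needed(PARAMS, BYTES_PER_PARAM, SRAM_PER_CHIP))               # 40
# Hypothetical 4x increase in SRAM per chip, e.g. through 3D stacking:
print(chips_needed(PARAMS, BYTES_PER_PARAM, 4 * SRAM_PER_CHIP))           # 10
```

Under these assumed numbers, quadrupling the SRAM per chip reduces the chip count from 40 to 10. The sketch ignores activations, KV-cache storage, batching, and inter-chip communication, all of which a real PPA analysis would need to account for.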
imec is developing advanced 3D integration technologies. This project aims to establish a reference compute platform to assess how such 3D technologies could affect the performance, power, and area (PPA) of SRAM-centric inference accelerators. The platform will be assembled by leveraging open-source hardware and software components. In the first phase, open-source hardware IP will be modified to explore alternative architectural configurations. In the second phase, representative inference workloads will be deployed and benchmarked on the platform. In the final phase, a PPA analysis will be conducted across selected workloads to quantify the impact and identify the most promising design trade-offs.
What will you do?
• Model the instruction memory and control flow of the accelerator
• Develop an interface for integrating the AI inference accelerator using the SST simulation framework
• Initialize and execute workloads on the accelerator using a RISC-V core (modelled in gem5)
What skills do you need to apply?
• Experience with RTL design. Prior experience with Chisel is a plus.
• Familiarity with Python.
• Enthusiasm for artificial intelligence and compute architecture.
What skills will you acquire?
• Familiarity with technology/hardware co-optimization.
• Experience with design of inference accelerators.
• Ability to follow a scientific approach to tackling research problems.
The project will be supervised by researchers at imec in the Compute System Architecture department. imec is a world-renowned research center for nano-electronics and digital technologies, based in Leuven, Belgium. The student is expected to work full time from the imec campus in Leuven, Belgium, for a period of 6 months, beginning as soon as possible.
Contact: Erwan.Lenormand@imec.be, debjyoti.bhattacharjee@imec.be