System Architecture exploration for AI accelerator platforms

Leuven - PhD

Explore the trade-offs between power, performance and accuracy of AI and neural network accelerators, finding the right balance amid a perfect storm of extreme memory and compute requirements and the need for flexibility.


Convolutional and deep neural networks have received much attention and investment from both the research community and industry in recent years, owing to their highly accurate performance on certain classes of machine perception tasks. A wide variety of dedicated accelerators have been designed, in academia as well as in industry (Google TPU, Microsoft Brainwave, Graphcore and Cerebras, to name just a few). This, coupled with the ever-increasing demand for smart systems, is driving the need for continuous improvement in performance while requiring the technology to become cheaper and more energy-efficient, both for cloud-based accelerators and for portable/nomadic applications.

Figure: The tremendous growth in algorithm complexity is driving a hardware gold rush to new accelerator architectures and chipsets. (Source: Eric Chung, Microsoft, AI Hardware Summit 2019)

The tremendous growth in algorithm complexity has resulted in a perfect storm of hardware requirements, combining extremely large memories with massive compute. At the same time, many domain-specific languages and frameworks have emerged (TensorFlow, PyTorch, ONNX, CUDA, etc.) to efficiently implement neural networks on generic CPUs and GPUs and, to a lesser extent, on specific accelerators. This has given rise to new and interesting research problems that require pushing the boundaries of classical architecture design paradigms and co-optimizing them together with circuit design and technology implementation. Although the algorithms typically have regular dataflows that allow massively parallel execution, there is a multitude of options for designing a system that is efficient, performant and at the same time reasonably flexible.
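As a rough illustration of why these memory and compute requirements collide, the short Python sketch below counts the multiply-accumulate operations and the weight/activation footprint of a single convolutional layer. The layer dimensions are illustrative assumptions, not taken from any particular network.

    # Back-of-the-envelope cost model for one convolutional layer.
    # All dimensions are illustrative assumptions, not from a specific network.
    def conv_layer_cost(h, w, c_in, c_out, k, bytes_per_elem=2):
        """MAC count, weight bytes and output-activation bytes for a k x k
        convolution over an h x w x c_in input giving an h x w x c_out output
        (stride 1, 'same' padding assumed, 16-bit operands by default)."""
        macs = h * w * c_out * c_in * k * k  # one MAC per weight per output pixel
        weight_bytes = c_out * c_in * k * k * bytes_per_elem
        act_bytes = h * w * c_out * bytes_per_elem
        return macs, weight_bytes, act_bytes

    # A mid-network layer with ResNet-like dimensions (illustrative).
    macs, wb, ab = conv_layer_cost(h=56, w=56, c_in=256, c_out=256, k=3)
    print(f"MACs: {macs / 1e9:.2f} G, weights: {wb / 1e6:.2f} MB, "
          f"activations: {ab / 1e6:.2f} MB")

Summed over the tens to hundreds of layers in a modern network, and multiplied by the target frame rate, numbers like these push the required compute into the TOPS range while total weight storage can far exceed realistic on-chip SRAM budgets: exactly the perfect storm described above.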

During this PhD you will build architecture models for neural network accelerators and study the mapping of a variety of algorithms onto them, in order to co-optimize architecture, algorithms and eventually circuit design for AI accelerators. The goal is to explore a multitude of architectures: systolic-array-based, analog/mixed-signal and even in-memory-compute-based designs will be considered. You will work with teams that design analog in-memory compute solutions based on emerging memory technologies such as MRAM or RRAM, but your scope will be wider than that: you will look at a full, programmable accelerator that will need a heterogeneous mix of CPU, digital accelerator logic and, possibly, in-memory compute fabrics. The goal is to find the right choices to deal with memory pressure, compute requirements, power efficiency and flexibility. Building the right modelling tools will be an important part of the work, as this will allow fast and efficient exploration of the design space, and possibly automated network and hardware architecture search.
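To give a flavour of the kind of modelling tool this PhD would develop, the sketch below estimates, to first order, how many cycles a weight-stationary systolic array needs for a matrix multiplication by tiling the operands over the array. The array size, tile-loading scheme and fully connected layer dimensions are all illustrative assumptions; a real exploration framework would add the memory hierarchy, alternative dataflows and energy models.

    import math

    # First-order model of a weight-stationary systolic array (illustrative).
    def systolic_matmul_cycles(M, K, N, array_dim=64):
        """Cycles for an (M x K) @ (K x N) matmul on an array_dim x array_dim
        weight-stationary systolic array: each weight tile is loaded once
        (one row per cycle), then M activation rows are streamed through it,
        plus a pipeline fill/drain of roughly 2 * array_dim cycles."""
        k_tiles = math.ceil(K / array_dim)  # tiles along the reduction dim
        n_tiles = math.ceil(N / array_dim)  # tiles along the output dim
        cycles_per_tile = array_dim + M + 2 * array_dim
        return k_tiles * n_tiles * cycles_per_tile

    # A fully connected layer as a matmul: at batch 1 the weight reloads
    # dominate, while batching amortizes them over many activation rows.
    for batch in (1, 64):
        cycles = systolic_matmul_cycles(M=batch, K=2048, N=1000)
        print(f"batch={batch:3d}: ~{cycles} cycles")

Even this toy model exposes a central trade-off: weight-reload cost dominates at small batch sizes. Capturing such effects across systolic, digital and in-memory compute fabrics, together with their memory hierarchies, is what a full architecture-exploration framework has to do.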

 

Required background: electrical or microelectronic engineer with a strong background in processor architecture and circuit design concepts, and a strong interest in novel compute paradigms

Type of work: 70% architecture modelling and exploration, 20% neural network design and optimization, 10% literature study

Supervisor: Marian Verhelst

Daily advisor: Peter Debacker

The reference code for this position is 2020-056. Mention this reference code on your application form.
Chinese nationals who wish to apply for the CSC scholarship should use the following code when applying for this topic: CSC2020-23.

