PhD - Leuven
Reinforcement learning (RL) is a key AI paradigm for the development of truly autonomous systems. It allows learning controllers to determine optimal policies from trial-and-error interactions with the controlled system. Over the last few years, reinforcement learning, in combination with deep neural networks, has been shown to be an extremely powerful learning method. This was demonstrated by the well-publicized victory of the AlphaGo program over human champion Go players, an AI feat which was thought to still be years away. Other success stories have seen deep reinforcement learning algorithms reach human-level performance in the StarCraft computer game, learn control of complex robotic systems, and automate climate control of data centers.
While current results are impressive, the deployment of reinforcement learning on autonomous systems like mobile robots still faces major hurdles. State-of-the-art RL controllers rely on massive amounts of compute power for both training and online decision making. This computational demand comes at a large environmental and economic cost. Moreover, relying on large, distributed compute systems prohibits the deployment of reinforcement learning in autonomous systems that have limited amounts of both compute and energy. One possible solution is to rely on custom machine learning accelerators that offer more efficient computation. Novel compute paradigms, such as compute-in-memory approaches, promise to improve energy efficiency by orders of magnitude, while still allowing for high throughput. Running RL on these accelerators, however, will require changes at an algorithmic level.
The focus of this PhD will be the development of deep reinforcement learning algorithms for limited-precision hardware. Initially, research will focus on the deployment of trained control policies in limited-precision settings. This will require running the policies using low bit-width calculations and training them to be robust to the possible errors this loss of precision introduces. In later stages, research will move to model-based RL approaches that use predictive models to directly compute optimal policies on the low-precision hardware. Other opportunities that can be explored include the use of graph neural networks, hierarchical reinforcement learning, and meta reinforcement learning. These techniques have shown great potential in capturing relations in the environment, improving generalization, and speeding up task learning, all of which can strongly benefit the performance of reinforcement learning algorithms on limited-precision hardware.
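To make the "low bit-width calculations" concrete: a common starting point is post-training quantization of a policy network, where weights are rounded to a small signed-integer grid and the policy is evaluated with the resulting loss of precision. The sketch below is illustrative only; the layer sizes, the symmetric 8-bit scheme, and the network itself are assumptions, not part of the project description.

```python
import numpy as np

# A hypothetical two-layer policy network; sizes and weights are illustrative.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 16)).astype(np.float32)
W2 = rng.standard_normal((16, 2)).astype(np.float32)

def quantize(x, bits=8):
    """Symmetric uniform quantization to a signed integer grid."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q.astype(np.int8), scale

def policy_int8(obs):
    """Forward pass using int8 weights, dequantized by their scale factors."""
    q1, s1 = quantize(W1)
    h = np.maximum(obs @ (q1.astype(np.float32) * s1), 0.0)  # ReLU
    q2, s2 = quantize(W2)
    logits = h @ (q2.astype(np.float32) * s2)
    return int(np.argmax(logits))

obs = rng.standard_normal(4).astype(np.float32)
action = policy_int8(obs)
```

Training the policy to be robust to this rounding (e.g. by simulating quantization in the training loop, as in quantization-aware training) is one way to address the precision loss the paragraph above refers to.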
The PhD research will take place at Leuven in collaboration with Antwerp.
The host institution is a world-leading research and innovation hub in nanoelectronics and digital technologies. Its machine learning program is leading the quest for computationally and energy-efficient machine learning accelerators. By leveraging its memory technology, it aims to develop analog compute-in-memory (ACiM) solutions built on emerging non-volatile memory devices. These devices can mitigate the computational challenges of learning algorithms by performing the computations in the memory itself. Compared to classical Von Neumann architectures, in which computations are performed on a central processor after data has been fetched from memory, compute-in-memory approaches promise to increase energy efficiency by orders of magnitude, while at the same time allowing for the required high throughput. The institution's machine learning research is driving the co-evolution of hardware and algorithms needed to facilitate the move to this new computational paradigm.
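The trade-off described above can be illustrated in a few lines: an analog compute-in-memory array performs a matrix-vector product directly in the stored conductances, which avoids data movement but introduces device-level imprecision. The Gaussian noise model below is purely an assumption for illustration, not a description of the actual devices.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 8))   # weights stored as memory-cell conductances
x = rng.standard_normal(8)        # input applied as voltages

# Von Neumann style: fetch W from memory, multiply exactly on a processor.
exact = W @ x

# Analog CiM style: the stored values carry device variability (assumed
# Gaussian here); the product is read out as one current per output.
noisy_W = W + rng.normal(0.0, 0.01, W.shape)
analog = noisy_W @ x

error = np.max(np.abs(analog - exact))
```

Making RL algorithms tolerant to this kind of read-out error is exactly the algorithm-hardware co-design question the paragraph above alludes to.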
Antwerp is a core research group of the institution that has initiated several AI research lines in recent years, building on experience from its first, highly successful AI projects, including a prize in a DARPA challenge on spectrum management. These research lines include, among others, work on reinforcement learning, embodied AI, and resource-aware AI. Within the Flanders AI research program, Antwerp is involved in the second challenge, focusing on edge and tiny AI, while also leading the fourth challenge on Human-like AI.
Required background: Computer Science, Machine Learning
Type of work: 40% algorithm design, 40% experimental, 20% literature
Supervisor: Steven Latré
Daily advisor: Peter Vrancx
The reference code for this position is 2021-066. Mention this reference code on your application form.