An important part of making the interaction with an LLM feel as natural as possible to the user is allowing the model to remember facts that were not part of its original training data but that matter to the user (e.g. the user's name, job, and interests). Ideally, the LLM would remember every fact and conversation it has encountered, but this is impossible due to computational limitations. Nevertheless, there are several mechanisms that help an LLM remember such facts for a specific user, such as short-term memory, long-term memory, and retrieval-augmented generation (RAG).
In this project, we want to extend these approaches by researching continual learning [Wang2024] for large language models. The literature distinguishes three continual learning categories for LLMs [Shi2024], which follow the three main training stages of LLMs:
- Continual Pre-training (CPT)
- Continual Instruction Tuning (CIT)
- Continual Alignment (CA)
In this project, we will work on all three categories, but mainly focus on CPT and CIT, where the LLMs will learn to update their internal facts, forget out-of-date information, and improve their tool usage. Additionally, we will do this while using a limited amount of computational resources (a move towards sustainable AI). One approach to limit the training cost is to use adapters such as LoRA [Hu2022].
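The core idea behind LoRA [Hu2022] can be sketched in a few lines: the pre-trained weight matrix stays frozen, and only a low-rank update is trained. The sketch below uses NumPy with illustrative dimensions (the matrix sizes, `alpha`, and `r` are assumptions, not values from the project); it is a minimal illustration of the parameter savings, not a training implementation.

```python
import numpy as np

# Minimal LoRA sketch (after [Hu2022]): instead of updating a frozen weight
# matrix W (d_out x d_in) directly, we train a low-rank update B @ A with
# rank r << min(d_in, d_out). All shapes here are illustrative.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 4, 8

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def forward(x):
    # Effective weight is W + (alpha / r) * B @ A, applied without ever
    # materializing the full d_out x d_in update matrix.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapted model starts out identical to
# the base model, so training begins from the pre-trained behaviour.
assert np.allclose(forward(x), W @ x)
print(r * (d_in + d_out), "trainable parameters instead of", d_in * d_out)
```

The trainable parameter count scales with r * (d_in + d_out) rather than d_in * d_out, which is what makes adapter-based continual learning attractive under a limited compute budget.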
The project is composed of two large phases. In the first phase, we will explore continual learning for a single AI agent across the different learning categories while minimizing the required computational resources. One of the first steps will be to use CPT to update the model's internal facts and forget out-of-date information. This will then be extended to CIT, where continual learning is combined with reinforcement learning (RL) based methods (e.g. Search-R1 [Jin2025]) to improve the performance of agents in settings where a verifiable reward is available. An important additional consideration is that models trained with reinforcement learning can become overconfident as a side effect [Kalai2025]. So, when combining RL with continual learning, we need to make sure that hallucinations due to overconfidence are not increased unintentionally.
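The "verifiable reward" mentioned above can be made concrete with a small sketch. The function below is a hypothetical illustration of the simplest case, an exact-match check against a gold answer; it is not the reward used in Search-R1 [Jin2025], and the names and normalization rules are assumptions for this example.

```python
import re

def normalize(text: str) -> str:
    # Lowercase and strip punctuation so trivially different phrasings
    # of the same answer still match.
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

def verifiable_reward(model_answer: str, gold_answer: str) -> float:
    # 1.0 only on an exact (normalized) match, 0.0 otherwise: a binary,
    # automatically checkable signal suitable for RL fine-tuning.
    return 1.0 if normalize(model_answer) == normalize(gold_answer) else 0.0

print(verifiable_reward("Antwerp.", "antwerp"))   # 1.0
print(verifiable_reward("Brussels", "antwerp"))   # 0.0
```

Note that such a binary reward gives no credit for abstaining or expressing uncertainty, which is one mechanism by which RL-trained models can drift towards overconfident guessing [Kalai2025]; any reward design in this project would need to account for that.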
The second phase moves from continual learning for a single AI agent towards continual learning for agentic AI with multiple agents. Here, we will look at how continual learning within an agentic system affects the performance of the system as a whole, and at how we can leverage continual learning to update multiple models at the same time. In line with our goal of making continual learning computationally efficient, we will explore the relatively new concept of Small Language Models (SLMs) [Belcak2025], where small, highly focused models are combined to create an efficient and well-performing agentic system. This requires a focus on agentic architectures that are efficient but also well suited to continual learning.
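The shape of such an SLM-based agentic system can be sketched schematically. Everything below is hypothetical: the agent names, the keyword-based router, and the stubbed specialists stand in for small fine-tuned models and are not from [Belcak2025] or an existing framework.

```python
from typing import Callable

# Stubbed specialists: in a real system each would be a small, highly
# focused language model that can be continually updated on its own.
def fact_agent(query: str) -> str:
    return f"[fact-SLM] answering: {query}"

def tool_agent(query: str) -> str:
    return f"[tool-SLM] executing: {query}"

ROUTES: dict[str, Callable[[str], str]] = {
    "lookup": fact_agent,
    "execute": tool_agent,
}

def route(query: str) -> str:
    # A real router would itself be a small classifier model; a keyword
    # match stands in for it here. The point of the architecture is that
    # continual learning can update one specialist at a time without
    # retraining the whole system.
    for keyword, agent in ROUTES.items():
        if keyword in query.lower():
            return agent(query)
    return fact_agent(query)  # default specialist

print(route("lookup: who founded imec?"))
```

The architectural question for this phase is then how to keep such a composition efficient while individual specialists, and the router itself, are being continually updated.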
This research will be performed with two main use cases in mind. First, we will focus on applying our research advancements in a media context, where there is a need for accurate information that can change rapidly. This provides an ideal setting to explore our advancements in continual learning for single agents and agentic systems. Second, our work on efficient agentic architectures, where agents use continual learning to update their models, provides a very interesting case for workload estimation. This focus allows us to provide research value not only to the academic community and the media sector, but also to the core interests of imec.
The research of this PhD position will be conducted in the IDLab research group, which is embedded in imec and the University of Antwerp in Belgium.
Imec was founded in 1984 and is today the world’s largest independent research and innovation centre for nanoelectronics and digital technology. Our worldwide team of 6,000 scientists, engineers, and innovators from over 100 countries is driven by a shared passion to push boundaries. With expertise across nanoelectronics, AI, and digital technologies, our people are at the heart of every breakthrough.
The University of Antwerp is a young, dynamic and forward-thinking university with a strong mission and vision. The university has over 23,000 students and nearly 7,000 staff members. The University of Antwerp is one of the top young universities in the world, ranking #12 on the Times Higher Education Young University Ranking in 2024, and #168 overall in 2025.
References:
[Belcak2025] Belcak, P., Heinrich, G., Diao, S., Fu, Y., Dong, X., Muralidharan, S., ... & Molchanov, P. (2025). Small Language Models are the Future of Agentic AI. arXiv preprint arXiv:2506.02153.
[Hu2022] Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., ... & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. ICLR.
[Jin2025] Jin, B., Zeng, H., Yue, Z., Yoon, J., Arik, S., Wang, D., ... & Han, J. (2025). Search-R1: Training LLMs to reason and leverage search engines with reinforcement learning. arXiv preprint arXiv:2503.09516.
[Kalai2025] Kalai, A.T., Nachum, O., Vempala, S.S., & Zhang, E. (2025). Why Language Models Hallucinate. arXiv preprint arXiv:2509.04664.
[Shi2024] Shi, H., Xu, Z., Wang, H., Qin, W., Wang, W., Wang, Y., ... & Wang, H. (2024). Continual learning of large language models: A comprehensive survey. ACM Computing Surveys.
[Wang2024] Wang, L., Zhang, X., Su, H., & Zhu, J. (2024). A comprehensive survey of continual learning: Theory, method and application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(8), 5362-5383.