An important part of making the interaction with an LLM feel as natural as possible to the user is allowing the model to remember facts that were not part of its original training data but that matter to the user (e.g. the user's name, job, and interests). Ideally, the LLM would remember every fact and conversation it has encountered, but this is impossible due to computational limitations. Nevertheless, there are several mechanisms that help an LLM remember such facts for a specific user, such as short-term memory, long-term memory, and retrieval-augmented generation (RAG).
In this project, we want to extend these approaches by researching continual learning [Wang2024] for large language models. The literature distinguishes three continual learning categories for LLMs [Shi2024], which follow the three main training stages of LLMs:
- Continual Pre-training (CPT)
- Continual Instruction Tuning (CIT)
- Continual Alignment (CA)
In this project, we will work on all three categories, but mainly focus on CPT and CIT, where the LLMs will learn to update their internal facts, forget out-of-date information, and improve their tool usage. Additionally, we will do this while using a limited amount of computational resources (a move towards sustainable AI). One approach to limit the training cost is to use adapters such as LoRA [Hu2022].
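The core idea behind LoRA [Hu2022] can be sketched in a few lines: the pre-trained weight matrix stays frozen, and only a low-rank update is trained. The sketch below uses NumPy with illustrative dimensions (the matrix sizes, `alpha`, and `r` are assumptions, not values from the project); it is a minimal illustration of the parameter savings, not a training implementation.

```python
import numpy as np

# Minimal LoRA sketch (after [Hu2022]): instead of updating a frozen weight
# matrix W (d_out x d_in) directly, we train a low-rank update B @ A with
# rank r << min(d_in, d_out). All shapes here are illustrative.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 4, 8

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def forward(x):
    # Effective weight is W + (alpha / r) * B @ A, applied without ever
    # materializing the full d_out x d_in update matrix.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapted model starts out identical to
# the base model, so training begins from the pre-trained behaviour.
assert np.allclose(forward(x), W @ x)
print(r * (d_in + d_out), "trainable parameters instead of", d_in * d_out)
```

The trainable parameter count scales with r * (d_in + d_out) rather than d_in * d_out, which is what makes adapter-based continual learning attractive under a limited compute budget.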
The project is composed of two large phases. In the first phase, we will explore continual learning for a single AI agent across the different learning categories while minimizing the required computational resources. One of the first steps will be to use CPT to update the model's internal facts and forget out-of-date information. This will then be extended to CIT, where continual learning is combined with reinforcement learning (RL) based methods (e.g. Search-R1 [Jin2025]) to improve the performance of agents in settings where a verifiable reward is available. An important additional consideration is that models trained with reinforcement learning can become overconfident as a side effect [Kalai2025]. So, when combining RL with continual learning, we need to make sure that hallucinations due to overconfidence are not increased unintentionally.
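The "verifiable reward" mentioned above can be made concrete with a small sketch. The function below is a hypothetical illustration of the simplest case, an exact-match check against a gold answer; it is not the reward used in Search-R1 [Jin2025], and the names and normalization rules are assumptions for this example.

```python
import re

def normalize(text: str) -> str:
    # Lowercase and strip punctuation so trivially different phrasings
    # of the same answer still match.
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

def verifiable_reward(model_answer: str, gold_answer: str) -> float:
    # 1.0 only on an exact (normalized) match, 0.0 otherwise: a binary,
    # automatically checkable signal suitable for RL fine-tuning.
    return 1.0 if normalize(model_answer) == normalize(gold_answer) else 0.0

print(verifiable_reward("Antwerp.", "antwerp"))   # 1.0
print(verifiable_reward("Brussels", "antwerp"))   # 0.0
```

Note that such a binary reward gives no credit for abstaining or expressing uncertainty, which is one mechanism by which RL-trained models can drift towards overconfident guessing [Kalai2025]; any reward design in this project would need to account for that.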
The second phase moves from continual learning for a single AI agent towards continual learning for agentic AI with multiple agents. Here, we will look at how continual learning within an agentic system affects the performance of the system as a whole, and at how we can leverage continual learning to update multiple models at the same time. In line with our goal of making continual learning computationally efficient, we will explore the relatively new concept of Small Language Models (SLMs) [Belcak2025], where small, highly focused models are combined to create an efficient and well-performing agentic system. This requires a focus on agentic architectures that are efficient but also well suited to continual learning.
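The shape of such an SLM-based agentic system can be sketched schematically. Everything below is hypothetical: the agent names, the keyword-based router, and the stubbed specialists stand in for small fine-tuned models and are not from [Belcak2025] or an existing framework.

```python
from typing import Callable

# Stubbed specialists: in a real system each would be a small, highly
# focused language model that can be continually updated on its own.
def fact_agent(query: str) -> str:
    return f"[fact-SLM] answering: {query}"

def tool_agent(query: str) -> str:
    return f"[tool-SLM] executing: {query}"

ROUTES: dict[str, Callable[[str], str]] = {
    "lookup": fact_agent,
    "execute": tool_agent,
}

def route(query: str) -> str:
    # A real router would itself be a small classifier model; a keyword
    # match stands in for it here. The point of the architecture is that
    # continual learning can update one specialist at a time without
    # retraining the whole system.
    for keyword, agent in ROUTES.items():
        if keyword in query.lower():
            return agent(query)
    return fact_agent(query)  # default specialist

print(route("lookup: who founded imec?"))
```

The architectural question for this phase is then how to keep such a composition efficient while individual specialists, and the router itself, are being continually updated.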
This research will be performed with two main use cases in mind. First, we will focus on applying our research advancements in a media context, where there is a need for accurate information that can change rapidly. This provides an ideal setting to explore our advancements in continual learning for single agents and agentic systems. Second, our work on efficient agentic architectures, where agents use continual learning to update their models, provides a very interesting case for workload estimation. This focus allows us to provide research value not only to the academic community and the media sector, but also to the core interests of imec.
The research of this PhD position will be conducted in the IDLab research group, which is embedded in imec and the University of Antwerp in Belgium.
Imec was founded in 1984 and is today the world’s largest independent research and innovation centre for nanoelectronics and digital technology. Our worldwide team of 6,000 scientists, engineers, and innovators from over 100 countries is driven by a shared passion to push boundaries. With expertise across nanoelectronics, AI, and digital technologies, our people are at the heart of every breakthrough.
The University of Antwerp is a young, dynamic and forward-thinking university with a strong mission and vision. The university has over 23,000 students and nearly 7,000 staff members. The University of Antwerp is one of the top young universities in the world, ranking #12 on the Times Higher Education Young University Ranking in 2024, and #168 overall in 2025.
References:
[Belcak2025] Belcak, P., Heinrich, G., Diao, S., Fu, Y., Dong, X., Muralidharan, S., ... & Molchanov, P. (2025). Small Language Models are the Future of Agentic AI. arXiv preprint arXiv:2506.02153.
[Hu2022] Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., ... & Chen, W. (2022). LoRA: Low-rank adaptation of large language models. ICLR.
[Jin2025] Jin, B., Zeng, H., Yue, Z., Yoon, J., Arik, S., Wang, D., ... & Han, J. (2025). Search-R1: Training LLMs to reason and leverage search engines with reinforcement learning. arXiv preprint arXiv:2503.09516.
[Kalai2025] Kalai, A.T., Nachum, O., Vempala, S.S., & Zhang, E. (2025). Why Language Models Hallucinate. arXiv preprint arXiv:2509.04664.
[Shi2024] Shi, H., Xu, Z., Wang, H., Qin, W., Wang, W., Wang, Y., ... & Wang, H. (2024). Continual learning of large language models: A comprehensive survey. ACM Computing Surveys.
[Wang2024] Wang, L., Zhang, X., Su, H., & Zhu, J. (2024). A comprehensive survey of continual learning: Theory, method and application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(8), 5362-5383.