The generative AI revolution we’ve seen so far — from chatbots to content generators of every sort (video, music, legal documents, code: you name it) — is just the opening act. The real transformation will come with agentic and physical AI: systems that don’t just think and generate, but act in the real world. We’re talking about AI that can make autonomous decisions and execute them, manipulate physical objects, and coordinate with other agents. In short, machines that don’t just process the world but participate in it; that don’t just talk, but act. Before these next-gen systems can become widespread, however, there’s a towering challenge: the hardware has to keep up.
What is agentic and physical AI?
Agentic AI consists of software agents that show proactive behavior: they can take goal-driven actions in digital environments. Think of AI that can manage your inbox, generate personalized email responses, place purchase orders and edit contracts, much like an employee would.
Physical AI refers to embodied systems that interact with the physical world: warehouse robots that adapt on the fly and collaborate with other robots or humans, home assistants that clean, cook, and care, or autonomous vehicles. They all ‘feel’ their surroundings through a myriad of sensors and interact swiftly with the physical world.
Gartner: watch out for agent washing
Disclaimer: there is a lot of hype around AI agents lately. In June 2025, market analyst Gartner warned against ‘agent washing’: the rebranding of existing products, such as AI assistants, standard robotic process automation (RPA) and chatbots, without substantial agentic AI capabilities. Gartner also predicted that over 40% of agentic AI projects will be canceled by the end of 2027, due to escalating costs, unclear business value or inadequate risk controls.
What’s next-gen about agentic and physical AI?
Large language models generate content – be it text, images, music or video – based on a prompt and a very large model of our language and human knowledge. It’s up to the user to decide whether or not to use the output. Next-gen AI will automate complex, real-world workflows that go well beyond repetitive tasks, and the applications will – certainly for physical AI – expand into domains like logistics, healthcare, agriculture, and construction: fields where physical presence and real-time decision-making are key.
For that purpose, these next-gen AI systems use ‘large behaviour models’ (LBMs). Their input may be an image and a prompt, but unlike large language models, the output consists of actions rather than text. Strictly speaking, you could still consider the output to be ‘text’, since these actions are issued as code, for instance to move a robotic joint from position x to y. However, the correct execution of such an action requires an up-to-date model of the dynamics of the physical environment.
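As a rough illustration of the difference, the sketch below shows what such an action-as-code output could look like. It is a minimal, hypothetical example: the action schema, the field names and the toy ‘world state’ are ours, not any particular LBM’s interface.

```python
from dataclasses import dataclass

# Hypothetical action schema: the model's output is a structured command
# rather than free-form text. All names and fields are illustrative only.
@dataclass
class JointMove:
    joint: str           # which robotic joint to actuate
    target_angle: float  # desired position (radians)
    max_velocity: float  # safety-limited speed

def execute(action: JointMove, world_state: dict) -> dict:
    """Apply the action to a (simulated) world state and return the new state.
    Correct execution depends on an up-to-date model of the environment's dynamics."""
    new_state = dict(world_state)
    new_state[action.joint] = action.target_angle
    return new_state

# Example: the model proposes moving the elbow joint from its current
# position x (0.2 rad) to a target position y (1.1 rad).
state = {"elbow": 0.2}
state = execute(JointMove(joint="elbow", target_angle=1.1, max_velocity=0.5), state)
print(state)  # {'elbow': 1.1}
```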
Physically based data generation starts with a digital twin of a space, such as a factory. In this virtual space, sensors and autonomous machines such as robots are added. Simulations that mimic real-world scenarios are performed, and the sensors capture the various interactions. The better the real-world data collection, the better the simulations reflect the real world.
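In pseudocode, such a data-generation loop could look roughly like the sketch below. The simulator is a deliberately crude stand-in, since the real interface of a physics engine or digital-twin platform will differ; the structure of the loop (randomize a scenario, act, record sensor readings) is the point.

```python
import random

# Toy stand-in for a digital twin / physics simulator; the interface
# (reset, read_sensors, step) is hypothetical and heavily simplified.
class ToySimulator:
    def reset(self, randomize=True):
        self.object_position = random.uniform(0.0, 1.0) if randomize else 0.5

    def read_sensors(self):
        # A real twin would return camera, lidar, force readings, etc.
        return {"object_position": self.object_position}

    def step(self, action):
        self.object_position += action  # crude one-dimensional 'physics'

def generate_training_data(sim, num_scenarios=100, steps_per_scenario=20):
    dataset = []
    for _ in range(num_scenarios):
        sim.reset(randomize=True)  # vary the scenario to cover real-world diversity
        for _ in range(steps_per_scenario):
            observation = sim.read_sensors()
            action = random.uniform(-0.05, 0.05)  # scripted or learned policy
            sim.step(action)
            outcome = sim.read_sensors()
            dataset.append((observation, action, outcome))
    return dataset

data = generate_training_data(ToySimulator())
print(len(data))  # 2000 (observation, action, outcome) triples
```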
At the end of the day, these AI-driven robots will have to collaborate with humans and other agents to perform complex tasks, sharing context and intent. These multi-agent situations pose big challenges.
Real-world risks and predefined guardrails
What could possibly go wrong? In the case of LLMs, we have all encountered the problem of hallucinations at some point: models generate information that is plainly incorrect, or produce pictures that seem to escape the laws of physics and biology. As a user, you can simply decide not to use the result of your prompt. With agentic and physical AI, hallucinations have real-world implications and risks.
Suppose an agent independently places wrong purchase orders for your company, or starts cancelling important contracts: it could damage the company pretty badly. And in the case of embodied AI, such as heavy human-like robots, the safety risks are critical. By taking actions out of our hands, next-gen AI could mean we only intervene after the harm has been done.
Predefined guardrails can address these risks. This is comparable to how we prevent LLMs from spreading hate speech: the models are able to generate such content, but we limit their ability to output it. Agentic and physical AI can likewise be given predefined limits within which they are allowed to operate.
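A minimal sketch of what such a guardrail could look like for the purchase-order example is shown below. The spending limit, the supplier whitelist and the escalation path are purely illustrative assumptions, not a reference implementation.

```python
# Illustrative guardrail for an agent that places purchase orders.
# All limits, names and helper functions below are hypothetical.
MAX_ORDER_VALUE = 5_000                      # euros the agent may spend autonomously
APPROVED_SUPPLIERS = {"supplier_a", "supplier_b"}

def place_order(agent_action: dict) -> dict:
    value = agent_action["value"]
    supplier = agent_action["supplier"]
    # Hard limits the agent cannot override, analogous to content filters on LLMs.
    if value > MAX_ORDER_VALUE or supplier not in APPROVED_SUPPLIERS:
        return escalate_to_human(agent_action)
    return submit_to_erp(agent_action)

def escalate_to_human(action: dict) -> dict:
    # A human (or a supervising agent) stays in the loop for risky actions.
    return {"status": "pending_approval", "action": action}

def submit_to_erp(action: dict) -> dict:
    # Placeholder for the real purchasing-system integration.
    return {"status": "submitted", "action": action}

print(place_order({"value": 12_000, "supplier": "supplier_a"}))
# {'status': 'pending_approval', 'action': {...}}
```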
Another interesting option is to install a dedicated agent with a supervising task. But even then, it should be clear who holds responsibility.
Learn and collaborate like humans do
Humans are highly efficient when it comes to performing tasks in the physical world and handling physical objects. By trying, failing and trying again, children ‘learn by experience’. Each attempt teaches them something new – this is what we call ‘reinforcement learning’. Additionally, children learn from demonstration, copying tasks performed by adults. Along the way, children develop a ‘model of the world’ in their heads that helps them predict the result of a specific action and allows them to plan tasks step by step.
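The sketch below illustrates, in deliberately toy form, what planning with such a world model amounts to: imagine the outcome of each candidate action first, then pick the action predicted to bring you closest to the goal. The one-dimensional ‘world model’ here is a stand-in for what would in practice be a learned neural network.

```python
# Toy illustration of planning with a learned 'world model'.
def world_model(state: float, action: float) -> float:
    # In practice this would be a learned model of the environment's dynamics.
    return state + action

def plan(state: float, goal: float, candidate_actions: list) -> float:
    best_action, best_distance = None, float("inf")
    for action in candidate_actions:
        predicted = world_model(state, action)   # imagine the outcome first
        distance = abs(goal - predicted)
        if distance < best_distance:
            best_distance, best_action = distance, action
    return best_action

print(plan(state=0.0, goal=1.0, candidate_actions=[-0.5, 0.25, 0.75]))  # 0.75
```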
However, it takes many years to develop such a model and to master the skills needed to plan and execute typical human tasks. We don’t allow robots to take that much time to learn, nor do we allow them to make as many mistakes. That is why they rely purely on simulation environments like NVIDIA’s Omniverse to figure out a ‘world model’. This generates a massive computational load, both during training and during execution.
This compute load rises exponentially with every agent (be it another robot or a human) that is added, since the robot has to deal with far more uncertainty about what the other agents are going to do, and thus far more options to calculate during action planning.
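Some back-of-the-envelope arithmetic makes the scaling concrete. If each agent can choose between, say, 10 possible actions at every step, the number of joint action combinations a planner has to reason about grows as 10 to the power of the number of agents; the numbers below are purely illustrative.

```python
# Illustrative only: with |A| actions per agent and n agents, the joint
# action space per planning step has |A|**n combinations.
actions_per_agent = 10
for num_agents in range(1, 5):
    joint_actions_per_step = actions_per_agent ** num_agents
    print(num_agents, joint_actions_per_step)
# 1 10
# 2 100
# 3 1000
# 4 10000
```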
An intriguing question, and a hot research topic, is how agents will collaborate in the future. If agents only take the decisions that are best for their owner, we might miss out on societal benefits. Take the ‘smart grid’, for instance: price incentives alone probably won’t be enough to steer energy use on the grid. Coordinating agents that share intent may be able to solve that complex issue.
The hardware challenges
Here’s the rub: all this needs far more than algorithms. Physical and agentic AI run into serious hardware constraints:
Massive memory: to store the parameters of very large world models.
Reliable, deterministic networking: to ensure continuous connectivity and coordination between agents, devices, and cloud systems.
Advanced sensor data fusion: for low-latency decision-making and situational awareness in real time.
AI chips: to execute the compute load in an energy-efficient way and to enable secure, low-latency action-taking, especially in enterprise or privacy-sensitive settings.
Battery technology: this is especially critical for mobile platforms and continuous deployment. Hardware-aware AI can help balance performance with energy use.
Actuators: compliant actuation systems and adaptive control algorithms that balance strength and safety, enabling human-like manipulation that is both safe and fast. This requires co-designed hardware and control algorithms that respond intelligently to sensory feedback, including learning-based controllers that adapt to object properties and task demands.
Chips for agentic and physical AI: a matter of flex
Next-gen AI workloads are becoming increasingly heterogeneous. Some models require CPUs, some GPUs, and others currently lack the right processors altogether. It is clear that a classic one-size-fits-all approach, which only increases compute power, won’t suffice.
Algorithms are changing quickly, but hardware development is time-intensive: it takes several years to achieve even minor improvements, all while production is becoming increasingly complex and therefore expensive. In other words, the tech industry is dealing with a synchronization problem.
Developing a specific computing chip for each model, as we do today for generative AI, can't keep up with the unprecedented pace of innovation in models. Furthermore, the laws of economics are not playing in hardware’s favor either: there is a huge inherent risk of stranded assets because by the time the AI hardware is finally ready, the fast-moving AI software community may have taken a different turn.
It is particularly difficult to predict the next processor requirements, so flexibility will be key in the long run. Silicon hardware should become almost as ‘codable’ as software, and the same set of hardware components should become reconfigurable.
Picture it: rather than one monolithic, state-of-the-art and super-expensive processor, you would get cooperating ‘supercells’ consisting of stacked layers of semiconductors, each optimized for specific functionalities and integrated in 3D, so that memory can be placed close to the logic processing unit, thereby limiting the energy losses of data traffic.
A network-on-chip will steer and reconfigure these supercells, smartly combining the different versatile building blocks. This also makes room for new, emerging memory technologies that are better suited to AI workloads, overcoming current barriers such as energy inefficiency and lack of reliability.
AI’s future success hinges on hardware innovations. Entire sectors could be reimagined: construction, caregiving, transportation, maintenance. And, in the long run, we could even see general-purpose humanoid robots helping us with all sorts of typically human tasks.
The NanoIC project, an extension of the imec pilot line, is the European answer to AI-driven complexity and further strengthens Europe’s leadership in research by bridging the gap from lab to fab. At the same time, the pilot line will foster a European industry ecosystem of start-ups, AI companies, chip designers, manufacturers, and others around the most advanced technology.

Pieter Simoens (1982) is a professor at Ghent University, affiliated with imec. He specializes in distributed artificial intelligence systems. His research focuses, among other things, on the link between robots and the Internet of Things, on continuously learning embedded devices, and on how collective intelligence can arise from the collaboration of individual, autonomous agents.
Published on:
8 September 2025