PhD - Antwerpen
Enhancing the world-view of autonomous systems through multi-modal and beyond line-of-sight collaborative perception
Some breakthroughs have been made in the research of ITS multi-sensor fusion methods to achieve reliable and accurate situational awareness. These developments include image fusion, point cloud fusion, and image–point cloud fusion. Although a multi-sensor redundant combination design can make up for the insufficiency of a single sensor in perception, reduce the uncertainty of target detection, and enhance the vehicle's effective perception of surrounding environmental information, test results in real scenarios are not ideal. For example, the fusion of camera and LiDAR can provide high-resolution image information and reduce the impact of lighting conditions, but the perception performance is still lacking in bad weather and under obstacle occlusion.
Therefore, there is a growing interest in collaborative perception to achieve shared situational awareness, which benefits from broadcasting and receiving perception messages from sensors on surrounding agents. With observations from multiple agents, collaborative perception can fundamentally overcome the physical limits of single-agent perception, such as occlusion and beyond line-of-sight scenarios. Such collaborative perception models can be widely applied in practice, for example in autonomous driving, autonomous shipping, and robotic mapping. Much research has been done on agent-to-agent communication to achieve collaborative perception for shared situational awareness, aiming to enhance the perception field of view and to provide a backup plan in case of local sensor failures. Moreover, observing the same scene from different viewpoints improves the certainty and robustness of the perceived environment.
We foresee the following objectives:
Objective 1: Develop an efficient and reliable multi-modal fusion collaboration graph for multi-agent perception
This module is driven by the question: "Can we create a unified multi-task multi-modal fusion model that solves various tasks by benefiting from the characteristics of different modalities?" Multi-modal fusion has been studied extensively in the literature using modern deep learning approaches. In this research, the goal will be to introduce a multi-task multi-modal fusion architecture, built as an end-to-end deep-learning model that can simultaneously learn to perform multiple tasks using graph-based approaches. The methodology to be researched and developed in this project will utilize intermediate feature vector fusion based on graph-based deep learning, fusing feature maps that result from extracting relevant visual data from multiple surrounding agents. The encoded features are aggregated spatially using graph attention networks to construct locally fused information. This will reduce redundant computation through the reuse of the same feature extractors for multiple resulting tasks.
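To illustrate the kind of spatial aggregation envisioned here, the following is a minimal, hypothetical numpy sketch of a single graph-attention head fusing per-agent feature vectors into an ego-centric representation. All names (`graph_attention_fuse`, the projection matrices `w_query`, `w_key`) and the single-head scaled dot-product formulation are illustrative assumptions, not the project's actual architecture.

```python
import numpy as np

def graph_attention_fuse(ego_feat, neighbor_feats, w_query, w_key):
    """Toy single-head graph attention: the ego node attends over itself
    and its neighbors, then aggregates features by the attention weights.

    ego_feat:        (d,)   feature vector of the ego agent
    neighbor_feats:  (n, d) feature vectors received from n surrounding agents
    w_query, w_key:  (d, k) learned projection matrices (placeholders here)
    """
    # Stack ego and neighbor features into one node set (ego is node 0).
    nodes = np.vstack([ego_feat[None, :], neighbor_feats])   # (n+1, d)
    q = ego_feat @ w_query                                   # (k,)
    keys = nodes @ w_key                                     # (n+1, k)
    # Scaled dot-product scores of the ego node against every node.
    scores = keys @ q / np.sqrt(q.shape[0])                  # (n+1,)
    # Numerically stable softmax over the neighborhood.
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()
    # Attention-weighted sum gives the ego's fused representation.
    return weights @ nodes                                   # (d,)
```

In a real model this aggregation would run on multi-channel feature maps with learned projections and multiple heads; the sketch only shows the attention-then-aggregate pattern that graph attention networks apply per neighborhood.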
Objective 2: Enhance multi-agent perception in the presence of network latency issues
Previous collaborative perception methods do not consider a realistic communication setting where latency is inevitable. Moreover, the varying latencies of different communication channels cause severe time-asynchronicity issues. Experimentally, the latency issue severely degrades a collaborative perception system, resulting in performance even worse than single-agent perception. To tackle this issue, this project will propose a latency-aware collaborative perception system, which actively adapts asynchronous perceptual features from multiple agents to the same timestamp, promoting the robustness and effectiveness of collaboration. The main advantage of such a latency-aware system is that it is able to synchronize the collaboration features before aggregation, mitigating the effect of latency instead of directly aggregating the received asynchronous features. This objective will be successful if the proposed methods can aggregate broadcast messages dynamically with higher average precision and lower pose error in the presence of latency, while advancing the state of the art in collaborative perception.
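The core idea of adapting asynchronous messages to a common timestamp can be sketched with a deliberately simple example: linearly extrapolating each agent's delayed observations forward by its measured latency before aggregation. The function name `synchronize_observations` and the constant-velocity assumption are illustrative only; the actual system would operate on learned feature maps with a learned compensation model rather than raw positions.

```python
import numpy as np

def synchronize_observations(positions, velocities, latencies):
    """Compensate per-agent communication latency before aggregation
    (toy constant-velocity model).

    positions:  (n, 2) last received 2D object positions from n agents
    velocities: (n, 2) estimated object velocities (m/s)
    latencies:  (n,)   communication delay of each agent's message (s)

    Returns positions extrapolated to the ego's current timestamp, so
    observations from different agents refer to the same moment in time.
    """
    # Move each stale observation forward along its estimated motion
    # for exactly as long as its message was in flight.
    return positions + velocities * latencies[:, None]
```

Only after this per-agent synchronization step would the messages be fused; aggregating the raw asynchronous inputs directly would mix object states from different points in time.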
Objective 3: Validation by deploying real-world use-case
The research will be validated on benchmark datasets and in a simulation engine; towards the later stages of the doctoral studies, the work will move to real-life validation with autonomous driving as the validation use case. The project studies the problems of 3D multi-object detection and tracking in a multi-agent setting.
Required background: Engineering Technology, Engineering Science, Computer Science or equivalent
Type of work: 70% modeling/simulation, 20% experimental, 10% literature
Supervisor: Siegfried Mercelis
Co-supervisor: Tom De Schepper
Daily advisor: Ali Anwar
The reference code for this position is 2024-156. Mention this reference code on your application form.