Student project: Synthesizing gastro-intestinal biomarker profiles for modeling gut health

Research & development - Wageningen

Creating synthetic datasets of biomarker profiles along the GI tract to accelerate the research and development of computational models for monitoring and predicting gut health.

What you will do

Up to 40% of the population suffers from some form of gastrointestinal (GI) disease. The physical symptoms of GI disorders are often disruptive to daily life, and through the gut-brain axis they can also affect mental health. Adding to this profound societal impact, the diagnosis, treatment and management of GI disorders are complicated by the relative inaccessibility of the GI tract. To help solve this problem, at OnePlanet Research Center we are developing personalized models of gut health that can transform multimodal sensor data, from the GI tract as well as other relevant biomarkers, into a continuously updated “digital twin” of a person’s GI health status.

A key step in this process is the generation of synthetic datasets. These can be used to train models and AI algorithms, compensate for the scarcity of real, in-vivo measurements, reduce biases (e.g., underrepresentation of certain patient profiles), and enable data to be shared with third parties while safeguarding privacy.

Your task in this project will be to expand and improve an existing statistical model of gut biomarkers, estimate population distributions over the model parameters from an existing dataset, and then use the model to generate new data for “synthetic individuals” whose parameters are sampled from these distributions. Guided by recent literature on generative AI for data synthesis, you will also design tests to assess the quality and usefulness of the resulting synthetic data. Time permitting, a capstone to the project could be to train a deep neural network on a large synthetic dataset to automatically annotate time-series data from the ingestible sensor.

In short, the internship involves:
- Working with unique data from cutting-edge sensors
- Extending an existing statistical model of gut biomarker profiles
- Synthesizing new data by sampling from this model
- Designing tests to evaluate the quality and usefulness of the resulting synthetic data
- Working in an agile setting to deliver timely, effective results
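
To give a feel for the synthesis workflow described above, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the existing OnePlanet model: the logistic profile shape, the parameter names (`baseline`, `height`, `midpoint`), the independent normal population distributions and the example values in `FITTED_PARAMS` are all hypothetical placeholders.

```python
import math
import random
import statistics

# Hypothetical per-individual parameter estimates (illustrative values only,
# not real measurements): baseline level, step height and transition position
# of a biomarker along the normalized GI tract (0 = start, 1 = end).
FITTED_PARAMS = [
    {"baseline": 2.0, "height": 4.5, "midpoint": 0.30},
    {"baseline": 1.8, "height": 5.0, "midpoint": 0.35},
    {"baseline": 2.3, "height": 4.2, "midpoint": 0.28},
    {"baseline": 2.1, "height": 4.8, "midpoint": 0.33},
    {"baseline": 1.9, "height": 4.6, "midpoint": 0.31},
]


def estimate_population(fits):
    """Estimate an independent normal (mean, sd) for each model parameter."""
    population = {}
    for name in fits[0]:
        values = [fit[name] for fit in fits]
        population[name] = (statistics.mean(values), statistics.stdev(values))
    return population


def sample_individual(population, rng):
    """Draw the parameters of one synthetic individual."""
    return {name: rng.gauss(mu, sd) for name, (mu, sd) in population.items()}


def biomarker_profile(params, positions, noise_sd, rng):
    """Generate a noisy logistic profile (assumed shape) along the tract."""
    steepness = 0.05  # assumed fixed transition width
    profile = []
    for x in positions:
        step = 1.0 / (1.0 + math.exp(-(x - params["midpoint"]) / steepness))
        level = params["baseline"] + params["height"] * step
        profile.append(level + rng.gauss(0.0, noise_sd))
    return profile


def synthesize(fits, n_individuals, n_positions=50, noise_sd=0.1, seed=0):
    """Estimate the population, then generate one profile per individual."""
    rng = random.Random(seed)  # seeded for reproducibility
    population = estimate_population(fits)
    positions = [i / (n_positions - 1) for i in range(n_positions)]
    dataset = []
    for _ in range(n_individuals):
        params = sample_individual(population, rng)
        profile = biomarker_profile(params, positions, noise_sd, rng)
        dataset.append({"params": params, "profile": profile})
    return population, dataset


if __name__ == "__main__":
    population, dataset = synthesize(FITTED_PARAMS, n_individuals=200)
    # Very simple fidelity check: compare a synthetic parameter mean with
    # the population mean it was sampled from.
    synthetic_mean = statistics.mean(d["params"]["midpoint"] for d in dataset)
    print("population midpoint mean:", population["midpoint"][0])
    print("synthetic midpoint mean: ", synthetic_mean)
```

In the actual project, the independent normals would likely give way to richer population distributions (for example, ones that capture correlations between parameters), and the fidelity check would be replaced by the quality and usefulness tests you design.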

The internship work and activities will be organized with a scrum-like methodology: you will maintain the backlog in coordination with your mentors. You will select prioritized tasks from the backlog, and you will tackle and evaluate them on a biweekly basis. At the end of each biweekly iteration, you will showcase and reflect on your progress in a regular meeting of the Human Digital Twin team (in which you will be embedded), which will be a valuable opportunity for broader feedback and collaborative problem-solving. In addition to this team, you will also be able to draw on the expertise of other data scientists and domain experts working at (or connected to) OnePlanet. 

The ideal start date for this 6-month internship is around the beginning of September 2025.

What we do for you

  • We have a diverse team of experts on both the data and the biomedical components to supervise and support you.
  • We offer a challenging problem and the freedom to help develop it in a specific direction.
  • You will join the Digital Twin team of OnePlanet, which employs state-of-the-art knowledge on machine learning for precision medicine.
  • You will be able to exchange views and knowledge with the OnePlanet and imec community of experts and scientists, widening your professional network.
  • At OnePlanet we embrace diversity and thus give equal opportunities to intern candidates with diverse backgrounds.  

Who you are

  • MSc student in (Applied) Statistics, Applied Math, Data Science, AI, or a similar field
  • Strong foundational understanding of statistics (especially probability distributions)
  • Affinity with Data Science/AI research, and the life sciences
  • Organized and communicative
  • Good programming skills in Python
  • Plus: experience with Markov chain Monte Carlo (MCMC) methods
  • Plus: experience with Bayesian statistics 

Interested?

Does this position sound like an interesting next step in your career at imec? Don’t hesitate to submit your application by clicking on ‘APPLY NOW’.
Should you have more questions about the job, you can contact jobs@imec.nl.
