Research Group

This section presents the Translational Data Science Lab's current members, former members, and a chronological list of completed Ph.D. dissertations. Please visit the Translational Data Science Lab website for much more information on our transdisciplinary research.

Current team

Postdoctoral members

2023-present: M. Haas (LUMC)
Marcel is a full-time assistant professor in Health Data Science in the TDS Lab at the Health Campus The Hague. He obtained his PhD from Leiden University and brings a decade of data science experience in industry. Previous employers include DSW and ORTEC.
2023-present: A. Lefebvre (LUMC/LIACS)
Armel is a postdoc in the TDS Lab with expertise in research data management, which is highly relevant for ELAN's further development at the Health Campus The Hague and for LUMC's data strategy in general. Armel combines FAIR principles with reproducible AI and data management practices. Previously he worked as a postdoc at Erasmus University in Rotterdam and Tsinghua University in Beijing.
2024-present: B. van Dijk (LUMC)
Bram is a postdoc at the intersection of NLP and ML, with a focus on large language models for open information extraction and synthetic data generation in healthcare applications within the INSAFEDARE project.

Ph.D. candidates

2018-2024: F. van Dijk: Privacy-by-Design (UU)
Friso's research focuses on how organisations can demonstrate the responsible use of personal data in information systems through Privacy-by-Design to implement an effective data governance strategy (Funded by P&O Rijk). Supervisors: S. Brinkkemper (UU), M. Spruit, M. Brinkhuis (UU).
2020-2024: E. Rijcken: Dutch NLP in Mental Healthcare (TUE)
Emil's research is embedded within the COVIDA programme on NLP for Dutch Mental Healthcare and explores how we can make text classification more interpretable and extend our topic modeling knowledge simultaneously, including by extracting semantic meaning from the dimensions within dense continuous word embeddings (Funded by Utrecht-Eindhoven Alliance Fund). Supervisors: U. Kaymak (TUE), F. Scheepers (UMCU), M. Spruit.
2015-2024: Z. Shen: Prescriptive analytics in secondary care (LIACS)
Ian's OPERAM WP2 work steered the development of Healthcare Information Systems (HIS) in general, by designing and developing a number of HISs with various Machine Learning (ML) and Natural Language Processing (NLP) techniques that address different issues in healthcare with Open Source methodology (Funded by Horizon2020). Supervisors: M. Spruit (LIACS), S. Brinkkemper (UU).
2021-2025: S. Alfaraj: Prediction of Type II Diabetes Progression (LUMC)
Sukainah's research reuses routinely collected data from the GP office (ELAN-GP) to create clinical decision support to identify disease progression risk levels in Type Two Diabetes Mellitus (T2DM) patients. Supervisors: R. Groenwold (LUMC/EPI), M. Spruit (LUMC), D. Mook (LUMC/PHEG).
2022-2026: E. Roorda: Population Health Analytics (LUMC)
Els' research focuses on maturity modelling for situational data infrastructure and scenario planning towards appropriate regional intelligence (Funded by Q-Consult Zorg). Supervisors: M. Spruit (LUMC), M. Bruijnzeels (LUMC/PHEG), J. Struijs (LUMC/PHEG).
2022-2026: S. Samir Khalil: Federated NLP in Mental Healthcare (LIACS)
Samar's research focuses on how current NLP techniques can be applied and extended to support mental health detection and promotion, through collection and analysis of textual resources with multilingual, multimodal and federated techniques (Funded by AAST). Supervisors: M. Spruit (LIACS), N. Tawfik (AAST).
2023-2027: H. Muizelaar: Dutch NLP and ML for Risk Stratification (LUMC)
Hielke's research in the HealthBox and ECOTIP projects is to develop NLP/ML-based Patient Segmentation and Risk Prediction models based on EHR, environmental, social and mobility data. Supervisors: M. Spruit (LUMC), M. Haas (LUMC).
2023-2027: J. Achterberg: Synthetic data generation and evaluation for HTAs (LUMC)
Jim's research in the INSAFEDARE project revolves around the generation and evaluation of a benchmarking synthetic dataset amenable to regulatory processes, and analytical ML methods for the validation of digital health applications. Supervisors: M. Spruit (LUMC), M. Haas (LUMC), R. Vos (LUMC).

Completed dissertations

13 Nov 2012: W. Bekkers, Ph.D.: Situational Process Improvement in Software Product Management
Willem's dissertation investigates how software product management (SPM) practices can be improved in a situational manner. The first part presents an overview of all practices that constitute SPM in the SPM competence model and the SPM maturity matrix. Then, the situational factors that affect SPM are defined in the situational factor effects catalog. The final part presents the situational assessment method (SAM), with which software product management organizations can assess and improve their SPM in a situational manner. Supervisors: S. Brinkkemper (UU), M. Spruit. Funded by: Centric IT BV. dspace.library.uu.nl/handle/1874/256455
13 Jan 2016: M. Meulendijk, Ph.D.: Optimizing medication reviews through decision support: prescribing a better pill to swallow
Michiel's dissertation investigates the conception and development of a decision support system to facilitate the conduct of structured medication reviews by physicians and pharmacists in primary care. The resulting STRIP Assistant system is validated in both a controlled environment and in daily practice, and is shown to significantly improve practitioners' effectiveness and efficiency in optimizing medication. This work deepens our understanding of barriers currently impeding the utility of decision support systems in primary care, most notably those of semantic interoperability and safe application of association rule mining. Supervisors: S. Brinkkemper (UU), M. Numans (LUMC), M. Spruit, P. Jansen (UMCU). Funded by: UMCU/UU. dspace.library.uu.nl/handle/1874/328063
20 March 2019: S. Syed, Ph.D.: Topic Discovery from Textual Data: Machine Learning and Natural Language Processing for Knowledge Discovery in the Fisheries Domain
Shaheen's dissertation investigates how to optimally and efficiently apply and interpret probabilistic topic models to large collections of documents such as scientific publications. This work shows how different types of textual data, pre-processing steps, and hyper-parameter settings can affect the quality of the derived latent topics, using the Latent Dirichlet Allocation approach in particular. Supervisors: S. Brinkkemper (UU), M. Spruit. Funded by: Horizon2020 Marie Sklodowska-Curie MSC-ITN-ETN. dspace.library.uu.nl/handle/1874/374917
2 October 2019: V. Menger, Ph.D.: Knowledge Discovery in Clinical Psychiatry: Learning from Electronic Patient Records
Vincent's dissertation investigates how data from Electronic Health Records (EHRs) can provide relevant insights for psychiatric care. The first three chapters identify key technical, organizational and ethical challenges related to knowledge discovery in EHRs. The next three chapters focus on the knowledge discovery process by applying natural language processing and cluster ensembling techniques to EHR data to obtain new insights with potential to improve care. Supervisors: S. Brinkkemper (UU), F. Scheepers (UMCU), M. Spruit. Funded by: UMCU. NB: Best departmental Dissertation award 2020. dspace.library.uu.nl/handle/1874/385129
14 October 2020: W. Omta: Knowledge Discovery in High Content Screening
Wienand's research investigates how multi-parametric data analysis can contribute to effective knowledge discovery in High Content Screening. His HC StratoMineR analytic system is designed and validated based on unsupervised data analysis methods. Gains and losses of using supervised data analytics methods and interactive visualizations are quantified. A standard data preprocessing pipeline is implemented in an R package, and a laboratory practice application of the systems to a chemical screen demonstrates this research's utility. Supervisors: S. Brinkkemper (UU), J. Klumperman (UMCU), M. Spruit. Funded by: UMCU/UU. dspace.library.uu.nl/handle/1874/399883
24 November 2020: N. Tawfik: Text Mining for Precision Medicine: Machine Learning and Information Extraction for Knowledge Discovery in the Health Domain
Noha's research investigates how biomedical natural language processing (BioNLP) can support and advance the Precision Medicine (PM) approach through collection and analysis of clinical and medical textual resources. The first two chapters contribute to the PM domain by obtaining valuable knowledge from unstructured resources. The other five chapters apply state-of-the-art NLP techniques to multiple data sources in order to better support the PM concept. This work focuses on combining traditional machine learning with deep learning techniques for the Natural Language Inference task, among others. Supervisors: S. Brinkkemper (UU), M. Spruit. Funded by: Arab Academy for Science, Technology & Maritime Transport (AAST). dspace.library.uu.nl/handle/1874/400797
15 March 2021: A. Lefebvre: Research data management for open science
Armel's research investigates research data management practices in laboratories in the context of open science. It discusses organizational and technological issues among stakeholders involved in research data management. Then, it elaborates on the concept of reproducibility in experimental science. Finally, it illustrates several applications of FAIR technology and proposes a strategy for open science readiness. Supervisors: S. Brinkkemper (UU), B. Snel (UU), M. Spruit, B. van Breukelen (UU). Funded by: UU/ITS. dspace.library.uu.nl/handle/1874/401610
11 July 2022: B. Yigit Ozkan: Cybersecurity Maturity Assessment and Standardisation
Bilge's research investigates how we can integrate cybersecurity maturity assessment and cybersecurity standardisation to provide tailored support for organisations in their cybersecurity improvement efforts. Her work was carried out in the context of the SMESEC project. Supervisors: S. Brinkkemper (UU), M. Spruit (LUMC/LIACS). Funded by: SMESEC, Horizon 2020 - H2020-DS-2016-2017, grant #740787. dspace.library.uu.nl/handle/1874/421285
5 June 2023: I. Sarhan: Open Information Extraction for Knowledge Representation
Ingy's research focuses on a systematic methodology that explores various Machine Learning (ML) and Natural Language Processing (NLP) algorithms to extract vital information from unstructured textual data to construct an effective representation of the mined information. Supervisors: S. Brinkkemper (UU), M. Spruit. Funded by: AAST and GEIGER, Horizon 2020 grant #883588. dspace.library.uu.nl/handle/1874/428396
6 October 2023: A. Shojaifar: Volitional Cybersecurity
Alireza's work took place within the SMESEC and GEIGER EU projects. He co-designed and researched an automated cybersecurity assessment platform named Cybersecurity Coach (CySEC) which integrates personalised assessments, web usage behaviour, and advice adherence modelling, specifically for SMEs. Supervisors: S. Brinkkemper (UU), M. Spruit, S. Fricker (FHNW). Funded by: SMESEC and GEIGER, i.e. Horizon2020 projects #740787 and #883588. dspace.library.uu.nl/handle/1874/431418
Date set! 19 Jan 2025: B. van Dijk: Theory of Mind through the Lens of Language: a Multidisciplinary Approach
Bram's dissertation lies at the intersection of computational linguistics and NLP, investigating the relation between Theory of Mind (ToM), natural language, and cognition, as well as Large Language Models as computational models of cognition. Supervisors: M. Spruit, M. van Duijn (ULEI/LIACS). Funded by: NWO/Veni.
Date set! 24 Jan 2025: M. van Haastrecht: Transdisciplinary Perspectives on Validity: bridging the gap between design and implementation for technology-enhanced learning systems
Max's dissertation forges a new transdisciplinary path towards holistic Technology-Enhanced-Learning validation that aids accelerated, but also responsible and trustworthy, impact. Supervisors: M. Spruit, M. Brinkhuis (UU). Funded by: GEIGER, Horizon 2020 - SU-DS03-2019-2020, grant #883588.

Former members