SYNDERAI – xShare Advancing Realistic Synthetic Data for Europe

23/09/25

 

By Dr Kai Heitmann


 

Within the xShare project, HL7 Europe is moving forward with SYNDERAI (Synthetic Data: Examples – Realistic – using AI). It is a deliverable of the xShare Project Work Package “toolbox” (D3.3) that addresses an emerging need in European health IT: the availability of synthetic yet clinically plausible data for testing, validation, education, and vendor support in implementing the European Electronic Health Record Exchange Format (EEHRxF).

Unlike purely random test records, SYNDERAI data is rooted in realistic medical backgrounds. Each record mirrors plausible patient demographics, settings, and care pathways across Europe. By combining multiple data generation sources and adding localized context (such as geo-located patients and providers), the project creates datasets that feel “real” while remaining fully artificial and privacy-safe.

A cornerstone of SYNDERAI is stratification, where synthetic patients are embedded into coherent clinical “stories” and even Personas. This mechanism connects diagnoses, findings, therapies, medications, and related items, making the datasets valuable not only for technical validation but also for training, demonstrations, and mass data scenarios.

The first use case was a synthetic laboratory report (LAB) conformant to the HL7 Europe specification, with subsequent extensions to Hospital Discharge Reports (HDR) and the European (and international) Patient Summary (EPS/IPS). The next steps include expanding to synthetic Imaging Reports, further broadening the applicability of the toolbox.

AI techniques support the generation of clinically expected reference ranges for laboratory values, conclusions and recommendations based on the patient’s story, goals in care plans, and realistic dose information for medication, ensuring internal consistency and realism.

Essentially, SYNDERAI applies a “low-dose AI” approach: artificial intelligence is used selectively to enhance realism in targeted domains, while avoiding opaque “black-box” data generation.
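
As an illustration of what internal consistency can mean for a laboratory value, the following minimal Python sketch compares a result with its reference range. It is not part of SYNDERAI. It assumes a result shaped like an HL7 FHIR Observation with the standard `valueQuantity` and `referenceRange` elements; the function name `interpret_against_range`, the labels it returns, and the sample values are hypothetical and chosen only for this example.

```python
from typing import Optional

# Hypothetical, hand-written example shaped like an HL7 FHIR Observation.
# The numbers are illustrative only and are not taken from SYNDERAI data.
sample_observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"text": "Hemoglobin"},
    "valueQuantity": {"value": 13.2, "unit": "g/dL"},
    "referenceRange": [
        {
            "low": {"value": 12.0, "unit": "g/dL"},
            "high": {"value": 16.0, "unit": "g/dL"},
        }
    ],
}


def interpret_against_range(observation: dict) -> Optional[str]:
    """Return 'low', 'normal' or 'high', or None if no comparison is possible.

    Assumption of this sketch: the value and the first reference range
    use the same unit, so no unit conversion is attempted.
    """
    quantity = observation.get("valueQuantity")
    ranges = observation.get("referenceRange") or []
    if quantity is None or "value" not in quantity or not ranges:
        return None

    value = quantity["value"]
    low = ranges[0].get("low", {}).get("value")
    high = ranges[0].get("high", {}).get("value")

    if low is not None and value < low:
        return "low"
    if high is not None and value > high:
        return "high"
    return "normal"


print(interpret_against_range(sample_observation))
```

A check of this kind could be run over a set of generated reports to flag results whose stated interpretation does not match the value and range. Real instances may carry several reference ranges or differing units, which this sketch deliberately ignores.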

The project has already prepared examples as HL7 FHIR instances:

 

  • over 1,000 synthetic Laboratory Reports, including longitudinal records with multiple reports per patient

  • around 1,000 synthetic European Patient Summaries, providing broad coverage of typical cross-border scenarios

 

 

A subset of 37 Patient Summaries and 106 Lab Reports is available now for inspection and use on the subproject website. The examples are visualized using the vi7eti.net technique, supported by the related initiative Gravitate Health, ensuring alignment with broader European innovation efforts.

By combining clinical plausibility, European contextualization, and technical accessibility, SYNDERAI offers the community an essential resource for connect-a-thons, vendor validation, education, and research.

Learn more at synderai.net – and join the movement to make European synthetic data not just available, but truly useful.