DSpace Repository

A comparison of synthetic data generation and federated analysis for enabling international evaluations of cardiovascular health

dc.contributor.author Azizi, Zahra
dc.contributor.author Lindner, Simon
dc.contributor.author Shiba, Yumika
dc.contributor.author Raparelli, Valeria
dc.contributor.author Norris, Colleen-M
dc.contributor.author Kublickiene, Karolina
dc.contributor.author Herrero, María-Trinidad
dc.contributor.author Kautzky-Willer, Alexandra
dc.contributor.author Klimek, Peter
dc.contributor.author Gisinger, Teresa
dc.contributor.author Pilote, Louise
dc.contributor.author El-Emam, Khaled
dc.date.accessioned 2025-10-20T14:38:14Z
dc.date.available 2025-10-20T14:38:14Z
dc.date.issued 2023-07-17
dc.identifier.citation Azizi Z, Lindner S, Shiba Y, Raparelli V, Norris CM, Kublickiene K, et al. A comparison of synthetic data generation and federated analysis for enabling international evaluations of cardiovascular health. Sci Rep. 2023 Jul 17;13(1):11540.
dc.identifier.issn 2045-2322
dc.identifier.uri https://sms.carm.es/ricsmur/handle/123456789/20466
dc.description.abstract Sharing health data for research purposes across international jurisdictions has been a challenge due to privacy concerns. Two privacy enhancing technologies that can enable such sharing are synthetic data generation (SDG) and federated analysis, but their relative strengths and weaknesses have not been evaluated thus far. In this study we compared SDG with federated analysis to enable such international comparative studies. The objective of the analysis was to assess country-level differences in the role of sex on cardiovascular health (CVH) using a pooled dataset of Canadian and Austrian individuals. The Canadian data was synthesized and sent to the Austrian team for analysis. The utility of the pooled (synthetic Canadian + real Austrian) dataset was evaluated by comparing the regression results from the two approaches. The privacy of the Canadian synthetic data was assessed using a membership disclosure test which showed an F1 score of 0.001, indicating low privacy risk. The outcome variable of interest was CVH, calculated through a modified CANHEART index. The main and interaction effect parameter estimates of the federated and pooled analyses were consistent and directionally the same. It took approximately one month to set up the synthetic data generation platform and generate the synthetic data, whereas it took over 1.5 years to set up the federated analysis system. Synthetic data generation can be an efficient and effective tool for enabling multi-jurisdictional studies while addressing privacy concerns.
dc.language.iso eng
dc.publisher NATURE PORTFOLIO
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject.mesh Humans
dc.subject.mesh Canada
dc.subject.mesh Cardiovascular System
dc.subject.mesh Austria
dc.subject.mesh Disclosure
dc.subject.mesh Privacy
dc.title A comparison of synthetic data generation and federated analysis for enabling international evaluations of cardiovascular health
dc.type info:eu-repo/semantics/article
dc.identifier.pmid 37460705
dc.relation.publisherversion https://dx.doi.org/10.1038/s41598-023-38457-3
dc.type.version info:eu-repo/semantics/publishedVersion
dc.identifier.doi 10.1038/s41598-023-38457-3
dc.journal.title Scientific Reports



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Spain.
