DSpace Repository

An automated process for supporting decisions in clustering-based data analysis

dc.contributor.author Bernabé-Díaz, José-Antonio
dc.contributor.author Franco, Manuel
dc.contributor.author Vivo, Juana-María
dc.contributor.author Quesada-Martínez, Manuel
dc.contributor.author Fernández-Breis, Jesualdo-T
dc.date.accessioned 2025-05-06T10:39:42Z
dc.date.available 2025-05-06T10:39:42Z
dc.date.issued 2022
dc.identifier.citation Bernabé-Díaz JA, Franco M, Vivo J-M, Quesada-Martínez M, Fernández-Breis JT. An automated process for supporting decisions in clustering-based data analysis. Comput Methods Programs Biomed. June 2022;219:106765.
dc.identifier.issn 1872-7565
dc.identifier.uri https://sms.carm.es/ricsmur/handle/123456789/18789
dc.description.abstract BACKGROUND AND OBJECTIVE: Biomedical researchers and practitioners commonly use metrics to measure and evaluate properties of individuals, instruments, models, methods, or datasets. Because there is no standardized procedure for validating a metric, it is often assumed that a metric appropriate for analyzing one dataset in a given domain will also be appropriate for other datasets in that domain. However, such generalizability cannot be taken for granted, since the behavior of a metric can vary across scenarios. The objective of this paper is to study that behavior, which allows the reliability of a metric to be assessed before drawing any conclusions about biomedical datasets. METHODS: We present a method that supports the evaluation of the behavior of quantitative metrics on datasets. Our approach assesses a metric through clustering-based data analysis and enhances the decision-making process for selecting the optimal classification. The method applies two important criteria of unsupervised classification validation, namely the stability and the goodness of the clusters, which are calculated on the clusterings generated by the metric. Our evaluomeR tool makes the method available to biomedical researchers. RESULTS: The analytical power of our method is shown by applying it to analyze (1) the behavior of the impact factor metric for a series of journal categories; (2) which structural metrics provide a better partitioning of the content of a repository of biomedical ontologies; and (3) the sources of heterogeneity in effect size metrics of biomedical primary studies. CONCLUSIONS: Statistical properties such as the stability and goodness of classifications allow for a useful analysis of the behavior of quantitative metrics, which can support decisions about which metrics to apply to a given dataset.
dc.language.iso eng
dc.publisher Elsevier Ireland Ltd
dc.rights Attribution-NonCommercial-NoDerivatives 4.0 Spain
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/4.0/deed.es
dc.subject.mesh Benchmarking
dc.subject.mesh Biological Ontologies
dc.subject.mesh Cluster Analysis
dc.subject.mesh Data Analysis
dc.subject.mesh Humans
dc.subject.mesh Reproducibility of Results
dc.title An automated process for supporting decisions in clustering-based data analysis
dc.type info:eu-repo/semantics/article
dc.identifier.pmid 35367914
dc.relation.publisherversion https://dx.doi.org/10.1016/j.cmpb.2022.106765
dc.identifier.doi 10.1016/j.cmpb.2022.106765
dc.journal.title Computer Methods and Programs in Biomedicine
dc.identifier.essn 0169-2607
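
The METHODS part of the abstract names two validation criteria, stability and goodness of the clusters, without giving formulas. As a rough illustration only, the Python sketch below scores one metric's values for several candidate numbers of clusters k. It is not the authors' implementation and does not use the evaluomeR API: the 1-D k-means, the mean silhouette width as the goodness measure, the pairwise Jaccard agreement over resamples as the stability measure, and all function names and sample values are assumptions chosen for this example.

```python
import random
from statistics import mean


def kmeans_1d(values, k, iters=50):
    """Cluster scalar metric values into k groups with plain Lloyd iterations."""
    srt = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new.append(mean(members) if members else centers[c])
        if new == centers:
            break
        centers = new
    return labels


def silhouette(values, labels):
    """Goodness (assumed measure): mean silhouette width, in [-1, 1]."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return 0.0
    widths = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        b = min(mean(abs(v - u) for u in other)
                for m, other in clusters.items() if m != lab)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(widths)


def stability(values, k, n_boot=20, seed=0):
    """Stability (assumed measure): mean pairwise Jaccard agreement between
    the full-data clustering and clusterings of resampled subsets."""
    rng = random.Random(seed)
    base = kmeans_1d(values, k)
    n = len(values)
    scores = []
    for _ in range(n_boot):
        # Bootstrap-style resample; duplicate draws are dropped.
        idx = sorted({rng.randrange(n) for _ in range(n)})
        boot = kmeans_1d([values[i] for i in idx], k)
        both = either = 0
        for p in range(len(idx)):
            for q in range(p + 1, len(idx)):
                same_base = base[idx[p]] == base[idx[q]]
                same_boot = boot[p] == boot[q]
                both += same_base and same_boot
                either += same_base or same_boot
        scores.append(both / either if either else 1.0)
    return mean(scores)


def evaluate_metric(values, k_range=range(2, 5)):
    """Return {k: (stability, goodness)} for each candidate k."""
    return {k: (stability(values, k), silhouette(values, kmeans_1d(values, k)))
            for k in k_range}


# Hypothetical metric values with two visibly separated groups.
values = [1.0, 1.1, 0.9, 1.2, 0.8, 1.05, 5.0, 5.1, 4.9, 5.2, 4.8, 5.05]
for k, (stab, good) in evaluate_metric(values).items():
    print(f"k={k}: stability={stab:.2f}, goodness={good:.2f}")
```

Under these assumptions, a k that scores high on both columns is a candidate partitioning for the metric. A metric whose scores are low for every k would be one whose behavior on that dataset deserves caution.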

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 Spain