Abstract:
Clustering is an essential tool in biomedical research, often used to identify patterns and subgroups within complex, high-dimensional datasets, such as gene expression profiles, metabolomics data, and patient stratification data. However, finding the optimal number of clusters and other input parameters, such as the trimming proportion and the sparsity level, remains a challenging task. Traditional clustering methods may struggle with the noise, outliers, redundancy, and high dimensionality that are common in biomedical data, leading to unreliable or biologically uninterpretable results. Sparse clustering methods help by emphasizing significant features while suppressing noise, and trimmed clustering can enhance robustness by excluding outliers. Yet existing approaches often require manual tuning of parameters such as the trimming proportion and the sparsity level, which is time-consuming and relies on trial and error. To address these limitations, this work presents an automated trimmed and sparse clustering method that determines both the optimal number of clusters and the necessary tuning parameters. Our method is available to the biomedical community through the evaluomeR package, which enables researchers to apply sophisticated clustering without an extensive computational background. This advancement not only increases the usability of trimmed and sparse clustering but also promotes reproducibility and accuracy in data-driven biomedical discoveries.