Viewing Study NCT07427927


Ignite Creation Date: 2026-03-26 @ 3:21 PM
Ignite Modification Date: 2026-04-01 @ 10:35 PM
Study NCT ID: NCT07427927
Status: ACTIVE_NOT_RECRUITING
Last Update Posted: 2026-02-23
First Post: 2026-02-05
Is NOT Gene Therapy: True
Has Adverse Events: False

Brief Title: Data-driven Clustering in Hemorrhoid Surgery: Retrospective Monocentric Study for the Identification of Clinical Phenotypes
Sponsor: IRCCS Policlinico S. Donato
Organization:

Study Overview

Official Title: Data-driven Clustering in Hemorrhoid Surgery: Retrospective Monocentric Study for the Identification of Clinical Phenotypes
Status: ACTIVE_NOT_RECRUITING
Status Verified Date: 2026-01
Last Known Status: None
Delayed Posting: No
If Stopped, Why?: Not Stopped
Has Expanded Access: False
If Expanded Access, NCT#: N/A
Has Expanded Access, NCT# Status: N/A
Acronym: PROCTO-CLUSTER
Brief Summary: This retrospective, single-center observational study will use routinely collected perioperative data from adults undergoing surgery for symptomatic hemorrhoidal disease to identify data-driven clinical phenotypes. Unsupervised machine learning will be applied to characterize clusters of patients based on demographic, clinical, anatomical, and surgical variables. The study will explore whether the resulting phenotypes differ in operative complexity and postoperative course, and will generate hypotheses to inform future predictive models and personalized surgical planning.
Detailed Description: Hemorrhoidal disease presents with heterogeneous symptom patterns, anatomical findings, and operative strategies that are not fully captured by traditional degree-based classifications. This study aims to identify latent, clinically interpretable phenotypes among surgical patients using a fully unsupervised machine learning pipeline applied to routinely collected perioperative data from a high-volume tertiary referral center.

This is a retrospective, observational analysis of de-identified institutional records. The analytic dataset will include routinely documented variables spanning baseline demographics/anthropometrics, symptom profile and relevant clinical history, operative technique and intraoperative descriptors, and routinely captured postoperative follow-up information. Data will be extracted using a predefined data dictionary and standardized preprocessing rules to support reproducibility and reduce variability in variable definitions.

The primary analytic approach will be unsupervised clustering. Variables will be cleaned and standardized prior to modeling. Dimensionality reduction will be performed using t-distributed stochastic neighbor embedding (t-SNE), initialized with principal component analysis to improve stability. Cluster discovery will then be conducted using k-means clustering on the reduced feature space. A range of cluster solutions will be explored, and the final solution will be selected using internal validity metrics (e.g., silhouette-based measures) together with assessment of clinical interpretability. Model robustness will be evaluated through repeated runs across multiple random seeds and key parameter settings to assess stability of cluster assignments.
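The workflow in the paragraph above can be sketched as follows. This is a minimal illustration using scikit-learn, not the study's actual pipeline: the registry entry does not name the software, the perplexity, the candidate range of cluster counts, or the stability metric, so the perplexity of 30, the range of 2 to 6 clusters, the adjusted Rand index as the stability measure, and the synthetic `make_blobs` data are all assumptions. The helper names `embed`, `select_k`, and `seed_stability` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.preprocessing import StandardScaler


def embed(X, seed, perplexity=30.0):
    """Standardize features, then project to 2-D with PCA-initialized t-SNE."""
    X_std = StandardScaler().fit_transform(X)
    tsne = TSNE(n_components=2, init="pca", perplexity=perplexity, random_state=seed)
    return tsne.fit_transform(X_std)


def select_k(embedding, k_values, seed):
    """Fit k-means for each candidate k and score it with the mean silhouette."""
    scores, labels_by_k = {}, {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(embedding)
        scores[k] = silhouette_score(embedding, labels)
        labels_by_k[k] = labels
    best_k = max(scores, key=scores.get)
    return best_k, scores, labels_by_k


def seed_stability(X, k, seeds):
    """Mean pairwise adjusted Rand index of cluster labels across random seeds."""
    runs = []
    for s in seeds:
        emb = embed(X, seed=s)
        runs.append(KMeans(n_clusters=k, n_init=10, random_state=s).fit_predict(emb))
    pairs = [(i, j) for i in range(len(runs)) for j in range(i + 1, len(runs))]
    return float(np.mean([adjusted_rand_score(runs[i], runs[j]) for i, j in pairs]))


# Synthetic stand-in for the perioperative table (assumption: numeric features only).
X, _ = make_blobs(n_samples=150, n_features=6, centers=3, cluster_std=1.0, random_state=0)

embedding = embed(X, seed=0)
best_k, scores, labels_by_k = select_k(embedding, k_values=range(2, 7), seed=0)
stability = seed_stability(X, best_k, seeds=[0, 1, 2])
print(best_k, round(stability, 3))
```

In this sketch the silhouette is computed on the t-SNE coordinates, so it describes separation in the embedded space rather than in the original feature space. Comparing label agreement across seeds is one way to check that a chosen solution is not an artifact of a single embedding.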

After cluster assignment, clusters will be characterized using descriptive and comparative statistics to identify variables that most differentiate phenotypes. Post-hoc feature relevance/importance approaches will be used to explore which demographic, clinical, and surgical factors most strongly contribute to cluster formation, with emphasis on effect sizes and clinically meaningful patterns rather than hypothesis testing alone. Findings will be used to generate hypotheses regarding phenotypes that may be associated with greater operative complexity and different postoperative trajectories, supporting future work on predictive modeling and individualized surgical decision support.
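One way to put the emphasis on effect sizes into practice is to rank variables by how much of their variance is explained by cluster membership. The sketch below uses eta squared (between-cluster sum of squares divided by total sum of squares) for continuous variables. The registry entry does not specify which effect-size or feature-relevance measure will be used, so this choice is an assumption, and the column names, values, and helper names (`eta_squared`, `rank_features`) are hypothetical.

```python
from statistics import fmean


def eta_squared(values, labels):
    """Share of a variable's variance explained by cluster membership (0 to 1)."""
    grand_mean = fmean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # constant variable: nothing to explain
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values())
    return ss_between / ss_total


def rank_features(table, labels):
    """Return (feature name, eta squared) pairs sorted from largest to smallest."""
    effects = {name: eta_squared(col, labels) for name, col in table.items()}
    return sorted(effects.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: column names and values are illustrative only.
labels = [0, 0, 0, 1, 1, 1]
table = {
    "age_years": [40, 42, 44, 60, 62, 64],
    "bmi": [24, 27, 25, 26, 24, 27],
}

ranking = rank_features(table, labels)
for name, effect in ranking:
    print(f"{name}: eta^2 = {effect:.3f}")
```

In this toy example `age_years` differs sharply between the two clusters while `bmi` barely differs, so `age_years` ranks first. Categorical variables would need a different measure, such as Cramér's V.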

All analyses will be conducted within a controlled institutional environment using validated statistical and data-mining software, with documented parameter settings and version tracking to enable reproducibility. Only de-identified data will be used for analysis, and results will be reported in aggregate to protect patient privacy.

Study Oversight

Has Oversight DMC: False
Is an FDA Regulated Drug?: False
Is an FDA Regulated Device?: False
Is an Unapproved Device?: None
Is a PPSD?: None
Is a US Export?: None
Is an FDA AA801 Violation?: