Viewing Study NCT07414966


Ignite Creation Date: 2026-03-26 @ 3:15 PM
Ignite Modification Date: 2026-03-30 @ 2:59 AM
Study NCT ID: NCT07414966
Status: NOT_YET_RECRUITING
Last Update Posted: 2026-02-17
First Post: 2026-02-09
Is NOT Gene Therapy: True
Has Adverse Events: False

Brief Title: Scalable Clinical Oversight of Large Language Models Via Uncertainty Triangulation
Sponsor: China National Center for Cardiovascular Diseases
Organization:

Study Overview

Official Title: Prospective Evaluation of a Model-Agnostic Meta-Verification Framework (SCOUT) for Scalable Clinical Oversight of Large Language Model Outputs in Coronary Heart Disease Diagnosis: A Multi-Reader, Randomized, Crossover Trial
Status: NOT_YET_RECRUITING
Status Verified Date: 2026-02
Last Known Status: None
Delayed Posting: No
If Stopped, Why?: Not Stopped
Has Expanded Access: False
If Expanded Access, NCT#: N/A
Has Expanded Access, NCT# Status: N/A
Acronym: SCOUT
Brief Summary: This prospective, multi-reader, randomized crossover trial evaluates SCOUT (Scalable Clinical Oversight via Uncertainty Triangulation), a model-agnostic meta-verification framework that selectively defers unreliable large language model (LLM) predictions to clinicians by triangulating three orthogonal uncertainty signals: model heterogeneity, stochastic inconsistency, and reasoning critique. The trial assesses whether SCOUT-assisted review can reduce physician review time compared with standard manual review of AI-generated diagnoses while maintaining non-inferior diagnostic accuracy in coronary heart disease (CHD) subtyping.
Detailed Description: Background: Large language models are increasingly deployed in clinical workflows, yet requiring clinician review of every AI output negates the efficiency gains that motivate their adoption. SCOUT addresses this efficiency-safety paradox through algorithmic meta-verification.

The SCOUT framework triangulates three orthogonal external signals to determine case-level uncertainty: (1) Model Heterogeneity: whether a structurally different auxiliary LLM agrees with the primary model; (2) Stochastic Inconsistency: whether repeated sampling from the same model yields divergent outputs; (3) Reasoning Critique: whether an external checker model identifies logical flaws in the chain-of-thought reasoning.
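
Illustrative sketch (not part of the registry record): the three signals described above could be combined into a case-level deferral decision roughly as follows. The record does not state how SCOUT aggregates the signals, so the "defer if at least min_signals signals fire" rule, the function and field names, and the placeholder labels are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class UncertaintySignals:
    """One flag per signal named in the description (field names are hypothetical)."""
    model_heterogeneity: bool       # auxiliary LLM disagrees with the primary model
    stochastic_inconsistency: bool  # repeated samples from the primary model diverge
    reasoning_critique: bool        # checker model flags a flaw in the reasoning

    def count(self) -> int:
        # Number of signals that fired for this case.
        return sum((self.model_heterogeneity,
                    self.stochastic_inconsistency,
                    self.reasoning_critique))


def compute_signals(primary_label: str,
                    auxiliary_label: str,
                    resampled_labels: Sequence[str],
                    critique_flagged: bool) -> UncertaintySignals:
    """Derive the three flags from already-collected model outputs."""
    return UncertaintySignals(
        model_heterogeneity=auxiliary_label != primary_label,
        stochastic_inconsistency=any(label != primary_label
                                     for label in resampled_labels),
        reasoning_critique=critique_flagged,
    )


def should_defer(signals: UncertaintySignals, min_signals: int = 1) -> bool:
    """Assumed rule: defer to a clinician when at least min_signals flags fire."""
    return signals.count() >= min_signals


if __name__ == "__main__":
    # "subtype_A" / "subtype_B" are placeholder labels, not the trial's categories.
    case = compute_signals(
        primary_label="subtype_A",
        auxiliary_label="subtype_B",
        resampled_labels=["subtype_A", "subtype_A", "subtype_A"],
        critique_flagged=False,
    )
    print("defer to clinician:", should_defer(case))
```

With min_signals=1 the rule is conservative: any single disagreement sends the case to a clinician. Raising the threshold would defer fewer cases at the cost of accepting more unreviewed outputs.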

In this crossover trial, 7 clinicians of varying seniority (2 junior residents, 3 senior residents, 2 attending physicians) each review all 110 cases under both standard manual review and SCOUT-assisted review workflows. The study evaluates workflow efficiency (primary endpoint) and diagnostic accuracy (secondary endpoint).
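
Illustrative sketch (not part of the registry record): the crossover layout described above implies 7 readers x 110 cases x 2 workflows = 1,540 individual case reads. The code below enumerates such a schedule. The record does not describe how workflow order is randomized or whether a washout period is used; per-reader randomization of workflow order, the seed, and all identifiers are assumptions made for illustration only.

```python
import random

# Reader mix and case count as stated in the description.
READERS = {"junior_resident": 2, "senior_resident": 3, "attending": 2}
N_CASES = 110
WORKFLOWS = ("manual", "scout_assisted")


def build_schedule(seed: int = 0):
    """Return (reader_id, period, workflow, case_id) tuples for every read.

    Assumption: each reader is independently randomized to one of the two
    workflow orders, and reviews all cases in period 1 before period 2.
    """
    rng = random.Random(seed)
    schedule = []
    for level, count in READERS.items():
        for i in range(1, count + 1):
            reader_id = f"{level}_{i}"
            order = list(WORKFLOWS)
            rng.shuffle(order)  # randomize which workflow comes first
            for period, workflow in enumerate(order, start=1):
                for case_id in range(1, N_CASES + 1):
                    schedule.append((reader_id, period, workflow, case_id))
    return schedule


if __name__ == "__main__":
    reads = build_schedule()
    print("total reads:", len(reads))
```

Because every reader sees every case under both workflows, each reader serves as their own control, which is the main reason to use a crossover design for a review-time comparison.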

Study Oversight

Has Oversight DMC: True
Is an FDA Regulated Drug?: False
Is an FDA Regulated Device?: False
Is an Unapproved Device?: None
Is a PPSD?: None
Is a US Export?: None
Is an FDA AA801 Violation?: