Study Overview
Official Title:
Safety and Workflow Impact of Large Language Model-Assisted Hospital Course Summaries: Protocol for a Randomized, Evaluator-Blinded Non-Inferiority Trial
Status:
NOT_YET_RECRUITING
Status Verified Date:
2026-03
Last Known Status:
None
Delayed Posting:
No
If Stopped, Why?:
Not Stopped
Has Expanded Access:
False
If Expanded Access, NCT#:
N/A
Has Expanded Access, NCT# Status:
N/A
Brief Summary:
Background: Physicians worldwide face an increasing administrative burden that diverts time from direct patient care. Among inpatient documentation tasks, authoring hospital course summaries is particularly time-consuming and is critical for safe care transitions. Large language models (LLMs) have shown promise for clinical text generation; however, robust evidence from randomized, evaluator-blinded trials conducted in routine hospital practice remains limited.
Objectives: The CLEAN study aims to evaluate whether LLM-assisted, specialist-edited generation of hospital course summaries is non-inferior in safety to standard clinician-written documentation in routine inpatient care. Secondary objectives include non-inferiority assessments of resident-edited and unedited LLM-generated summaries. Additional objectives are to evaluate summary quality across predefined domains, quantify physician documentation time, assess LLM generation stability, measure clinician adoption following the randomized phase, and examine inter-observer, intra-observer, and test-retest reliability of expert assessments.
Methods: This is a single-centre, double-campus, exploratory, randomized controlled non-inferiority trial conducted at a tertiary university hospital. Consecutive hospital discharges across multiple clinical departments are randomized 1:1 to either an LLM-assisted documentation workflow or standard manual authorship. The intervention integrates an on-premise LLM into a parallel hospital information system and generates draft hospital course summaries from complete, uncurated clinical documentation, which physicians may review and edit prior to finalization. The primary outcome, safety, is defined as the presence of all important information and the absence of incorrect or hallucinated information, and is assessed by an adjudication committee blinded to the documentation workflow. Secondary outcomes include content validity, workflow efficiency, generation stability, post-trial clinician adoption, and reliability metrics. A total of 786 discharge episodes are required to assess non-inferiority using a predefined margin of 5 percentage points.
Ethics and Dissemination: The study will be conducted in accordance with the Declaration of Helsinki, Good Clinical Practice, and the General Data Protection Regulation. A waiver of informed consent is sought because the study poses minimal risk and uses routine clinical data exclusively. Results will be disseminated through peer-reviewed publication and engagement with healthcare stakeholders.
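To illustrate how a non-inferiority margin of 5 percentage points can be applied to a binary safety outcome, the following Python sketch computes a one-sided lower confidence bound for the difference in safe-summary proportions between the two arms. This record does not state the statistical test, the significance level, or the expected safety proportions. The Wald interval, the one-sided alpha of 0.025, the function name, and all counts below are therefore assumptions for illustration only, not the trial's analysis plan.

```python
from math import sqrt
from statistics import NormalDist


def noninferiority_check(safe_llm, n_llm, safe_std, n_std,
                         margin=0.05, alpha=0.025):
    """Wald-type non-inferiority check for a difference in proportions.

    Returns (difference, lower_bound, is_non_inferior), where the
    difference is p_llm - p_std and non-inferiority is concluded when
    the one-sided lower confidence bound lies above -margin.
    The alpha level and the Wald method are illustrative assumptions.
    """
    p_llm = safe_llm / n_llm
    p_std = safe_std / n_std
    diff = p_llm - p_std
    # Unpooled standard error of the difference in proportions
    se = sqrt(p_llm * (1 - p_llm) / n_llm + p_std * (1 - p_std) / n_std)
    z = NormalDist().inv_cdf(1 - alpha)
    lower = diff - z * se
    return diff, lower, lower > -margin


if __name__ == "__main__":
    # Hypothetical counts: 786 episodes split 1:1 gives 393 per arm.
    diff, lower, ok = noninferiority_check(370, 393, 372, 393)
    print(f"difference = {diff:.4f}, lower bound = {lower:.4f}, "
          f"non-inferior = {ok}")
```

With these hypothetical counts, the LLM-assisted arm would be declared non-inferior only if the lower bound stays above -0.05. A real analysis would follow the method prespecified in the trial's statistical analysis plan.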
Detailed Description:
The World Health Organization has identified a severe global health workforce crisis, estimating a shortage of approximately 12.7 million physicians worldwide in 2020. Projections indicate that this deficit will continue to worsen in the coming years [1]. These shortages occur alongside rising healthcare demand driven by population ageing and the increasing prevalence of multimorbidity. Nevertheless, administrative burden remains a persistent and systemic challenge across healthcare systems worldwide. Initiatives aimed at reducing physicians' non-clinical workload have achieved only limited success, so clinical expertise continues to be diverted away from direct patient care [2]. Among administrative responsibilities, the preparation of discharge letters is one of the most time-consuming tasks for physicians caring for hospitalized patients, yet it is also central to ensuring continuity of care after discharge. Despite the substantial time invested, discharge summaries frequently fail to meet expected standards, with many physicians reporting shortcomings in both their completeness and quality [3], [4]. The adoption of electronic medical records has prompted efforts to alleviate this burden by moving from fully manual authorship toward partial automation, in which selected sections of discharge letters are generated automatically. However, the hospital course, the most clinically informative component of the discharge letter, remains almost universally dependent on manual narrative documentation by the treating physician.
B. Objectives
The primary objective of this study is to evaluate whether LLM-assisted, specialist-edited generation of hospital course summaries is safe for use in routine inpatient care, compared with standard clinician-written documentation. Secondary objectives include evaluating the safety of resident-edited and unedited LLM-generated hospital course summaries. Additional objectives are to assess the quality of LLM-assisted summaries across predefined domains, evaluate the stability of generated outputs, quantify the impact of AI-assisted workflows on physician documentation time, examine clinician adoption of LLM-supported documentation following completion of the randomized phase, and assess the reliability of evaluator ratings. Collectively, these objectives aim to determine the safety and feasibility of integrating LLM-based narrative generation into real-world hospital discharge workflows.
Study Oversight
Has Oversight DMC:
True
Is a FDA Regulated Drug?:
False
Is a FDA Regulated Device?:
False
Is an Unapproved Device?:
None
Is a PPSD?:
None
Is a US Export?:
None
Is an FDA AA801 Violation?: