At a glance
ClinicalIndex Comparison Record
Standardized by ClinicalIndex from the ClinicalTrials.gov record · verify against the source.
A Prospective, Cross-Sectional, Vignette-Based Observational Study Comparing Clinical Decision-Making Performance of Pediatricians and AI Models
In Brief
An observational study evaluating AI Suggestions (anonymized 5-tool panel) and a Confidence Rating Task (1-10 Likert) for Artificial Intelligence (AI) in Diagnosis and 3 related conditions. The study is completed and enrolled 30 participants at 1 site.
Detailed Summary
This study evaluates how well anonymized artificial-intelligence (AI) tools perform on standardized pediatric case vignettes and whether showing AI suggestions can improve clinicians' answers. About 30 board-certified or board-eligible pediatric specialists at a single hospital complete a one-time session and are randomized into two groups.
Group A (n≈15): physicians answer each vignette once.
Group B (n≈15): physicians answer each vignette and rate their confidence (1-10), then review anonymized suggestions from five different AI tools (tool names not shown) and may keep or change their answer; any changes and confidence ratings are recorded.
Primary focus: measure AI performance (diagnostic accuracy, medication-dosing accuracy, interpretation accuracy) overall and by difficulty tier, and record AI response time.
Secondary focus: quantify how AI suggestions affect human performance (change in accuracy, direction of change, confidence shift, and time).
No patients or biospecimens are involved; risks are minimal (time burden and possible discomfort with performance review). Findings may inform safe, evidence-based ways to use AI alongside clinicians in pediatrics.
Study Details
Interventions
AI Suggestions (Anonymized 5-tool panel)
What: Display of AI-generated suggestions for each vignette, aggregated from five large language model tools (names not shown to participants).
When/Who: Shown only to Group 2 (described as Group B in the Detailed Summary), after the physician's initial answer and confidence score.
Purpose: Measure AI performance (primary) and quantify the effect of AI suggestions on physicians' answers (secondary).
Applies to: Group 2.
Confidence Rating Task (1-10 Likert)
What: Self-rated confidence for the initial answer on a 1-10 scale.
When/Who: Group 2, before viewing AI suggestions.
Purpose: Quantify confidence changes pre- vs post-AI and relate confidence to correctness.
Applies to: Group 2.
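The secondary measures named in this record (change in accuracy, direction of change, confidence shift, and confidence in relation to correctness) could be computed along the following lines. This is only an illustrative sketch and not the investigators' analysis plan: the `Response` record, its field names, the `summarize` function, and the example values are all hypothetical, and it assumes that a post-AI confidence rating is recorded for each vignette.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Response:
    """One Group 2 physician's answer to one vignette (hypothetical layout)."""
    initial_correct: bool    # correctness before seeing AI suggestions
    final_correct: bool      # correctness after seeing AI suggestions
    initial_confidence: int  # 1-10 rating before AI suggestions
    final_confidence: int    # 1-10 rating after AI suggestions (assumed recorded)


def summarize(responses):
    """Compute pre/post accuracy, direction of change, and confidence shift."""
    n = len(responses)
    if n == 0:
        raise ValueError("no responses to summarize")

    correct_conf = [r.initial_confidence for r in responses if r.initial_correct]
    wrong_conf = [r.initial_confidence for r in responses if not r.initial_correct]

    return {
        "accuracy_before": sum(r.initial_correct for r in responses) / n,
        "accuracy_after": sum(r.final_correct for r in responses) / n,
        # Direction of change: incorrect -> correct vs. correct -> incorrect.
        "improved": sum(not r.initial_correct and r.final_correct for r in responses),
        "worsened": sum(r.initial_correct and not r.final_correct for r in responses),
        # Mean post-AI minus pre-AI confidence.
        "mean_confidence_shift": mean(
            r.final_confidence - r.initial_confidence for r in responses
        ),
        # Relating confidence to correctness: mean initial confidence per outcome.
        "mean_conf_when_correct": mean(correct_conf) if correct_conf else None,
        "mean_conf_when_wrong": mean(wrong_conf) if wrong_conf else None,
    }


# Made-up example data, for illustration only (not study results).
example = [
    Response(True, True, 8, 9),
    Response(False, True, 4, 7),
    Response(True, False, 6, 5),
    Response(False, False, 3, 3),
]

if __name__ == "__main__":
    for name, value in summarize(example).items():
        print(f"{name}: {value}")
```

Reporting improved and worsened counts separately, rather than only the net accuracy change, keeps cases where AI suggestions led a physician away from a correct answer visible even when the overall accuracy is unchanged.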