Semester of Graduation

Summer 2026

Degree

Master of Science (MS)

Department

Physics and Astronomy

Document Type

Thesis

Abstract

Patient safety incident reporting in radiation oncology requires expert analysis that is time-intensive and subject to variability. This thesis presents the development and technical validation of a locally deployed large language model (LLM) system for automated incident report analysis across multiple cancer centers. The system was designed for automated summarization and taxonomy assignment of Radiation Oncology Incident Learning System (RO-ILS) reports, operating entirely on local infrastructure to preserve patient privacy. A two-round, multi-rater evaluation methodology was employed, incorporating 600 total expert evaluations from two academic cancer centers. Round 1 established baseline performance using Mistral 7B and Mixtral 8x7B models with standard prompt engineering. Round 2 incorporated enhanced prompt engineering, retrieval-augmented generation (RAG), and advanced models (Gemma3 27B, DeepSeek R1 70B, Llama 3.1 70B). Clinical experts evaluated outputs using 5-point ordinal scales, and the scores were analyzed with mixed-effects statistical models. Multi-rater analysis demonstrated statistically significant performance improvements between rounds. Institution 1 achieved substantial improvements in summary scores (3.34 ± 1.17 → 4.20 ± 0.84, d = 0.78, p < 0.001) and tag scores (3.28 ± 0.98 → 4.32 ± 0.70, d = 1.11, p < 0.001). Institution 2 showed improvements in summary scores (3.61 ± 1.24 → 4.02 ± 0.97, d = 0.34, p = 0.055) and tag scores (3.79 ± 1.10 → 4.40 ± 0.81, d = 0.58, p < 0.001). In Round 2, the proportion of outputs meeting the high-performance threshold (score ≥ 4) reached 80.0% for summaries and 86.7% for tags at Institution 1, and 68.9% for summaries and 84.4% for tags at Institution 2. Concurrent inter-rater reliability analysis revealed that human expert agreement fell below conventional thresholds (ICC < 0.75) across evaluation dimensions, demonstrating that the manual analysis process itself is characterized by substantial subjectivity.
These results establish technical feasibility for LLM-based automation of radiation oncology incident analysis using privacy-preserving local deployment, with effect sizes ranging from small to large (d = 0.34–1.11) across institutions. The low inter-rater reliability among human experts suggests that the consistency offered by automated analysis may represent a distinct advantage for multi-institutional learning systems. Institutional variation, including a non-significant summary improvement at Institution 2 (p = 0.055), suggests that site-specific factors may influence system effectiveness and warrants further investigation.

Date

5-4-2026

Committee Chair

Ara Alexandrian

LSU Acknowledgement

1

LSU Accessibility Acknowledgment

1

Available for download on Sunday, October 04, 2026
