Document Type


Publication Date



Electronic medical records (EMRs) and treatment plans are used in research on patient outcomes and radiation effects. In many situations, researchers must remove protected health information (PHI) from EMRs. The literature contains several studies describing the anonymization of generic Digital Imaging and Communications in Medicine (DICOM) files and DICOM image sets, but we found no publications that discuss the anonymization of DICOM radiation therapy plans, a key component of an EMR in a cancer clinic. We were also unable to find a commercial software tool that met the minimum requirements for anonymization and preservation of data integrity for radiation therapy research. The purpose of this study was to develop prototype software that meets the requirements for anonymizing radiation therapy treatment plans, and to develop a method to validate that the code properly anonymized treatment plans and preserved data integrity. We extended an open-source code to process all relevant PHI and to allow for the automatic anonymization of multiple EMRs. The prototype code successfully anonymized multiple treatment plans in less than 1 min/patient. We also tested commercial optical character recognition (OCR) algorithms for the detection of burned-in text on the images, but they were unable to reliably recognize text. In addition, we developed and tested an image filtering algorithm that allowed us to isolate and redact alphanumeric text from a test radiograph. Validation tests verified that PHI was anonymized and that data integrity, such as the relationship between DICOM unique identifiers (UIDs), was preserved. © 2014 Elsevier Ltd.
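The requirement that anonymization preserve the relationship between DICOM UIDs can be illustrated with a minimal sketch. This is not the code developed in the study. It assumes plain Python dictionaries standing in for DICOM datasets, a hypothetical list of PHI fields, and a hypothetical salted-hash scheme for replacing UIDs. Because the replacement is a deterministic function of the original UID, two objects that referenced each other before anonymization still reference each other afterwards.

```python
import hashlib

# Hypothetical PHI field list, for illustration only; a real tool would
# cover every attribute required by the applicable confidentiality rules.
PHI_FIELDS = ("PatientName", "PatientID", "PatientBirthDate")


def remap_uid(uid: str, salt: str = "study-secret") -> str:
    """Return a replacement UID derived deterministically from the original.

    Assumption: a salted SHA-256 digest, truncated to 128 bits and written
    under the 2.25 root, which DICOM reserves for UUID-derived UIDs.
    The result is at most 44 characters, within the 64-character UID limit.
    """
    digest = hashlib.sha256((salt + uid).encode("utf-8")).digest()
    return "2.25." + str(int.from_bytes(digest[:16], "big"))


def anonymize(record: dict, salt: str = "study-secret") -> dict:
    """Blank PHI fields and remap every field whose name ends in 'UID'."""
    out = {}
    for key, value in record.items():
        if key in PHI_FIELDS:
            out[key] = "ANONYMIZED"
        elif key.endswith("UID"):
            out[key] = remap_uid(value, salt)
        else:
            out[key] = value
    return out


# Toy stand-ins for a structure set and a plan that references it.
structure_set = {"PatientName": "DOE^JANE", "SOPInstanceUID": "1.2.3.4.5"}
plan = {
    "PatientName": "DOE^JANE",
    "SOPInstanceUID": "1.2.3.4.6",
    "ReferencedSOPInstanceUID": "1.2.3.4.5",
}

anon_structure_set = anonymize(structure_set)
anon_plan = anonymize(plan)

# Validation in the spirit of the abstract: PHI is gone, the link survives.
assert anon_plan["PatientName"] == "ANONYMIZED"
assert anon_plan["ReferencedSOPInstanceUID"] == anon_structure_set["SOPInstanceUID"]
```

The same salt must be used for every file belonging to a patient, otherwise the cross-references between the anonymized plan, structure set, and images would no longer match.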

Publication Source (Journal or Book title)

Computers in Biology and Medicine

First Page


Last Page