Title

Structural variation and its potential impact on genome instability: Novel discoveries in the EGFR landscape by long-read sequencing

Authors

George W. Cook, Sentry Genomics, Baton Rouge, LA, United States of America.
Michael G. Benton, Department of Chemical Engineering, Louisiana State University, Baton Rouge, LA, United States of America.
Wallace Akerley, Huntsman Cancer Institute, University of Utah School of Medicine, Department of Oncological Sciences, Salt Lake City, UT, United States of America.
George F. Mayhew, Roche Sequencing Solutions, Madison, WI, United States of America.
Cynthia Moehlenkamp, Roche Sequencing Solutions, Madison, WI, United States of America.
Denise Raterman, Roche Sequencing Solutions, Madison, WI, United States of America.
Daniel L. Burgess, Roche Sequencing Solutions, Madison, WI, United States of America.
William J. Rowell, Pacific Biosciences, Menlo Park, CA, United States of America.
Christine Lambert, Pacific Biosciences, Menlo Park, CA, United States of America.
Kevin Eng, Pacific Biosciences, Menlo Park, CA, United States of America.
Jenny Gu, Pacific Biosciences, Menlo Park, CA, United States of America.
Primo Baybayan, Pacific Biosciences, Menlo Park, CA, United States of America.
John T. Fussell, Sentry Genomics, Baton Rouge, LA, United States of America.
Heath D. Herbold, Sentry Genomics, Baton Rouge, LA, United States of America.
John M. O'Shea, Huntsman Cancer Institute, Biorepository Molecular Pathology, Salt Lake City, UT, United States of America.
Thomas K. Varghese, Huntsman Cancer Institute, University of Utah School of Medicine, Department of Surgery, Division of Thoracic Surgery, Salt Lake City, UT, United States of America.
Lyska L. Emerson, Huntsman Cancer Institute, University of Utah School of Medicine, Department of Pathology, Salt Lake City, UT, United States of America.

Document Type

Article

Publication Date

1-1-2020

Abstract

Structural variation (SV) is typically defined as variation within the human genome that exceeds 50 base pairs (bp). SV may be copy-number neutral, or it may involve duplications, deletions, and complex rearrangements. Recent studies have shown SV to be associated with many human diseases. However, studies of SV have been challenging due to technological constraints. The advent of third-generation (long-read) sequencing technology has made it possible to explore longer stretches of DNA that were not easily examined previously. In the present study, we used long-read sequencing to examine SV in the EGFR landscape of four haplotypes derived from two human samples. We analyzed the EGFR gene and its landscape (±500,000 bp) using this approach and identified a region of non-coding DNA with over 90% similarity to the most common activating EGFR mutation in non-small cell lung cancer. Based on previously published Alu-element genome instability algorithms, we propose a molecular mechanism to explain how this non-coding region of DNA may be interacting with and impacting the stability of the EGFR gene and potentially generating this cancer-driver gene. With these techniques, we also identified previously hidden structural variation in the four haplotypes and in the human reference genome (hg38). We applied previously published algorithms to compare the relative stabilities of these five EGFR gene landscape haplotypes (the four sample haplotypes and hg38) and to estimate their relative potentials to generate the canonical 15 bp deletion in EGFR exon 19. To our knowledge, the present study is the first to use the differences in genomic architecture between targeted, cancer-linked, phased haplotypes to estimate their relative potentials to form a common cancer-linked driver mutation.

Publication Source (Journal or Book title)

PLoS ONE

First Page

e0226340
