Authors

David M. Altshuler, Broad Institute
Richard M. Durbin, Wellcome Sanger Institute
Gonçalo R. Abecasis, University of Michigan, Ann Arbor
David R. Bentley, Illumina United Kingdom
Aravinda Chakravarti, Johns Hopkins School of Medicine
Andrew G. Clark, Cornell University
Peter Donnelly, The Wellcome Centre for Human Genetics
Evan E. Eichler, University of Washington School of Medicine
Paul Flicek, European Bioinformatics Institute
Stacey B. Gabriel, Broad Institute
Richard A. Gibbs, Baylor College of Medicine
Eric D. Green, National Human Genome Research Institute (NHGRI)
Matthew E. Hurles, Wellcome Sanger Institute
Bartha M. Knoppers, McGill University
Jan O. Korbel, European Molecular Biology Laboratory Heidelberg
Eric S. Lander, Broad Institute
Charles Lee, Brigham and Women's Hospital
Hans Lehrach, Max Planck Institute for Molecular Genetics
Elaine R. Mardis, Washington University School of Medicine in St. Louis
Gabor T. Marth, Boston College
Gil A. McVean, The Wellcome Centre for Human Genetics
Deborah A. Nickerson, University of Washington School of Medicine
Jeanette P. Schmidt, Thermo Fisher Scientific Inc.
Stephen T. Sherry, National Institutes of Health (NIH)
Jun Wang, BGI-Shenzhen
Richard K. Wilson, Washington University School of Medicine in St. Louis
Huyen Dinh, Baylor College of Medicine
Christie Kovar, Baylor College of Medicine
Sandra Lee, Baylor College of Medicine
Lora Lewis, Baylor College of Medicine
Donna Muzny, Baylor College of Medicine
Jeff Reid, Baylor College of Medicine
Min Wang, Baylor College of Medicine
Xiaodong Fang, BGI-Shenzhen

Document Type

Article

Publication Date

11-1-2012

Abstract

By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations. © 2012 Macmillan Publishers Limited. All rights reserved.

Publication Source (Journal or Book title)

Nature

First Page

56

Last Page

65
