Analysis of genomic variation in non-coding elements using population-scale sequencing data from the 1000 Genomes Project.

PubWeight™: 1.43 | Rank: Top 5%

Article: PMC 3167619

Published in Nucleic Acids Res on May 19, 2011

Authors

Xinmeng Jasmine Mu (1), Zhi John Lu, Yong Kong, Hugo Y K Lam, Mark B Gerstein

Author Affiliations

1: Program in Computational Biology and Bioinformatics, Department of Molecular Biophysics and Biochemistry, W.M. Keck Foundation Biotechnology Resource Laboratory, Yale University, New Haven, CT 06520, USA.

Articles citing this

Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat Rev Genet (2011) 5.58

Integrative annotation of variants from 1092 humans: application to cancer genomics. Science (2013) 2.98

Evidence of abundant purifying selection in humans for recently acquired regulatory functions. Science (2012) 2.81

The real cost of sequencing: higher than you think! Genome Biol (2011) 2.22

Genome-wide inference of natural selection on human transcription factor binding sites. Nat Genet (2013) 2.03

FunSeq2: a framework for prioritizing noncoding regulatory variants in cancer. Genome Biol (2014) 1.36

Global population-specific variation in miRNA associated with cancer risk and clinical biomarkers. BMC Med Genomics (2014) 1.13

Large-scale functional organization of long-range chromatin interaction networks. Cell Rep (2012) 1.08

Human genomic disease variants: a neutral evolutionary explanation. Genome Res (2012) 1.02

Circular RNAs are depleted of polymorphisms at microRNA binding sites. Bioinformatics (2014) 1.01

Uncovering networks from genome-wide association studies via circular genomic permutation. G3 (Bethesda) (2012) 0.93

Exome Sequencing: Current and Future Perspectives. G3 (Bethesda) (2015) 0.91

Response to comment on "Evidence of abundant purifying selection in humans for recently acquired regulatory functions". Science (2013) 0.87

Worldwide genetic variation at the 3' untranslated region of the HLA-G gene: balancing selection influencing genetic diversity. Genes Immun (2013) 0.86

Approximation to the distribution of fitness effects across functional categories in human segregating polymorphisms. PLoS Genet (2014) 0.84

Balancing immunity and tolerance: genetic footprint of natural selection in the transcriptional regulatory region of HLA-G. Genes Immun (2014) 0.82

Single nucleotide polymorphisms can create alternative polyadenylation signals and affect gene expression through loss of microRNA-regulation. PLoS Comput Biol (2012) 0.81

Fine-scale signatures of molecular evolution reconcile models of indel-associated mutation. Genome Biol Evol (2013) 0.80

Mutational Biases Drive Elevated Rates of Substitution at Regulatory Sites across Cancer Types. PLoS Genet (2016) 0.78

Cis-regulatory elements and human evolution. Curr Opin Genet Dev (2014) 0.78

Natural variability of minimotifs in 1092 people indicates that minimotifs are targets of evolution. Nucleic Acids Res (2015) 0.77

Functional Implications of Human-Specific Changes in Great Ape microRNAs. PLoS One (2016) 0.76

Purifying selection in deeply conserved human enhancers is more consistent than in coding sequences. PLoS One (2014) 0.76

Induced somatic sector analysis of cellulose synthase (CesA) promoter regions in woody stem tissues. Planta (2012) 0.75

Natural Selection and Functional Potentials of Human Noncoding Elements Revealed by Analysis of Next Generation Sequencing Data. PLoS One (2015) 0.75

Addressing Benefits, Risks and Consent in Next Generation Sequencing Studies. J Clin Res Bioeth (2015) 0.75

Different evolutionary patterns of SNPs between domains and unassigned regions in human protein-coding sequences. Mol Genet Genomics (2016) 0.75

APOBEC3A/B-induced mutagenesis is responsible for 20% of heritable mutations in the TpCpW context. Genome Res (2016) 0.75

The roles of RNA processing in translating genotype to phenotype. Nat Rev Mol Cell Biol (2016) 0.75

Articles cited by this

Initial sequencing and analysis of the human genome. Nature (2001) 212.86

The human genome browser at UCSC. Genome Res (2002) 168.23

A map of human genome variation from population-scale sequencing. Nature (2010) 121.13

Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 96.15

A second generation human haplotype map of over 3.1 million SNPs. Nature (2007) 85.39

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature (2007) 75.09

Genome-wide mapping of in vivo protein-DNA interactions. Science (2007) 64.92

The complete genome of an individual by massively parallel DNA sequencing. Nature (2008) 52.81

Model-based analysis of ChIP-Seq (MACS). Genome Biol (2008) 51.63

The diploid genome sequence of an Asian individual. Nature (2008) 46.29

The diploid genome sequence of an individual human. PLoS Biol (2007) 44.80

A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature (2001) 42.18

Integrating common and rare genetic variation in diverse human populations. Nature (2010) 32.30

Adaptive protein evolution at the Adh locus in Drosophila. Nature (1991) 31.65

Paired-end mapping reveals extensive structural variation in the human genome. Science (2007) 30.46

Mapping and sequencing of structural variation from eight human genomes. Nature (2008) 30.28

Origins and functional impact of copy number variation in the human genome. Nature (2009) 23.63

The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science (2005) 17.00

A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol (2011) 16.53

GENCODE: producing a reference annotation for ENCODE. Genome Biol (2006) 15.08

Ensembl 2011. Nucleic Acids Res (2010) 14.68

A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature (2010) 13.82

Mapping copy number variation by population-scale genome sequencing. Nature (2011) 12.55

PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nat Biotechnol (2009) 11.28

Deletion polymorphism upstream of IRGM associated with altered IRGM expression and Crohn's disease. Nat Genet (2008) 9.52

Variation in transcription factor binding among humans. Science (2010) 9.33

Mechanisms of change in gene copy number. Nat Rev Genet (2009) 9.01

The first Korean genome sequence and analysis: full genome sequencing for a socio-ethnic group. Genome Res (2009) 7.87

Patterns of linkage disequilibrium in the human genome. Nat Rev Genet (2002) 7.09

Systematic meta-analyses and field synopsis of genetic association studies in schizophrenia: the SzGene database. Nat Genet (2008) 6.71

GtRNAdb: a database of transfer RNA genes detected in genomic sequence. Nucleic Acids Res (2008) 6.07

Heritable individual-specific and allele-specific chromatin signatures in humans. Science (2010) 5.94

Deletion of the late cornified envelope LCE3B and LCE3C genes as a susceptibility factor for psoriasis. Nat Genet (2009) 5.93

Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library. Nat Biotechnol (2009) 5.13

Gene loss, protein sequence divergence, gene dispensability, expression level, and interactivity are correlated in eukaryotic evolution. Genome Res (2003) 3.90

Accelerated evolution of conserved noncoding sequences in humans. Science (2006) 3.87

Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res (2006) 3.62

Close association of RNA polymerase II and many transcription factors with Pol III genes. Proc Natl Acad Sci U S A (2010) 2.90

Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability. Nucleic Acids Res (2005) 2.89

Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes. Nucleic Acids Res (2003) 2.85

The implications of structured 5' untranslated regions on translation and disease. Semin Cell Dev Biol (2004) 2.78

Non-coding RNAs: hope or hype? Trends Genet (2005) 2.74

Gene expression intensity shapes evolutionary rates of the proteins encoded by the vertebrate genome. Genetics (2004) 2.63

A stress-responsive RNA switch regulates VEGFA expression. Nature (2008) 2.58

Microdeletions and microinsertions causing human genetic disease: common mechanisms of mutagenesis and the role of local DNA sequence complexity. Hum Mutat (2005) 2.55

Large-scale analysis of pseudogenes in the human genome. Curr Opin Genet Dev (2004) 2.38

Annotating non-coding regions of the genome. Nat Rev Genet (2010) 2.38

The genetic architecture of Down syndrome phenotypes revealed by high-resolution analysis of human segmental trisomies. Proc Natl Acad Sci U S A (2009) 2.36

Analysis of copy number variants and segmental duplications in the human genome: Evidence for a change in the process of formation in recent evolutionary history. Genome Res (2008) 2.17

Meta-analysis of indels causing human genetic disease: mechanisms of mutagenesis and the role of local DNA sequence complexity. Hum Mutat (2003) 1.73

Signatures of purifying and local positive selection in human miRNAs. Am J Hum Genet (2009) 1.18

Articles by these authors

Paired-end mapping reveals extensive structural variation in the human genome. Science (2007) 30.46

The genomic complexity of primary human prostate cancer. Nature (2011) 14.06

Mapping copy number variation by population-scale genome sequencing. Nature (2011) 12.55

Personal omics profiling reveals dynamic molecular and medical phenotypes. Cell (2012) 12.32

A systematic survey of loss-of-function variants in human protein-coding genes. Science (2012) 12.25

Unlocking the secrets of the genome. Nature (2009) 11.80

PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nat Biotechnol (2009) 11.28

Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science (2010) 9.78

Variation in transcription factor binding among humans. Science (2010) 9.33

Performance comparison of exome DNA sequencing technologies. Nat Biotechnol (2011) 7.11

Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma. Nat Genet (2012) 7.00

Performance comparison of whole-genome sequencing platforms. Nat Biotechnol (2011) 5.79

Relating three-dimensional structures to protein networks provides evolutionary insights. Science (2006) 5.50

Nucleotide-resolution analysis of structural variants using BreakSeq and a breakpoint library. Nat Biotechnol (2009) 5.13

High-resolution mapping of DNA copy alterations in human chromosome 22 using high-density tiling oligonucleotide arrays. Proc Natl Acad Sci U S A (2006) 4.84

AlleleSeq: analysis of allele-specific expression and binding in a network framework. Mol Syst Biol (2011) 4.71

Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies. Genome Res (2007) 4.59

Assessment of whole genome amplification-induced bias through high-throughput, massively parallel whole genome sequencing. BMC Genomics (2006) 4.55

The GENCODE pseudogene resource. Genome Biol (2012) 4.18

PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. Genome Biol (2009) 4.18

An integrated approach for finding overlooked genes in yeast. Nat Biotechnol (2002) 3.88

Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution. Genome Res (2007) 3.82

Molecular characterization of neuroendocrine prostate cancer and identification of new drug targets. Cancer Discov (2011) 3.43

The reality of pervasive transcription. PLoS Biol (2011) 3.41

Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome. Proc Natl Acad Sci U S A (2007) 3.35

A comprehensive map of mobile element insertion polymorphisms in humans. PLoS Genet (2011) 3.14

Distinct genomic aberrations associated with ERG rearranged prostate cancer. Genes Chromosomes Cancer (2009) 3.03

Deciphering protein kinase specificity through large-scale analysis of yeast phosphorylation site motifs. Sci Signal (2010) 2.95

Quantifying environmental adaptation of metabolic pathways in metagenomics. Proc Natl Acad Sci U S A (2009) 2.89

FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data. Genome Biol (2010) 2.79

Structured RNAs in the ENCODE selected regions of the human genome. Genome Res (2007) 2.69

Publishing perishing? Towards tomorrow's information architecture. BMC Bioinformatics (2007) 2.55

Positive selection at the protein network periphery: evaluation in terms of structural constraints and cellular context. Proc Natl Acad Sci U S A (2007) 2.55

Discovery of non-ETS gene fusions in human prostate cancer using next-generation RNA sequencing. Genome Res (2010) 2.50

Tilescope: online analysis pipeline for high-density tiling microarray data. Genome Biol (2007) 2.48

Genome-wide identification of binding sites defines distinct functions for Caenorhabditis elegans PHA-4/FOXA in development and environmental response. PLoS Genet (2010) 2.46

Molecular sampling of prostate cancer: a dilemma for predicting disease progression. BMC Med Genomics (2010) 2.40

Toward a universal microarray: prediction of gene expression through nearest-neighbor probe sequence identification. Nucleic Acids Res (2007) 2.39

Diverse roles and interactions of the SWI/SNF chromatin remodeling complex revealed using global approaches. PLoS Genet (2011) 2.38

Annotating non-coding regions of the genome. Nat Rev Genet (2010) 2.38

The genetic architecture of Down syndrome phenotypes revealed by high-resolution analysis of human segmental trisomies. Proc Natl Acad Sci U S A (2009) 2.36

Diverse transcription factor binding features revealed by genome-wide ChIP-seq in C. elegans. Genome Res (2010) 2.30

The real cost of sequencing: higher than you think! Genome Biol (2011) 2.22

A myelopoiesis-associated regulatory intergenic noncoding RNA transcript within the human HOXA cluster. Blood (2009) 2.18

Analysis of copy number variants and segmental duplications in the human genome: Evidence for a change in the process of formation in recent evolutionary history. Genome Res (2008) 2.17

Assessing the need for sequence-based normalization in tiling microarray experiments. Bioinformatics (2007) 2.08

Statistical analysis of the genomic distribution and correlation of regulatory elements in the ENCODE regions. Genome Res (2007) 2.03

Assessing the performance of different high-density tiling microarray strategies for mapping transcribed regions of the human genome. Genome Res (2006) 1.95

An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge. Genome Biol (2014) 1.95

Network modeling identifies molecular functions targeted by miR-204 to suppress head and neck tumor metastasis. PLoS Comput Biol (2010) 1.92

Integrated analysis of experimental data sets reveals many novel promoters in 1% of the human genome. Genome Res (2007) 1.82

Negligible impact of rare autoimmune-locus coding-region variants on missing heritability. Nature (2013) 1.79

Identification of a disease-defining gene fusion in epithelioid hemangioendothelioma. Sci Transl Med (2011) 1.79

Bayesian modeling of the yeast SH3 domain interactome predicts spatiotemporal dynamics of endocytosis proteins. PLoS Biol (2009) 1.78

Systematic identification of synergistic drug pairs targeting HIV. Nat Biotechnol (2012) 1.77

A computational approach for identifying pseudogenes in the ENCODE regions. Genome Biol (2006) 1.74

Detecting and annotating genetic variations using the HugeSeq pipeline. Nat Biotechnol (2012) 1.72

N-myc downstream regulated gene 1 (NDRG1) is fused to ERG in prostate cancer. Neoplasia (2009) 1.68

High-resolution copy-number variation map reflects human olfactory receptor diversity and evolution. PLoS Genet (2008) 1.60

The ambiguous boundary between genes and pseudogenes: the dead rise up, or do they? Trends Genet (2007) 1.57

Substrate discrimination among mitogen-activated protein kinases through distinct docking sequence motifs. J Biol Chem (2008) 1.57

Chromatin state signatures associated with tissue-specific gene expression and enhancer activity in the embryonic limb. Genome Res (2012) 1.57

The DART classification of unannotated transcription within the ENCODE regions: associating transcription with known and novel loci. Genome Res (2007) 1.56

SenseLab: new developments in disseminating neuroscience information. Brief Bioinform (2007) 1.56

Targeting the human cancer pathway protein interaction network by structural genomics. Mol Cell Proteomics (2008) 1.54

Dynamic and complex transcription factor binding during an inducible response in yeast. Genes Dev (2009) 1.43

Epigenetic repression of miR-31 disrupts androgen receptor homeostasis and contributes to prostate cancer progression. Cancer Res (2012) 1.40

Construction and analysis of an integrated regulatory network derived from high-throughput sequencing data. PLoS Comput Biol (2011) 1.39

Microbial communities of the upper respiratory tract and otitis media in children. MBio (2011) 1.38

The current excitement about copy-number variation: how it relates to gene duplications and protein families. Curr Opin Struct Biol (2008) 1.36

Prediction and characterization of noncoding RNAs in C. elegans by integrating conservation, secondary structure, and high-throughput sequencing and array data. Genome Res (2010) 1.35

Rewiring of transcriptional regulatory networks: hierarchy, rather than connectivity, better reflects the importance of regulators. Sci Signal (2010) 1.35

Measuring the evolutionary rewiring of biological networks. PLoS Comput Biol (2011) 1.35

Robust-linear-model normalization to reduce technical variability in functional protein microarrays. J Proteome Res (2009) 1.29

Analysis of diverse regulatory networks in a hierarchical context shows consistent tendencies for collaboration in the middle levels. Proc Natl Acad Sci U S A (2010) 1.24

Comparing classical pathways and modern networks: towards the development of an edge ontology. Trends Biochem Sci (2007) 1.23

Understanding modularity in molecular networks requires dynamics. Sci Signal (2009) 1.22

Pervasive and dynamic protein binding sites of the mRNA transcriptome in Saccharomyces cerevisiae. Genome Biol (2013) 1.20

Integrated assessment of genomic correlates of protein evolutionary rate. PLoS Comput Biol (2009) 1.20

Getting started in gene orthology and functional analysis. PLoS Comput Biol (2010) 1.20

Computational analysis of membrane proteins: the largest class of drug targets. Drug Discov Today (2009) 1.19

Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants. PLoS Comput Biol (2009) 1.18

Upper respiratory tract microbial communities, acute otitis media pathogens, and antibiotic use in healthy and sick children. Appl Environ Microbiol (2012) 1.17

Analysis of membrane proteins in metagenomics: networks of correlated environmental features and protein families. Genome Res (2010) 1.13

HingeMaster: normal mode hinge prediction approach and integration of complementary predictors. Proteins (2008) 1.13

Large-scale mutagenesis of the yeast genome using a Tn7-derived multipurpose transposon. Genome Res (2004) 1.11

Comprehensive analysis of the pseudogenes of glycolytic enzymes in vertebrates: the anomalously high number of GAPDH pseudogenes highlights a recent burst of retrotranspositional activity. BMC Genomics (2009) 1.10

Tiling genomes of pathogenic viruses identifies potent antiviral shRNAs and reveals a role for secondary structure in shRNA efficacy. Proc Natl Acad Sci U S A (2012) 1.09

Hinge Atlas: relating protein sequence to sites of structural flexibility. BMC Bioinformatics (2007) 1.06

Pseudofam: the pseudogene families database. Nucleic Acids Res (2008) 1.06

StoneHinge: hinge prediction by network analysis of individual protein structures. Protein Sci (2009) 1.06

Small RNAs originated from pseudogenes: cis- or trans-acting? PLoS Comput Biol (2009) 1.05

Fluorescence anisotropy studies on the Ku-DNA interaction: anion and cation effects. J Biol Chem (2004) 1.02