A new strategy for genome assembly using short sequence reads and reduced representation libraries.

PubWeight™: 1.22 | Rank: Top 10%

Article: PMC 2813480

Published in Genome Res on February 01, 2010

Authors

Andrew L Young (1), Hatice Ozel Abaan, Daniel Zerbino, James C Mullikin, Ewan Birney, Elliott H Margulies

Author Affiliations

1: Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA.

Articles citing this

Reference-guided assembly of four diverse Arabidopsis thaliana genomes. Proc Natl Acad Sci U S A (2011) 2.67

Challenges of sequencing human genomes. Brief Bioinform (2010) 2.39

A strategy for direct mapping and identification of mutations by whole-genome sequencing. Genetics (2010) 2.08

Reconstructing ancient genomes and epigenomes. Nat Rev Genet (2015) 1.06

Levenshtein error-correcting barcodes for multiplexed DNA sequencing. BMC Bioinformatics (2013) 0.91

Positional information resolves structural variations and uncovers an evolutionarily divergent genetic locus in accessions of Arabidopsis thaliana. Genome Biol Evol (2011) 0.91

Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms. BMC Genomics (2014) 0.88

Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum. Plant J (2014) 0.87

A combinatorial approach to the restriction of a mouse genome. BMC Res Notes (2013) 0.84

SNP-based genetic linkage map of tobacco (Nicotiana tabacum L.) using next-generation RAD sequencing. J Biol Res (Thessalon) (2015) 0.82

Multiple changes in peptide and lipid expression associated with regeneration in the nervous system of the medicinal leech. PLoS One (2011) 0.81

Draft Genome Sequence of Kurthia huakuii LAM0618T, an Organic-Pollutant-Degrading Strain Isolated from Biogas Slurry. Genome Announc (2014) 0.77

Single-nucleotide polymorphism identification and genotyping in Camelina sativa. Mol Breed (2015) 0.76

Articles cited by this

Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res (2008) 151.16

Accurate whole human genome sequencing using reversible terminator chemistry. Nature (2008) 90.20

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature (2007) 75.09

The genome sequence of Drosophila melanogaster. Science (2000) 74.32

SSAHA: a fast search method for large DNA databases. Genome Res (2001) 48.64

The diploid genome sequence of an Asian individual. Nature (2008) 46.29

ABySS: a parallel assembler for short read sequence data. Genome Res (2009) 43.20

DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature (2008) 38.13

Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals. Nature (2005) 31.60

ALLPATHS: de novo assembly of whole-genome shotgun microreads. Genome Res (2008) 20.61

An SNP map of the human genome generated by reduced representation shotgun sequencing. Nature (2000) 19.19

Evolution of genes and genomes on the Drosophila phylogeny. Nature (2007) 18.01

The phusion assembler. Genome Res (2003) 15.25

De novo bacterial genome sequencing: millions of very short reads assembled on a desktop computer. Genome Res (2008) 14.90

Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature (2007) 11.66

FlyBase: genes and gene models. Nucleic Acids Res (2005) 11.15

In vivo enhancer analysis of human conserved non-coding sequences. Nature (2006) 10.60

The UCSC Genome Browser Database: update 2009. Nucleic Acids Res (2008) 10.31

Whole-genome sequencing and assembly with high-throughput, short-read technologies. PLoS One (2007) 8.70

SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries. Nat Methods (2008) 8.26

Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatic genome sequence. Genome Biol (2002) 8.07

Minimus: a fast, lightweight genome assembler. BMC Bioinformatics (2007) 7.65

Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res (2007) 7.05

Intraspecific nuclear DNA variation in Drosophila. Mol Biol Evol (1996) 6.29

Systematic discovery of regulatory motifs in conserved regions of the human genome, including thousands of CTCF insulator sites. Proc Natl Acad Sci U S A (2007) 4.91

A model of the statistical power of comparative genome sequence analysis. PLoS Biol (2005) 4.54

An intermediate grade of finished genomic sequence suitable for comparative analyses. Genome Res (2004) 4.38

Construction of a general human chromosome jumping library, with application to cystic fibrosis. Science (1987) 3.87

Targeted high-throughput sequencing of tagged nucleic acid samples. Nucleic Acids Res (2007) 3.56

Geographical distribution and diversity of bacteria associated with natural populations of Drosophila melanogaster. Appl Environ Microbiol (2007) 1.80

Drosophila germline sex determination: integration of germline autonomous cues and somatic signals. Curr Top Dev Biol (2008) 0.88

A directional recombination cloning system for restriction- and ligation-free construction of GFP, DsRed, and lacZ transgenic Drosophila reporters. Gene (2007) 0.87

Articles by these authors

Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res (2008) 151.16

Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 96.15

Accurate whole human genome sequencing using reversible terminator chemistry. Nature (2008) 90.20

A second generation human haplotype map of over 3.1 million SNPs. Nature (2007) 85.39

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature (2007) 75.09

The Bioperl toolkit: Perl modules for the life sciences. Genome Res (2002) 58.63

The Pfam protein families database. Nucleic Acids Res (2002) 51.34

Patterns of somatic mutation in human cancer genomes. Nature (2007) 38.41

Mapping and sequencing of structural variation from eight human genomes. Nature (2008) 30.28

Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics (2005) 24.54

Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature (2004) 24.40

A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis. Nat Biotechnol (2008) 21.72

The genome sequence of the malaria mosquito Anopheles gambiae. Science (2002) 20.36

International network of cancer genome projects. Nature (2010) 20.35

A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature (2009) 18.39

EnsMart: a generic system for fast and flexible access to biological data. Genome Res (2004) 17.64

Genome-wide detection and characterization of positive selection in human populations. Nature (2007) 17.27

Evolutionary and biomedical insights from the rhesus macaque genome. Science (2007) 16.21

High-resolution mapping and characterization of open chromatin across the genome. Cell (2008) 15.93

Reactome knowledgebase of human biological pathways and processes. Nucleic Acids Res (2008) 15.69

The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes. Genome Res (2009) 14.90

Ensembl 2011. Nucleic Acids Res (2010) 14.68

The International Protein Index: an integrated database for proteomics experiments. Proteomics (2004) 14.67

Ensembl 2012. Nucleic Acids Res (2011) 14.55

Reactome: a knowledge base of biologic pathways and processes. Genome Biol (2007) 13.36

The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics. PLoS Biol (2003) 13.32

EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res (2008) 12.72

Ensembl 2014. Nucleic Acids Res (2013) 12.62

Prepublication data sharing. Nature (2009) 12.24

Ensembl 2013. Nucleic Acids Res (2012) 11.70

Optimized design and assessment of whole genome tiling arrays. Bioinformatics (2007) 11.38

Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res (2010) 11.23

Ensembl's 10th year. Nucleic Acids Res (2009) 10.82

Mouse genomic variation and its effect on phenotypes and gene regulation. Nature (2011) 10.66

Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics (2012) 9.68

Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science (2002) 9.43

Genome sequence of Aedes aegypti, a major arbovirus vector. Science (2007) 9.19

The BioPAX community standard for pathway data sharing. Nat Biotechnol (2010) 9.19

Complete Khoisan and Bantu genomes from southern Africa. Nature (2010) 9.06

Accurate and comprehensive sequencing of personal genomes. Genome Res (2011) 8.99

A high-resolution map of human evolutionary constraint using 29 mammals. Nature (2011) 8.67

The Reactome pathway knowledgebase. Nucleic Acids Res (2013) 8.56

A mosaic activating mutation in AKT1 associated with the Proteus syndrome. N Engl J Med (2011) 8.26

Focused evolution of HIV-1 neutralizing antibodies revealed by structures and deep sequencing. Science (2011) 7.92

Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res (2008) 7.35

The Ensembl core software libraries. Genome Res (2004) 7.30

The HGNC Database in 2008: a resource for the human genome. Nucleic Acids Res (2007) 7.29

EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biol (2006) 7.06

Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res (2007) 7.05

The DNA sequence of the human X chromosome. Nature (2005) 6.97

The ClinSeq Project: piloting large-scale genome sequencing for research in genomic medicine. Genome Res (2009) 6.83

Measurement of the human allele frequency spectrum demonstrates greater genetic drift in East Asians than in Europeans. Nat Genet (2007) 6.63

Integrating biological data--the Distributed Annotation System. BMC Bioinformatics (2008) 6.56

The mosaic structure of variation in the laboratory mouse genome. Nature (2002) 6.54

The European Nucleotide Archive. Nucleic Acids Res (2010) 6.48

Challenges and standards in integrating surveys of structural variation. Nat Genet (2007) 6.05

Heritable individual-specific and allele-specific chromatin signatures in humans. Science (2010) 5.94

Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing. Genome Res (2010) 5.76

Genome analysis of the platypus reveals unique signatures of evolution. Nature (2008) 5.74

Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Res (2005) 5.71

The landscape of histone modifications across 1% of the human genome in five human cell lines. Genome Res (2007) 5.67

The NGS WikiBook: a dynamic collaborative online training effort with long-term sustainability. Brief Bioinform (2013) 5.67

Efficient storage of high throughput DNA sequencing data using reference-based compression. Genome Res (2011) 5.60

Immunity-related genes and gene families in Anopheles gambiae. Science (2002) 5.47

Co-evolution of a broadly neutralizing HIV-1 antibody and founder virus. Nature (2013) 5.35

Petabyte-scale innovations at the European Nucleotide Archive. Nucleic Acids Res (2008) 5.21

The genomic basis of adaptive evolution in threespine sticklebacks. Nature (2012) 5.20

Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res (2008) 5.12

Improvements to services at the European Nucleotide Archive. Nucleic Acids Res (2009) 5.00

A physical map of the mouse genome. Nature (2002) 4.97

An integrated resource for genome-wide identification and analysis of human tissue-specific differentially methylated regions (tDMRs). Genome Res (2008) 4.84

Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Genome Res (2012) 4.80

Early-onset stroke and vasculopathy associated with mutations in ADA2. N Engl J Med (2014) 4.70

High-resolution genome-wide in vivo footprinting of diverse transcription factors in human cells. Genome Res (2010) 4.69

A database and API for variation, dense genotyping and resequencing data. BMC Bioinformatics (2010) 4.68

Initial sequence and comparative analysis of the cat genome. Genome Res (2007) 4.67

Pebble and rock band: heuristic resolution of repeats and scaffolding in the velvet short-read de novo assembler. PLoS One (2009) 4.60

Secondary variants in individuals undergoing exome sequencing: screening of 572 individuals identifies high-penetrance mutations in cancer-susceptibility genes. Am J Hum Genet (2012) 4.45

Sense from sequence reads: methods for alignment and assembly. Nat Methods (2009) 4.44

Open chromatin defined by DNaseI and FAIRE identifies regulatory elements that shape cell-type identity. Genome Res (2011) 4.43

Completing the map of human genetic variation. Nature (2007) 4.38

An intermediate grade of finished genomic sequence suitable for comparative analyses. Genome Res (2004) 4.38

An initial strategy for the systematic identification of functional elements in the human genome by low-redundancy comparative sequencing. Proc Natl Acad Sci U S A (2005) 4.38

Locus Reference Genomic sequences: an improved basis for describing human DNA variants. Genome Med (2010) 4.19

Local DNA topography correlates with functional noncoding regions of the human genome. Science (2009) 4.18

Insights into hominid evolution from the gorilla genome sequence. Nature (2012) 4.12

The genome of the ctenophore Mnemiopsis leidyi and its implications for cell type evolution. Science (2013) 4.07

The implications of alternative splicing in the ENCODE protein complement. Proc Natl Acad Sci U S A (2007) 3.93

Priorities for nucleotide trace, sequence and annotation data capture at the Ensembl Trace Archive and the EMBL Nucleotide Sequence Database. Nucleic Acids Res (2007) 3.84