ALLPATHS: de novo assembly of whole-genome shotgun microreads.

PubWeight™: 20.61 | Rank: Top 0.1% | All-Time Top 10000

Article: PMC 2336810

Published in Genome Res on March 13, 2008

Authors

Jonathan Butler (1), Iain MacCallum, Michael Kleber, Ilya A Shlyakhter, Matthew K Belmonte, Eric S Lander, Chad Nusbaum, David B Jaffe

Author Affiliations

1: Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02141, USA.

Articles citing this

(truncated to the top 100)

SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol (2012) 62.36

Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol (2011) 53.86

De novo assembly of human genomes with massively parallel short read sequencing. Genome Res (2009) 45.91

ABySS: a parallel assembler for short read sequence data. Genome Res (2009) 43.20

Sequencing technologies - the next generation. Nat Rev Genet (2009) 40.57

High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A (2010) 22.97

Quake: quality-aware detection and correction of sequencing errors. Genome Biol (2010) 12.52

Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics (2012) 9.68

Assembly algorithms for next-generation sequencing data. Genomics (2010) 8.56

Sequencing of natural strains of Arabidopsis thaliana with short reads. Genome Res (2008) 8.44

Assemblathon 1: a competitive assessment of de novo short read assembly methods. Genome Res (2011) 8.38

Genotype and SNP calling from next-generation sequencing data. Nat Rev Genet (2011) 8.34

Annotating genomes with massive-scale RNA sequencing. Genome Biol (2008) 7.73

De novo fragment assembly with short mate-paired reads: Does the read length matter? Genome Res (2008) 7.66

MetaGeneAnnotator: detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes. DNA Res (2008) 6.82

ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads. Genome Biol (2009) 6.76

Ray: simultaneous assembly of reads from a mix of high-throughput sequencing technologies. J Comput Biol (2010) 6.38

Assembly of large genomes using second-generation sequencing. Genome Res (2010) 5.94

Next-generation transcriptome assembly. Nat Rev Genet (2011) 5.89

De novo assembly and genotyping of variants using colored de Bruijn graphs. Nat Genet (2012) 5.61

Using the Velvet de novo assembler for short-read sequencing technologies. Curr Protoc Bioinformatics (2010) 4.88

SOPRA: Scaffolding algorithm for paired reads via statistical optimization. BMC Bioinformatics (2010) 4.76

Pebble and rock band: heuristic resolution of repeats and scaffolding in the Velvet short-read de novo assembler. PLoS One (2009) 4.60

Genome assembly reborn: recent computational challenges. Brief Bioinform (2009) 4.53

Sense from sequence reads: methods for alignment and assembly. Nat Methods (2009) 4.44

Optimization of de novo transcriptome assembly from next-generation sequencing data. Genome Res (2010) 4.18

Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology. PLoS One (2012) 4.18

Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. Gigascience (2013) 4.11

Finished bacterial genomes from shotgun sequence data. Genome Res (2012) 3.86

Assembly complexity of prokaryotic genomes using short reads. BMC Bioinformatics (2010) 3.45

Sequence-based discovery of Bradyrhizobium enterica in cord colitis syndrome. N Engl J Med (2013) 3.39

How to apply de Bruijn graphs to genome assembly. Nat Biotechnol (2011) 3.36

Application of 'next-generation' sequencing technologies to microbial genetics. Nat Rev Microbiol (2009) 3.30

A hybrid approach for the automated finishing of bacterial genomes. Nat Biotechnol (2012) 3.29

RNA-seq: from technology to biology. Cell Mol Life Sci (2009) 3.03

De novo genome sequence assembly of a filamentous fungus using Sanger, 454 and Illumina sequence data. Genome Biol (2009) 2.99

Crystallizing short-read assemblies around seeds. BMC Bioinformatics (2009) 2.89

PRICE: software for the targeted assembly of components of (Meta) genomic sequence data. G3 (Bethesda) (2013) 2.85

Sensitive, specific polymorphism discovery in bacteria using massively parallel sequencing. Nat Methods (2008) 2.61

Characterization of the sesame (Sesamum indicum L.) global transcriptome using Illumina paired-end sequencing and development of EST-SSR markers. BMC Genomics (2011) 2.53

Paired de Bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers. J Comput Biol (2011) 2.47

Personal genome sequencing: current approaches and challenges. Genes Dev (2010) 2.38

Comparing de novo genome assembly: the long and short of it. PLoS One (2011) 2.37

Efficient frequency-based de novo short-read clustering for error trimming in next-generation sequencing. Genome Res (2009) 2.36

Bambus 2: scaffolding metagenomes. Bioinformatics (2011) 2.24

Maximum likelihood genome assembly. J Comput Biol (2009) 2.24

Sequence assembly demystified. Nat Rev Genet (2013) 2.09

Gene-boosted assembly of a novel bacterial genome from very short reads. PLoS Comput Biol (2008) 2.04

Assembling genomes using short-read sequencing technology. Genome Biol (2010) 1.95

Efficient counting of k-mers in DNA sequences using a bloom filter. BMC Bioinformatics (2011) 1.76

Evaluation and validation of de novo and hybrid assembly techniques to derive high-quality genome sequences. Bioinformatics (2014) 1.73

ECHO: a reference-free short-read error correction algorithm. Genome Res (2011) 1.68

A draft genome sequence and functional screen reveals the repertoire of type III secreted proteins of Pseudomonas syringae pathovar tabaci 11528. BMC Genomics (2009) 1.67

Fast scaffolding with small independent mixed integer programs. Bioinformatics (2011) 1.59

Computational solutions for omics data. Nat Rev Genet (2013) 1.58

Detection and correction of false segmental duplications caused by genome mis-assembly. Genome Biol (2010) 1.54

CGAL: computing genome assembly likelihoods. Genome Biol (2013) 1.53

Parallel short sequence assembly of transcriptomes. BMC Bioinformatics (2009) 1.47

QSRA: a quality-value guided de novo short read assembler. BMC Bioinformatics (2009) 1.46

Next-generation sequencing of vertebrate experimental organisms. Mamm Genome (2009) 1.42

SEQuel: improving the accuracy of genome assemblies. Bioinformatics (2012) 1.42

Analysis of quality raw data of second generation sequencers with Quality Assessment Software. BMC Res Notes (2011) 1.40

Meraculous: de novo genome assembly with short paired-end reads. PLoS One (2011) 1.37

Calling SNPs without a reference sequence. BMC Bioinformatics (2010) 1.36

Metagenomics: Facts and Artifacts, and Computational Challenges. J Comput Sci Technol (2009) 1.35

Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey. BMC Genomics (2009) 1.31

An ensemble strategy that significantly improves de novo assembly of microbial genomes from metagenomic next-generation sequencing data. Nucleic Acids Res (2015) 1.30

A gene-by-gene population genomics platform: de novo assembly, annotation and genealogical analysis of 108 representative Neisseria meningitidis genomes. BMC Genomics (2014) 1.27

Next-generation sequencing and large genome assemblies. Pharmacogenomics (2012) 1.27

Denoising DNA deep sequencing data-high-throughput sequencing errors and their correction. Brief Bioinform (2015) 1.22

Estimation of sequencing error rates in short reads. BMC Bioinformatics (2012) 1.22

A new strategy for genome assembly using short sequence reads and reduced representation libraries. Genome Res (2010) 1.22

Next-generation sequencing techniques for eukaryotic microorganisms: sequencing-based solutions to biological problems. Eukaryot Cell (2010) 1.18

Comprehensive variation discovery in single human genomes. Nat Genet (2014) 1.18

Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants. PLoS Comput Biol (2009) 1.18

De Novo assembly of the complete genome of an enhanced electricity-producing variant of Geobacter sulfurreducens using only short reads. PLoS One (2010) 1.15

Complete genome sequences for 59 Burkholderia isolates, both pathogenic and near neighbor. Genome Announc (2015) 1.14

Pathset graphs: a novel approach for comprehensive utilization of paired reads in genome assembly. J Comput Biol (2012) 1.14

ATHLATES: accurate typing of human leukocyte antigen through exome sequencing. Nucleic Acids Res (2013) 1.13

Assessing the benefits of using mate-pairs to resolve repeats in de novo short-read prokaryotic assemblies. BMC Bioinformatics (2011) 1.13

De novo assembly, gene annotation and marker development using Illumina paired-end transcriptome sequences in celery (Apium graveolens L.). PLoS One (2013) 1.13

Metagenomics: tools and insights for analyzing next-generation sequencing data derived from biodiversity studies. Bioinform Biol Insights (2015) 1.13

Effects of GC bias in next-generation-sequencing data on de novo genome assembly. PLoS One (2013) 1.13

ExSPAnder: a universal repeat resolver for DNA fragment assembly. Bioinformatics (2014) 1.09

Semi-automatic in silico gap closure enabled de novo assembly of two Dehalobacter genomes from metagenomic data. PLoS One (2012) 1.08

Substantial deletion overlap among divergent Arabidopsis genomes revealed by intersection of short reads and tiling arrays. Genome Biol (2010) 1.08

Parallelized short read assembly of large genomes using de Bruijn graphs. BMC Bioinformatics (2011) 1.08

Next-generation sequence assembly: four stages of data processing and computational challenges. PLoS Comput Biol (2013) 1.07

Extending reference assembly models. Genome Biol (2015) 1.06

AGORA: Assembly Guided by Optical Restriction Alignment. BMC Bioinformatics (2012) 1.06

PAVE: program for assembling and viewing ESTs. BMC Genomics (2009) 1.05

Improving de novo sequence assembly using machine learning and comparative genomics for overlap correction. BMC Bioinformatics (2010) 1.05

Mutation detection with next-generation resequencing through a mediator genome. PLoS One (2010) 1.05

Draft Genome Sequence of Methylomicrobium buryatense Strain 5G, a Haloalkaline-Tolerant Methanotrophic Bacterium. Genome Announc (2013) 1.04

Evaluation of short read metagenomic assembly. BMC Genomics (2011) 1.03

Bridger: a new framework for de novo transcriptome assembly using RNA-seq data. Genome Biol (2015) 1.03

1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life. Nat Biotechnol (2017) 1.01

Draft Genome Assembly of Acinetobacter baumannii ATCC 19606. Genome Announc (2014) 1.01

Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication. Gigascience (2014) 1.01

Transcriptome analysis of the roots at early and late seedling stages using Illumina paired-end sequencing and development of EST-SSR markers in radish. Plant Cell Rep (2012) 1.00

Articles by these authors

Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A (2005) 167.46

Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 96.15

Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature (2007) 75.09

Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature (2007) 65.18

Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol (2011) 53.86

PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet (2003) 53.59

Model-based analysis of ChIP-Seq (MACS). Genome Biol (2008) 51.63

The structure of haplotype blocks in the human genome. Science (2002) 50.88

A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell (2006) 48.80

Integrative genomics viewer. Nat Biotechnol (2011) 42.83

Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature (2009) 35.48

The landscape of somatic copy-number alteration across human cancers. Nature (2010) 31.88

Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals. Nature (2005) 31.60

Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature (2008) 30.29

Somatic mutations affect key pathways in lung adenocarcinoma. Nature (2008) 30.02

Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science (2009) 29.83

Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature (2003) 29.16

Transcriptional regulatory code of a eukaryotic genome. Nature (2004) 27.21

Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat Biotechnol (2009) 27.17

The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science (2006) 25.99

Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature (2005) 23.04

High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A (2010) 22.97

ARACHNE: a whole-genome shotgun assembler. Genome Res (2002) 22.72

Detecting recent positive selection in the human genome from haplotype structure. Nature (2002) 22.00

A molecular signature of metastasis in primary solid tumors. Nat Genet (2002) 21.36

Many human large intergenic noncoding RNAs associate with chromatin-modifying complexes and affect gene expression. Proc Natl Acad Sci U S A (2009) 20.66

International network of cancer genome projects. Nature (2010) 20.35

Genomic maps and comparative analysis of histone modifications in human and mouse. Cell (2005) 18.96

Assessing the significance of chromosomal aberrations in cancer: methodology and application to glioma. Proc Natl Acad Sci U S A (2007) 18.83

A lentiviral RNAi library for human and mouse genes applied to an arrayed viral high-content screen. Cell (2006) 18.81

Ab initio reconstruction of cell type-specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs. Nat Biotechnol (2010) 18.44

The mammalian epigenome. Cell (2007) 18.13

Evolution of genes and genomes on the Drosophila phylogeny. Nature (2007) 18.01

Initial genome sequencing and analysis of multiple myeloma. Nature (2011) 17.28

Genome-wide detection and characterization of positive selection in human populations. Nature (2007) 17.27

Risk alleles for multiple sclerosis identified by a genomewide study. N Engl J Med (2007) 17.06

The mutational landscape of head and neck squamous cell carcinoma. Science (2011) 16.88

Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet (2003) 16.51

Characterizing the cancer genome in lung adenocarcinoma. Nature (2007) 16.48

Dissecting direct reprogramming through integrative genomic analysis. Nature (2008) 16.47

Assessing the impact of population stratification on genetic association studies. Nat Genet (2004) 16.28

Gene expression correlates of clinical prostate cancer behavior. Cancer Cell (2002) 16.27

Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol (2013) 16.13

Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature (2002) 15.36

Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans. Cell (2006) 15.12

Genetic mapping in human disease. Science (2008) 15.12

The genomic complexity of primary human prostate cancer. Nature (2011) 14.06

Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med (2002) 14.01

The NIH Roadmap Epigenomics Mapping Consortium. Nat Biotechnol (2010) 13.99

MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nat Genet (2001) 13.79

Chromosome Conformation Capture Carbon Copy (5C): a massively parallel solution for mapping interactions between genomic elements. Genome Res (2006) 13.32

A landscape of driver mutations in melanoma. Cell (2012) 12.61

High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat Methods (2008) 12.56

Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature (2004) 12.32

Whole-genome sequence assembly for mammalian genomes: Arachne 2. Genome Res (2003) 12.30

A large intergenic noncoding RNA induced by p53 mediates global gene repression in the p53 response. Cell (2010) 12.27

Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing. Cell (2012) 11.69

Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature (2009) 11.46

The genome sequence of the filamentous fungus Neurospora crassa. Nature (2003) 11.39

lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature (2011) 11.31

The mystery of missing heritability: Genetic interactions create phantom heritability. Proc Natl Acad Sci U S A (2012) 11.23

Genomewide analysis of PRC1 and PRC2 occupancy identifies two classes of bivalent domains. PLoS Genet (2008) 11.17

SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N Engl J Med (2011) 11.07

Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. Nature (2004) 11.03

Absolute quantification of somatic DNA alterations in human cancer. Nat Biotechnol (2012) 10.87

Genetic screens in human cells using the CRISPR-Cas9 system. Science (2013) 10.75

Detecting novel associations in large data sets. Science (2011) 10.60

Quality scores and SNP detection in sequencing-by-synthesis systems. Genome Res (2008) 10.49

Reactive oxygen species have a causal role in multiple forms of insulin resistance. Nature (2006) 10.07

Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nat Genet (2012) 9.93

Evolution and impact of subclonal mutations in chronic lymphocytic leukemia. Cell (2013) 9.24

Genome sequence of Aedes aegypti, a major arbovirus vector. Science (2007) 9.19

Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries. Genome Biol (2011) 9.18