MUSCLE: a multiple sequence alignment method with reduced time and space complexity.

PubWeight™: 50.89 | Rank: Top 0.01% | All-Time Top 1000

Article: PMC 517706

Published in BMC Bioinformatics on August 19, 2004

Authors

Robert C Edgar (1)

Author Affiliations

1: Department of Plant and Microbial Biology, 461 Koshland Hall, University of California, Berkeley, CA 94720-3102, USA. bob@drive5.com

Articles citing this

(truncated to the top 100)

MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol (2011) 220.97

Microbial diversity in the deep sea and the underexplored "rare biosphere". Proc Natl Acad Sci U S A (2006) 42.38

MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res (2005) 31.64

UniFrac--an online tool for comparing microbial community diversity in a phylogenetic context. BMC Bioinformatics (2006) 25.13

progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One (2010) 24.31

Ensembl 2007. Nucleic Acids Res (2006) 20.10

Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res (2008) 19.09

PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics (2009) 15.93

EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res (2008) 12.72

TreeFam: a curated database of phylogenetic trees of animal gene families. Nucleic Acids Res (2006) 8.83

Salmonella enterica serovar typhimurium exploits inflammation to compete with the intestinal microbiota. PLoS Biol (2007) 7.58

A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing. Cell (2008) 6.98

TreeFam: 2008 Update. Nucleic Acids Res (2007) 6.63

SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics (2012) 6.40

An algorithm for progressive multiple alignment of sequences with insertions. Proc Natl Acad Sci U S A (2005) 6.26

Fast statistical alignment. PLoS Comput Biol (2009) 5.92

Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. PLoS Biol (2006) 5.44

Microbial co-occurrence relationships in the human microbiome. PLoS Comput Biol (2012) 5.35

Genome sequence of the pea aphid Acyrthosiphon pisum. PLoS Biol (2010) 5.33

MEROPS: the peptidase database. Nucleic Acids Res (2009) 5.33

ALTER: program-oriented conversion of DNA and protein alignments. Nucleic Acids Res (2010) 5.32

Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res (2008) 5.12

Assessing the root of bilaterian animals with scalable phylogenomic methods. Proc Biol Sci (2009) 5.08

Recent evolutions of multiple sequence alignment algorithms. PLoS Comput Biol (2007) 4.88

Phylogenetic and functional assessment of orthologs inference projects and methods. PLoS Comput Biol (2009) 4.65

The Arabidopsis lyrata genome sequence and the basis of rapid genome size change. Nat Genet (2011) 4.65

PATRIC: the comprehensive bacterial bioinformatics resource with a focus on human pathogenic species. Infect Immun (2011) 4.57

MEROPS: the peptidase database. Nucleic Acids Res (2007) 4.32

PILER-CR: fast and accurate identification of CRISPR repeats. BMC Bioinformatics (2007) 4.28

MEROPS: the peptidase database. Nucleic Acids Res (2006) 4.20

The iPlant Collaborative: Cyberinfrastructure for Plant Biology. Front Plant Sci (2011) 4.16

Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs). Nucleic Acids Res (2005) 4.15

Widespread genome duplications throughout the history of flowering plants. Genome Res (2006) 4.07

Analysis of homeodomain specificities allows the family-wide prediction of preferred recognition sites. Cell (2008) 3.96

The human phylome. Genome Biol (2007) 3.81

Web services at the European Bioinformatics Institute-2009. Nucleic Acids Res (2009) 3.78

eggNOG v4.0: nested orthology inference across 3686 organisms. Nucleic Acids Res (2013) 3.77

Epochal evolution of GGII.4 norovirus capsid proteins from 1995 to 2006. J Virol (2007) 3.71

Accelerated evolution of the Prdm9 speciation gene across diverse metazoan taxa. PLoS Genet (2009) 3.60

Genome analysis linking recent European and African influenza (H5N1) viruses. Emerg Infect Dis (2007) 3.53

Discovery of the principal specific transcription factors of Apicomplexa and their implication for the evolution of the AP2-integrase DNA binding domains. Nucleic Acids Res (2005) 3.36

PhylomeDB: a database for genome-wide collections of gene phylogenies. Nucleic Acids Res (2007) 3.30

HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot. Nucleic Acids Res (2008) 3.29

The effects of alignment quality, distance calculation method, sequence filtering, and region on the analysis of 16S rRNA gene-based studies. PLoS Comput Biol (2010) 3.26

Genome-scale phylogenetic analyses of chikungunya virus reveal independent emergences of recent epidemics and various evolutionary rates. J Virol (2010) 3.18

Long intervals of stasis punctuated by bursts of positive selection in the seasonal evolution of influenza A virus. Biol Direct (2006) 3.18

A specificity map for the PDZ domain family. PLoS Biol (2008) 3.16

Comparative functional analysis of the Caenorhabditis elegans and Drosophila melanogaster proteomes. PLoS Biol (2009) 3.09

A high-throughput DNA sequence aligner for microbial ecology studies. PLoS One (2009) 3.07

PATRIC: the VBI PathoSystems Resource Integration Center. Nucleic Acids Res (2006) 3.06

Toward defining the autoimmune microbiome for type 1 diabetes. ISME J (2010) 3.03

Using sequence similarity networks for visualization of relationships across diverse protein superfamilies. PLoS One (2009) 3.03

Kalign2: high-performance multiple alignment of protein and nucleotide sequences allowing external features. Nucleic Acids Res (2008) 2.97

The ctenophore genome and the evolutionary origins of neural systems. Nature (2014) 2.96

Graemlin: general and robust alignment of multiple large interaction networks. Genome Res (2006) 2.92

Odor coding by a Mammalian receptor repertoire. Sci Signal (2009) 2.84

The complete genome sequence of Haloferax volcanii DS2, a model archaeon. PLoS One (2010) 2.84

Genome sequence of foxtail millet (Setaria italica) provides insights into grass evolution and biofuel potential. Nat Biotechnol (2012) 2.82

Metagenomic pyrosequencing and microbial identification. Clin Chem (2009) 2.81

A reconsideration of the classification of the spider infraorder Mygalomorphae (Arachnida: Araneae) based on three nuclear genes and morphology. PLoS One (2012) 2.73

Simultaneous amino acid substitutions at antigenic sites drive influenza A hemagglutinin evolution. Proc Natl Acad Sci U S A (2007) 2.67

Upcoming challenges for multiple sequence alignment methods in the high-throughput era. Bioinformatics (2009) 2.64

BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol Biol (2010) 2.58

The accuracy of several multiple sequence alignment programs for proteins. BMC Bioinformatics (2006) 2.52

Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels. BMC Evol Biol (2010) 2.50

Spread, circulation, and evolution of the Middle East respiratory syndrome coronavirus. MBio (2014) 2.50

Origin and evolution of the archaeo-eukaryotic primase superfamily and related palm-domain proteins: structural insights and new members. Nucleic Acids Res (2005) 2.49

The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems. RNA Biol (2013) 2.44

Efflux-mediated antifungal drug resistance. Clin Microbiol Rev (2009) 2.44

Effect of PCR amplicon size on assessments of clone library microbial diversity and community structure. Environ Microbiol (2009) 2.43

Viral linkage in HIV-1 seroconverters and their partners in an HIV-1 prevention clinical trial. PLoS One (2011) 2.36

The rainbow trout genome provides novel insights into evolution after whole-genome duplication in vertebrates. Nat Commun (2014) 2.36

GiardiaDB and TrichDB: integrated genomic resources for the eukaryotic protist pathogens Giardia lamblia and Trichomonas vaginalis. Nucleic Acids Res (2008) 2.35

FLOWERING LOCUS T protein may act as the long-distance florigenic signal in the cucurbits. Plant Cell (2007) 2.35

Genome sequence of Rickettsia bellii illuminates the role of amoebae in gene exchanges between intracellular pathogens. PLoS Genet (2006) 2.35

An enhanced RNA alignment benchmark for sequence alignment programs. Algorithms Mol Biol (2006) 2.35

Multiple whole-genome alignments without a reference organism. Genome Res (2009) 2.31

Gene3D: modelling protein structure, function and evolution. Nucleic Acids Res (2006) 2.30

Threatened corals provide underexplored microbial habitats. PLoS One (2010) 2.29

Phylogenetic assessment of alignments reveals neglected tree signal in gaps. Genome Biol (2010) 2.27

PhyloFacts: an online structural phylogenomic encyclopedia for protein functional and structural classification. Genome Biol (2006) 2.27

4SALE--a tool for synchronous RNA sequence and secondary structure alignment and editing. BMC Bioinformatics (2006) 2.24

Genomic analysis of oceanic cyanobacterial myoviruses compared with T4-like myoviruses from diverse hosts and environments. Environ Microbiol (2010) 2.23

Metagenomic analysis of the viromes of three North American bat species: viral diversity among different bat species that share a common habitat. J Virol (2010) 2.18

Using QIIME to analyze 16S rRNA gene sequences from microbial communities. Curr Protoc Bioinformatics (2011) 2.15

Dothideomycete plant interactions illuminated by genome sequencing and EST analysis of the wheat pathogen Stagonospora nodorum. Plant Cell (2007) 2.12

A cornucopia of human polyomaviruses. Nat Rev Microbiol (2013) 2.08

Discovery of Novel DENN Proteins: Implications for the Evolution of Eukaryotic Intracellular Membrane Structures and Human Disease. Front Genet (2012) 2.08

Exploring the diversity of Gardnerella vaginalis in the genitourinary tract microbiota of monogamous couples through subtle nucleotide variation. PLoS One (2011) 2.06

MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS One (2011) 2.06

Mutations in centrosomal protein CEP152 in primary microcephaly families linked to MCPH4. Am J Hum Genet (2010) 2.05

Polymorphic toxin systems: Comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomics. Biol Direct (2012) 2.05

Taxonomic distribution of large DNA viruses in the sea. Genome Biol (2008) 2.04

Kinase activity of overexpressed HipA is required for growth arrest and multidrug tolerance in Escherichia coli. J Bacteriol (2006) 2.04

Structural analysis of the synaptic protein neuroligin and its beta-neurexin complex: determinants for folding and cell adhesion. Neuron (2007) 2.04

Databases of homologous gene families for comparative genomics. BMC Bioinformatics (2009) 2.01

The tmRDB and SRPDB resources. Nucleic Acids Res (2006) 2.01

Alignment and clustering of phylogenetic markers--implications for microbial diversity studies. BMC Bioinformatics (2010) 2.01

Massively parallel pyrosequencing highlights minority variants in the HIV-1 env quasispecies deriving from lymphomonocyte sub-populations. Retrovirology (2009) 1.99

The evolution of two-component systems in bacteria reveals different strategies for niche adaptation. PLoS Comput Biol (2006) 1.98

Articles cited by this

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 665.31

CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res (1994) 392.47

The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol (1987) 266.90

The Protein Data Bank. Nucleic Acids Res (2000) 187.10

MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res (2004) 168.89

T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol (2000) 57.88

MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res (2002) 47.62

The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci (1992) 44.38

Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol (1987) 41.41

Optimal alignments in linear space. Comput Appl Biosci (1988) 38.10

Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science (1993) 36.84

SMART, a simple modular architecture research tool: identification of signaling domains. Proc Natl Acad Sci U S A (1998) 36.83

Improved sensitivity of profile searches through the use of sequence weights and gap excision. Comput Appl Biosci (1994) 31.96

Position-based sequence weights. J Mol Biol (1994) 24.41

Amino acid substitution matrices from an information theoretic perspective. J Mol Biol (1991) 23.38

Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology. Comput Appl Biosci (1996) 19.74

Touring protein fold space with Dali/FSSP. Nucleic Acids Res (1998) 18.00

SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res (2000) 17.77

Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. J Mol Biol (1996) 15.98

A weighting system and algorithm for aligning many phylogenetically related sequences. Comput Appl Biosci (1995) 13.48

BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics (1999) 12.64

A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res (1999) 12.56

Volume changes in protein evolution. J Mol Biol (1994) 12.07

SMART: identification and annotation of domains from signalling and extracellular protein sequences. Nucleic Acids Res (1999) 11.33

NCBI Reference Sequence project: update and current status. Nucleic Acids Res (2003) 11.30

The alignment of sets of sequences and the construction of phyletic trees: an integrated method. J Mol Evol (1984) 8.90

Comprehensive study on iterative algorithms of multiple sequence alignment. Comput Appl Biosci (1995) 8.39

Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins (2000) 8.17

On the complexity of multiple sequence alignment. J Comput Biol (1994) 8.00

Recent progress in multiple sequence alignment: a survey. Pharmacogenomics (2002) 7.69

Align-m--a new algorithm for multiple alignment of highly divergent sequences. Bioinformatics (2004) 7.12

Estimating amino acid substitution models: a comparison of Dayhoff's estimator, the resolvent approach and a maximum likelihood method. Mol Biol Evol (2002) 7.03

BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations. Nucleic Acids Res (2001) 6.67

A comparison of scoring functions for protein sequence profile alignment. Bioinformatics (2004) 6.44

Local homology recognition and distance measures in linear time using compressed amino acid alphabets. Nucleic Acids Res (2004) 6.42

Alignment-free sequence comparison-a review. Bioinformatics (2003) 6.29

COACH: profile-profile alignment of protein families using hidden Markov models. Bioinformatics (2004) 6.22

Searching databases of conserved sequence regions by aligning protein multiple-alignments. Nucleic Acids Res (1996) 5.68

Within the twilight zone: a sensitive profile-profile comparison tool based on information theory. J Mol Biol (2002) 4.99

Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems. Bioinformatics (2004) 3.83

RASCAL: rapid scanning and correction of multiple sequence alignments. Bioinformatics (2003) 2.79