MUSCLE: multiple sequence alignment with high accuracy and high throughput.

PubWeight™: 168.89 | Rank: Top 0.01% | All-Time Top 100

PMC 390337

Published in Nucleic Acids Res on March 19, 2004


Robert C Edgar


Articles citing this

(truncated to the top 100)

MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics (2004) 50.89

Pfam: clans, web tools and services. Nucleic Acids Res (2006) 34.83

Jalview Version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics (2009) 31.84

MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res (2005) 31.64

Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods (2013) 31.15

Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol (2011) 28.61

Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biol (2007) 23.64

The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS Biol (2007) 23.58

PyCogent: a toolkit for making sense from sequence. Genome Biol (2007) 20.64

Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic. Nature (2009) 17.96

Accurate determination of microbial diversity from 454 pyrosequencing data. Nat Methods (2009) 15.25

Impact of diet in shaping gut microbiota revealed by a comparative study in children from Europe and rural Africa. Proc Natl Acad Sci U S A (2010) 15.22

The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol (2007) 13.99

Pyrosequencing enumerates and contrasts soil microbial diversity. ISME J (2007) 13.95

AIM2 recognizes cytosolic dsDNA and forms a caspase-1-activating inflammasome with ASC. Nature (2009) 13.23

Ironing out the wrinkles in the rare biosphere through improved OTU clustering. Environ Microbiol (2010) 13.19

ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res (2005) 11.90

OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Res (2006) 11.43

UniProt Knowledgebase: a hub of integrated protein data. Database (Oxford) (2011) 11.43

A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids Res (2010) 11.42

The influenza virus resource at the National Center for Biotechnology Information. J Virol (2007) 11.33

Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res (2011) 11.32

A putative RNA-interference-based immune system in prokaryotes: computational analysis of the predicted enzymatic machinery, functional analogies with eukaryotic RNAi, and hypothetical mechanisms of action. Biol Direct (2006) 11.31

Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res (2011) 10.99

Exploring microbial diversity and taxonomy using SSU rRNA hypervariable tag sequencing. PLoS Genet (2008) 10.72

The National Center for Biotechnology Information's Protein Clusters Database. Nucleic Acids Res (2008) 10.64

ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res (2010) 9.68

A DNA barcode for land plants. Proc Natl Acad Sci U S A (2009) 9.68

antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Res (2011) 9.22

Comparative protein structure modeling using Modeller. Curr Protoc Bioinformatics (2006) 8.72

Dindel: accurate indel calls from short-read data. Genome Res (2010) 8.62

The genome of the cucumber, Cucumis sativus L. Nat Genet (2009) 8.19

Reproducible community dynamics of the gastrointestinal microbiota following antibiotic perturbation. Infect Immun (2009) 7.48

A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics (2011) 7.44

Classification of papillomaviruses (PVs) based on 189 PV types and proposal of taxonomic amendments. Virology (2010) 7.24

PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res (2008) 7.16

Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res (2007) 7.05

Kalign--an accurate and fast multiple sequence alignment algorithm. BMC Bioinformatics (2005) 7.01

Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell (2008) 6.91

MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res (2007) 6.48

The Newick utilities: high-throughput phylogenetic tree processing in the UNIX shell. Bioinformatics (2010) 6.42

Presenting your structures: the CCP4mg molecular-graphics software. Acta Crystallogr D Biol Crystallogr (2011) 6.27

Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System. Cell (2015) 5.97

Viruses in the faecal microbiota of monozygotic twins and their mothers. Nature (2010) 5.92

Evolution of symbiotic bacteria in the distal human intestine. PLoS Biol (2007) 5.79

Expansion of intestinal Prevotella copri correlates with enhanced susceptibility to arthritis. Elife (2013) 5.49

Identification of deleterious mutations within three human genomes. Genome Res (2009) 5.42

Evolutionary conservation of sequence and secondary structures in CRISPR repeats. Genome Biol (2007) 5.40

Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis. PLoS Biol (2010) 5.39

Global analysis of Cdk1 substrate phosphorylation sites provides insights into evolution. Science (2009) 5.37

Analysis Tool Web Services from the EMBL-EBI. Nucleic Acids Res (2013) 5.29

Plant DNA barcodes and a community phylogeny of a tropical forest dynamics plot in Panama. Proc Natl Acad Sci U S A (2009) 5.27

ESPRIT: estimating species richness using large collections of 16S rRNA pyrosequences. Nucleic Acids Res (2009) 4.93

Comparative analysis of pyrosequencing and a phylogenetic microarray for exploring microbial community structures in the human distal intestine. PLoS One (2009) 4.93

Annotation error in public databases: misannotation of molecular function in enzyme superfamilies. PLoS Comput Biol (2009) 4.76

Population genomics of early events in the ecological differentiation of bacteria. Science (2012) 4.67

Identification of multiple distinct Snf2 subfamilies with conserved structural motifs. Nucleic Acids Res (2006) 4.64

LINE-1 retrotransposition activity in human genomes. Cell (2010) 4.60

PATRIC: the comprehensive bacterial bioinformatics resource with a focus on human pathogenic species. Infect Immun (2011) 4.57

eggNOG v2.0: extending the evolutionary genealogy of genes with enhanced non-supervised orthologous groups, species and functional annotations. Nucleic Acids Res (2009) 4.55

M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res (2006) 4.53

Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS One (2012) 4.52

Structural basis of transcription: role of the trigger loop in substrate specificity and catalysis. Cell (2006) 4.49

Web services at the European bioinformatics institute. Nucleic Acids Res (2007) 4.48

Performance, accuracy, and Web server for evolutionary placement of short sequence reads under maximum likelihood. Syst Biol (2011) 4.45

The draft genomes of soft-shell turtle and green sea turtle yield insights into the development and evolution of the turtle-specific body plan. Nat Genet (2013) 4.40

Molecular evolution of Zika virus during its emergence in the 20th century. PLoS Negl Trop Dis (2014) 4.37

Metagenomic study of the oral microbiota by Illumina high-throughput sequencing. J Microbiol Methods (2009) 4.32

PILER-CR: fast and accurate identification of CRISPR repeats. BMC Bioinformatics (2007) 4.28

The evolutionary genetics and emergence of avian influenza viruses in wild birds. PLoS Pathog (2008) 4.17

Expanded classification of hepatitis C virus into 7 genotypes and 67 subtypes: updated criteria and genotype assignment web resource. Hepatology (2014) 4.15

Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs). Nucleic Acids Res (2005) 4.15

Discovering sequence motifs with arbitrary insertions and deletions. PLoS Comput Biol (2008) 4.10

Novel insights into the genomic basis of citrus canker based on the genome sequences of two strains of Xanthomonas fuscans subsp. aurantifolii. BMC Genomics (2010) 4.03

T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension. Nucleic Acids Res (2011) 4.00

Comparative analysis of Acinetobacters: three genomes for three lifestyles. PLoS One (2008) 3.94

Unification of Cas protein families and a simple scenario for the origin and evolution of CRISPR-Cas systems. Biol Direct (2011) 3.92

TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations. Nucleic Acids Res (2010) 3.83

Structural and functional diversity of the microbial kinome. PLoS Biol (2007) 3.74

Genomic and genetic analyses of diversity and plant interactions of Pseudomonas fluorescens. Genome Biol (2009) 3.63

A phylogeny and revised classification of Squamata, including 4161 species of lizards and snakes. BMC Evol Biol (2013) 3.61

The genesis and source of the H7N9 influenza viruses causing human infections in China. Nature (2013) 3.60

Stool substitute transplant therapy for the eradication of Clostridium difficile infection: 'RePOOPulating' the gut. Microbiome (2013) 3.56

EDGAR: a software framework for the comparative analysis of prokaryotic genomes. BMC Bioinformatics (2009) 3.56

Many species in one: DNA barcoding overestimates the number of species when nuclear mitochondrial pseudogenes are coamplified. Proc Natl Acad Sci U S A (2008) 3.56

Non mycobacterial virulence genes in the genome of the emerging pathogen Mycobacterium abscessus. PLoS One (2009) 3.53

Highly conserved protective epitopes on influenza B viruses. Science (2012) 3.47

Lineage-specific expansion of proteins exported to erythrocytes in malaria parasites. Genome Biol (2006) 3.45

The MPI Bioinformatics Toolkit for protein sequence analysis. Nucleic Acids Res (2006) 3.45

Algorithms, data structures, and numerics for likelihood-based phylogenetic inference of huge trees. BMC Bioinformatics (2011) 3.42

Comparative analysis of a large dataset indicates that internal transcribed spacer (ITS) should be incorporated into the core barcode for seed plants. Proc Natl Acad Sci U S A (2011) 3.41

Genome sequence of Avery's virulent serotype 2 strain D39 of Streptococcus pneumoniae and comparison with that of unencapsulated laboratory strain R6. J Bacteriol (2006) 3.40

Bioinformatics for whole-genome shotgun sequencing of microbial communities. PLoS Comput Biol (2005) 3.39

Mugsy: fast multiple alignment of closely related whole genomes. Bioinformatics (2010) 3.38

Full-genome deep sequencing and phylogenetic analysis of novel human betacoronavirus. Emerg Infect Dis (2013) 3.35

Real-time, portable genome sequencing for Ebola surveillance. Nature (2016) 3.30

Poplar carbohydrate-active enzymes. Gene identification and expression analyses. Plant Physiol (2006) 3.30

Systematic discovery of new recognition peptides mediating protein interaction networks. PLoS Biol (2005) 3.28

BioHealthBase: informatics support in the elucidation of influenza virus host pathogen interactions and virulence. Nucleic Acids Res (2007) 3.28

The effects of alignment quality, distance calculation method, sequence filtering, and region on the analysis of 16S rRNA gene-based studies. PLoS Comput Biol (2010) 3.26

Articles cited by this

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 665.31

CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res (1994) 392.47

The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol (1987) 266.90

SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol (1995) 74.88

T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol (2000) 57.88

MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res (2002) 47.62

Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol (1987) 41.41

SMART, a simple modular architecture research tool: identification of signaling domains. Proc Natl Acad Sci U S A (1998) 36.83

Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng (1998) 28.09

LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res (2003) 23.03

Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res (2001) 22.33

Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology. Comput Appl Biosci (1996) 19.74

Touring protein fold space with Dali/FSSP. Nucleic Acids Res (1998) 18.00

SMART: a web-based tool for the study of genetically mobile domains. Nucleic Acids Res (2000) 17.77

Significant improvement in accuracy of multiple protein sequence alignments by iterative refinement as assessed by reference to structural alignments. J Mol Biol (1996) 15.98

BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs. Bioinformatics (1999) 12.64

A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res (1999) 12.56

The ASTRAL compendium for protein structure and sequence analysis. Nucleic Acids Res (2000) 11.38

SMART: identification and annotation of domains from signalling and extracellular protein sequences. Nucleic Acids Res (1999) 11.33

NCBI Reference Sequence project: update and current status. Nucleic Acids Res (2003) 11.30

Rose: generating sequence families. Bioinformatics (1998) 9.56

The alignment of sets of sequences and the construction of phyletic trees: an integrated method. J Mol Evol (1984) 8.90

Comprehensive study on iterative algorithms of multiple sequence alignment. Comput Appl Biosci (1995) 8.39

COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance. J Mol Biol (2003) 8.35

Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins (2000) 8.17

On the complexity of multiple sequence alignment. J Comput Biol (1994) 8.00

Recent progress in multiple sequence alignment: a survey. Pharmacogenomics (2002) 7.69

Align-m--a new algorithm for multiple alignment of highly divergent sequences. Bioinformatics (2004) 7.12

Estimating amino acid substitution models: a comparison of Dayhoff's estimator, the resolvent approach and a maximum likelihood method. Mol Biol Evol (2002) 7.03

BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations. Nucleic Acids Res (2001) 6.67

A comparison of scoring functions for protein sequence profile alignment. Bioinformatics (2004) 6.44

Local homology recognition and distance measures in linear time using compressed amino acid alphabets. Nucleic Acids Res (2004) 6.42

COACH: profile-profile alignment of protein families using hidden Markov models. Bioinformatics (2004) 6.22

Structure comparison and structure patterns. J Comput Biol (2000) 5.73

Optimal protein structure alignments by multiple linkage clustering: application to distantly related proteins. Protein Eng (1995) 5.55

CAFASP-1: critical assessment of fully automated structure prediction methods. Proteins (1999) 5.54

Phylogenetic inference in protein superfamilies: analysis of SH2 domains. Proc Int Conf Intell Syst Mol Biol (1998) 5.54

APDB: a novel measure for benchmarking sequence alignment methods without reference alignments. Bioinformatics (2003) 4.88

MaxBench: evaluation of sequence and structure comparison methods. Bioinformatics (2002) 4.83