Improved tools for biological sequence comparison.

PubWeight™: 193.60 | Rank: Top 0.01% | All-Time Top 100

Article: PMC 280013

Published in Proc Natl Acad Sci U S A on April 01, 1988

Authors

W R Pearson (1), D J Lipman

Author Affiliations

1: Department of Biochemistry, University of Virginia, Charlottesville 22908.

Articles citing this

(truncated to the top 100)

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 665.31

CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res (1994) 392.47

The Protein Data Bank. Nucleic Acids Res (2000) 187.10

tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res (1997) 142.55

BLAT--the BLAST-like alignment tool. Genome Res (2002) 126.78

RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res (2007) 85.81

WebLogo: a sequence logo generator. Genome Res (2004) 59.58

The Bioperl toolkit: Perl modules for the life sciences. Genome Res (2002) 58.63

ARB: a software environment for sequence data. Nucleic Acids Res (2004) 58.27

BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics (2010) 53.23

Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics (2010) 52.01

CAP3: A DNA sequence assembly program. Genome Res (1999) 50.04

Versatile and open software for comparing large genomes. Genome Biol (2004) 49.45

SSAHA: a fast search method for large DNA databases. Genome Res (2001) 48.64

MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res (2002) 47.62

Human MicroRNA targets. PLoS Biol (2004) 34.51

MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res (2005) 31.64

Protein sequence similarity searches using patterns as seeds. Nucleic Acids Res (1998) 23.87

ARACHNE: a whole-genome shotgun assembler. Genome Res (2002) 22.72

Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res (2001) 22.33

TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Res (2001) 20.84

SGD: Saccharomyces Genome Database. Nucleic Acids Res (1998) 20.26

InterProScan: protein domains identifier. Nucleic Acids Res (2005) 18.82

Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks. Proc Natl Acad Sci U S A (1994) 18.46

Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics (2009) 16.52

Selection of representative protein data sets. Protein Sci (1992) 15.62

RNA sequence analysis using covariance models. Nucleic Acids Res (1994) 14.60

Insertion element IS987 from Mycobacterium bovis BCG is located in a hot-spot integration region for insertion elements in Mycobacterium tuberculosis complex strains. Infect Immun (1991) 14.45

Insights into social insects from the genome of the honeybee Apis mellifera. Nature (2006) 13.67

The protein information resource (PIR). Nucleic Acids Res (2000) 13.54

Automatic generation of primary sequence patterns from sets of related protein sequences. Proc Natl Acad Sci U S A (1990) 12.96

The Protein Information Resource: an integrated public resource of functional annotation of proteins. Nucleic Acids Res (2002) 12.20

Applications and statistics for multiple high-scoring segments in molecular sequences. Proc Natl Acad Sci U S A (1993) 12.10

The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res (2009) 12.09

Genome annotation assessment in Drosophila melanogaster. Genome Res (2000) 11.77

A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids Res (2010) 11.42

The Protein Information Resource. Nucleic Acids Res (2003) 11.17

Annotating large genomes with exact word matches. Genome Res (2003) 11.07

Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res (2003) 11.03

A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure. BMC Bioinformatics (2002) 10.84

Temperature gradient gel electrophoresis analysis of 16S rRNA from human fecal samples reveals stable and host-specific communities of active bacteria. Appl Environ Microbiol (1998) 10.67

Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res (1996) 10.46

PairWise and SearchWise: finding the optimal alignment in a simultaneous comparison of a protein profile against all DNA translation frames. Nucleic Acids Res (1996) 9.82

Major facilitator superfamily. Microbiol Mol Biol Rev (1998) 9.77

Recognition of related proteins by iterative template refinement (ITR). Protein Sci (1994) 9.67

ViennaRNA Package 2.0. Algorithms Mol Biol (2011) 9.43

Hepatitis C virus shares amino acid sequence similarity with pestiviruses and flaviviruses as well as members of two plant virus supergroups. Proc Natl Acad Sci U S A (1990) 8.92

pp125FAK a structurally distinctive protein-tyrosine kinase associated with focal adhesions. Proc Natl Acad Sci U S A (1992) 8.46

Isolation of the Arabidopsis ABI3 gene by positional cloning. Plant Cell (1992) 8.36

The Protein Data Bank and structural genomics. Nucleic Acids Res (2003) 8.34

The EMBL data library. Nucleic Acids Res (1993) 8.24

Molecular characterization of the 50-kD subunit of dynactin reveals function for the complex in chromosome alignment and spindle organization during mitosis. J Cell Biol (1996) 7.65

Experimental determination and system level analysis of essential genes in Escherichia coli MG1655. J Bacteriol (2003) 7.32

The European Bioinformatics Institute (EBI) databases. Nucleic Acids Res (1996) 7.19

Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships. Proc Natl Acad Sci U S A (1998) 7.18

Widespread siRNA "off-target" transcript silencing mediated by seed region sequence complementarity. RNA (2006) 7.06

Kalign--an accurate and fast multiple sequence alignment algorithm. BMC Bioinformatics (2005) 7.01

A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics (1998) 6.82

Exploring genomic dark matter: a critical assessment of the performance of homology search methods on noncoding RNA. Genome Res (2006) 6.59

The European Bioinformatics Institute (EBI) databases. Nucleic Acids Res (1994) 6.58

Local homology recognition and distance measures in linear time using compressed amino acid alphabets. Nucleic Acids Res (2004) 6.42

Two Saccharomyces cerevisiae kinesin-related gene products required for mitotic spindle assembly. J Cell Biol (1992) 6.37

Use of a screen for synthetic lethal and multicopy suppressee mutants to identify two new genes involved in morphogenesis in Saccharomyces cerevisiae. Mol Cell Biol (1991) 6.30

Mapping sequenced E.coli genes by computer: software, strategies and examples. Nucleic Acids Res (1991) 6.29

Evolution of the SNF2 family of proteins: subfamilies with distinct sequences and functions. Nucleic Acids Res (1995) 6.24

Characterization of a cis-Golgi matrix protein, GM130. J Cell Biol (1995) 6.22

Analysis of the RNA-recognition motif and RS and RGG domains: conservation in metazoan pre-mRNA splicing factors. Nucleic Acids Res (1993) 6.15

AtDB, the Arabidopsis thaliana database, and graphical-web-display of progress by the Arabidopsis Genome Initiative. Nucleic Acids Res (1998) 6.04

Molecular cloning of a putative receptor protein kinase gene encoded at the self-incompatibility locus of Brassica oleracea. Proc Natl Acad Sci U S A (1991) 5.96

A chemokine expressed in lymphoid high endothelial venules promotes the adhesion and chemotaxis of naive T lymphocytes. Proc Natl Acad Sci U S A (1998) 5.94

Recent changes in the GenBank On-line Service. Nucleic Acids Res (1990) 5.91

Evolutionary families of peptidases. Biochem J (1993) 5.81

The EMBL Data Library. Nucleic Acids Res (1992) 5.68

High-throughput sequence alignment using Graphics Processing Units. BMC Bioinformatics (2007) 5.56

Complete genome sequence of Clostridium perfringens, an anaerobic flesh-eater. Proc Natl Acad Sci U S A (2002) 5.40

Structural, functional, and evolutionary relationships among extracellular solute-binding receptors of bacteria. Microbiol Rev (1993) 5.36

Analysis Tool Web Services from the EMBL-EBI. Nucleic Acids Res (2013) 5.29

Cloning of the Ah-receptor cDNA reveals a distinctive ligand-activated transcription factor. Proc Natl Acad Sci U S A (1992) 5.28

Salmonella maintains the integrity of its intracellular vacuole through the action of SifA. EMBO J (2000) 5.27

Annotation transfer between genomes: protein-protein interologs and protein-DNA regulogs. Genome Res (2004) 5.26

Characterization of VPS34, a gene required for vacuolar protein sorting and vacuole segregation in Saccharomyces cerevisiae. Mol Cell Biol (1990) 5.25

A conserved MYB transcription factor involved in phosphate starvation signaling both in vascular plants and in unicellular algae. Genes Dev (2001) 5.09

Primary characterization of a herpesvirus agent associated with Kaposi's sarcomae. J Virol (1996) 5.05

Transmembrane helices predicted at 95% accuracy. Protein Sci (1995) 5.05

Identification of phosphate starvation-inducible genes in Escherichia coli K-12 by DNA sequence analysis of psi::lacZ(Mu d1) transcriptional fusions. J Bacteriol (1990) 5.02

Improvements to services at the European Nucleotide Archive. Nucleic Acids Res (2009) 5.00

Mry, a trans-acting positive regulator of the M protein gene of Streptococcus pyogenes with similarity to the receptor proteins of two-component regulatory systems. J Bacteriol (1991) 4.99

NDC10: a gene involved in chromosome segregation in Saccharomyces cerevisiae. J Cell Biol (1993) 4.96

MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res (2011) 4.90

tRNAdb 2009: compilation of tRNA sequences and tRNA genes. Nucleic Acids Res (2008) 4.88

Subtilases: the superfamily of subtilisin-like serine proteases. Protein Sci (1997) 4.86

A conserved double-stranded RNA-binding domain. Proc Natl Acad Sci U S A (1992) 4.83

Antibiotic resistance is ancient. Nature (2011) 4.83

Cloning and sequence of the gene encoding a cefotaxime-hydrolyzing class A beta-lactamase isolated from Escherichia coli. Antimicrob Agents Chemother (1995) 4.81

Correlation between plasmid content and infectivity in Borrelia burgdorferi. Proc Natl Acad Sci U S A (2000) 4.81

High functional diversity in Mycobacterium tuberculosis driven by genetic drift and human demography. PLoS Biol (2008) 4.74

Position-specific chemical modification of siRNAs reduces "off-target" transcript silencing. RNA (2006) 4.72

ParAlign: a parallel sequence alignment algorithm for rapid and sensitive database searches. Nucleic Acids Res (2001) 4.71

Cytoplasmic dynein binds dynactin through a direct interaction between the intermediate chains and p150Glued. J Cell Biol (1995) 4.66

Salmonella typhimurium phoP virulence gene is a transcriptional regulator. Proc Natl Acad Sci U S A (1989) 4.62

Articles by these authors

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 665.31

Basic local alignment search tool. J Mol Biol (1990) 659.07

Rapid and sensitive protein similarity searches. Science (1985) 76.83

Rapid similarity searches of nucleic acid and protein data banks. Proc Natl Acad Sci U S A (1983) 53.12

A genomic perspective on protein families. Science (1997) 50.51

GenBank. Nucleic Acids Res (2000) 36.75

Protein sequence similarity searches using patterns as seeds. Nucleic Acids Res (1998) 23.87

GenBank. Nucleic Acids Res (1999) 21.47

On the statistical significance of nucleic acid similarities. Nucleic Acids Res (1984) 18.21

A workbench for multiple alignment construction and analysis. Proteins (1991) 16.96

Weights for data related by a tree. J Mol Biol (1989) 12.63

GenBank. Nucleic Acids Res (1997) 11.73

Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics (1991) 9.65

GenBank. Nucleic Acids Res (1998) 9.36

GenBank. Nucleic Acids Res (1993) 9.06

GenBank. Nucleic Acids Res (1996) 7.06

GenBank. Nucleic Acids Res (1994) 5.63

Protein database searches for multiple alignments. Proc Natl Acad Sci U S A (1990) 5.52

Lineage-specific loss and divergence of functionally linked genes in eukaryotes. Proc Natl Acad Sci U S A (2000) 3.27

Extracting protein alignment models from the sequence database. Nucleic Acids Res (1997) 3.17

Contextual constraints on synonymous codon choice. J Mol Biol (1983) 2.47

Equal animals. Nature (1990) 2.09

Identification of class-mu glutathione transferase genes GSTM1-GSTM5 on human chromosome 1p13. Am J Hum Genet (1993) 1.83

Dynamic programming algorithms for biological sequence comparison. Methods Enzymol (1992) 1.65

Tissue-specific induction of murine glutathione transferase mRNAs by butylated hydroxyanisole. J Biol Chem (1988) 1.27

Interaction of silent and replacement changes in eukaryotic coding sequences. J Mol Evol (1985) 1.18

Structural organization of Escherichia coli tRNAtyr gene clusters in four different transducing bacteriophages. J Mol Biol (1979) 1.10

Comparative analysis of nucleic acid sequences by their general constraints. Nucleic Acids Res (1982) 1.00

MIF proteins are not glutathione transferase homologs. Protein Sci (1994) 0.83

Local sequence patterns of hydrophobicity and solvent accessibility in soluble globular proteins. Biopolymers (1987) 0.76

Hierarchical analysis of influenza A hemagglutinin gene sequences. Nucleic Acids Res (1982) 0.75