Amino acid substitution matrices from an information theoretic perspective.

PubWeight™: 23.38 | Rank: Top 0.01% | All-Time Top 10000

PMID: 2051488

Published in J Mol Biol on June 05, 1991

Authors

S F Altschul (1)

Author Affiliations

1: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894.

Articles citing this

(truncated to the top 100)

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 665.31

Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A (1992) 61.33

MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics (2004) 50.89

Predicting deleterious amino acid substitutions. Genome Res (2001) 28.95

Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res (2001) 22.33

BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res (2004) 15.43

NCBI BLAST: a better web interface. Nucleic Acids Res (2008) 15.14

HMMER web server: interactive sequence similarity searching. Nucleic Acids Res (2011) 13.00

Applications and statistics for multiple high-scoring segments in molecular sequences. Proc Natl Acad Sci U S A (1993) 12.10

ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res (2005) 11.90

Modular arrangement of proteins as inferred from analysis of homology. Protein Sci (1994) 9.38

Protein database searches using compositionally adjusted substitution matrices. FEBS J (2005) 8.14

Noncoding RNA gene detection using comparative sequence analysis. BMC Bioinformatics (2001) 7.07

Local homology recognition and distance measures in linear time using compressed amino acid alphabets. Nucleic Acids Res (2004) 6.42

Query-dependent banding (QDB) for faster RNA similarity searches. PLoS Comput Biol (2007) 6.34

Simultaneous inference of selection and population growth from patterns of variation in the human genome. Proc Natl Acad Sci U S A (2005) 5.94

Gene recognition via spliced sequence alignment. Proc Natl Acad Sci U S A (1996) 5.77

Searching databases of conserved sequence regions by aligning protein multiple-alignments. Nucleic Acids Res (1996) 5.68

A probabilistic model of local sequence alignment that simplifies statistical significance estimation. PLoS Comput Biol (2008) 5.12

Linking genes to literature: text mining, information extraction, and retrieval applications for biology. Genome Biol (2008) 4.36

Comparison of methods for searching protein sequence databases. Protein Sci (1995) 4.29

RSEARCH: finding homologs of single structured RNA sequences. BMC Bioinformatics (2003) 4.14

The estimation of statistical parameters for local alignment score distributions. Nucleic Acids Res (2001) 3.61

IS6110 transposition and evolutionary scenario of the direct repeat locus in a group of closely related Mycobacterium tuberculosis strains. J Bacteriol (1998) 2.97

Domain enhanced lookup time accelerated BLAST. Biol Direct (2012) 2.87

Parameters for accurate genome alignment. BMC Bioinformatics (2010) 2.78

All possible modes of gene action are observed in a global comparison of gene expression in a maize F1 hybrid and its inbred parents. Proc Natl Acad Sci U S A (2006) 2.68

Comparative sequence analysis of ribonucleases HII, III, II PH and D. Nucleic Acids Res (1997) 2.56

The proofreading domain of Escherichia coli DNA polymerase I and other DNA and/or RNA exonuclease domains. Nucleic Acids Res (1997) 2.55

Large-scale trends in the evolution of gene structures within 11 animal genomes. PLoS Comput Biol (2006) 2.42

The compositional adjustment of amino acid substitution matrices. Proc Natl Acad Sci U S A (2003) 2.42

Finding functional sequence elements by multiple local alignment. Nucleic Acids Res (2004) 2.37

Statistical modeling and analysis of the LAGLIDADG family of site-specific endonucleases and identification of an intein that encodes a site-specific endonuclease of the HNH family. Nucleic Acids Res (1997) 2.33

Accelerated probabilistic inference of RNA structure evolution. BMC Bioinformatics (2005) 2.16

Sequence context-specific profiles for homology searching. Proc Natl Acad Sci U S A (2009) 2.05

HMMER web server: 2015 update. Nucleic Acids Res (2015) 1.98

Quantitative parameters for amino acid-base interaction: implications for prediction of protein-DNA binding sites. Nucleic Acids Res (1998) 1.94

Homologous over-extension: a challenge for iterative similarity searches. Nucleic Acids Res (2010) 1.70

PSI-BLAST pseudocounts and the minimum description length principle. Nucleic Acids Res (2008) 1.68

PARSESNP: A tool for the analysis of nucleotide polymorphisms. Nucleic Acids Res (2003) 1.63

The construction and use of log-odds substitution scores for multiple sequence alignment. PLoS Comput Biol (2010) 1.54

Comparing models of evolution for ordered and disordered proteins. Mol Biol Evol (2009) 1.42

RNP-1, an RNA-binding motif is conserved in the DNA-binding cold shock domain. Nucleic Acids Res (1992) 1.40

Selection for chromosome architecture in bacteria. J Mol Evol (2006) 1.36

The variable and conserved interfaces of modeled olfactory receptor proteins. Protein Sci (1999) 1.34

The exchangeability of amino acids in proteins. Genetics (2005) 1.23

Genomics of an extreme psychrophile, Psychromonas ingrahamii. BMC Genomics (2008) 1.19

Transcriptomic and proteomic analyses of pericycle cells of the maize primary root. Plant Physiol (2007) 1.17

Genomic analysis of carbon source metabolism of Shewanella oneidensis MR-1: Predictions versus experiments. J Bacteriol (2006) 1.16

Gene fusions and gene duplications: relevance to genomic annotation and functional analysis. BMC Genomics (2005) 1.15

Assessing the evolutionary rate of positional orthologous genes in prokaryotes using synteny data. BMC Evol Biol (2007) 1.13

The divalent metal transporter homologues SMF-1/2 mediate dopamine neuron sensitivity in Caenorhabditis elegans models of manganism and Parkinson disease. J Biol Chem (2009) 1.07

Evolution by leaps: gene duplication in bacteria. Biol Direct (2009) 1.06

Preparation of name and address data for record linkage using hidden Markov models. BMC Med Inform Decis Mak (2002) 1.06

Similarity queries for temporal toxicogenomic expression profiles. PLoS Comput Biol (2008) 1.06

Linking yeast genetics to mammalian genomes: identification and mapping of the human homolog of CDC27 via the expressed sequence tag (EST) data base. Proc Natl Acad Sci U S A (1993) 1.05

Cooperative DNA and histone binding by Uhrf2 links the two major repressive epigenetic pathways. J Cell Biochem (2011) 1.05

A topology-based metric for measuring term similarity in the gene ontology. Adv Bioinformatics (2012) 1.04

Evolutionary patterns in coiled-coils. Genome Biol Evol (2015) 1.00

The 'functional' dyad of scorpion toxin Pi1 is not itself a prerequisite for toxin binding to the voltage-gated Kv1.2 potassium channels. Biochem J (2004) 0.98

MHC-I-restricted epitopes conserved among variola and other related orthopoxviruses are recognized by T cells 30 years after vaccination. Arch Virol (2008) 0.98

Statistical significance of optical map alignments. J Comput Biol (2012) 0.97

Compositional adjustment of Dirichlet mixture priors. J Comput Biol (2010) 0.96

Comprehensive viral oligonucleotide probe design using conserved protein regions. Nucleic Acids Res (2007) 0.96

Genome bias influences amino acid choices: analysis of amino acid substitution and re-compilation of substitution matrices exclusive to an AT-biased genome. Nucleic Acids Res (2008) 0.94

ATP-dependent sugar transport complexity in human erythrocytes. Am J Physiol Cell Physiol (2006) 0.94

BRONCO: Biomedical entity Relation ONcology COrpus for extracting gene-variant-disease-drug relations. Database (Oxford) (2016) 0.94

ProteMiner-SSM: a web server for efficient analysis of similar protein tertiary substructures. Nucleic Acids Res (2004) 0.94

SIRT1 gene expression upon genotoxic damage is regulated by APE1 through nCaRE-promoter elements. Mol Biol Cell (2013) 0.92

Subfamily specific conservation profiles for proteins based on n-gram patterns. BMC Bioinformatics (2008) 0.92

The metal transporter SMF-3/DMT-1 mediates aluminum-induced dopamine neuron degeneration. J Neurochem (2012) 0.92

Nature of protein family signatures: insights from singular value analysis of position-specific scoring matrices. PLoS One (2008) 0.91

The Dfam database of repetitive DNA families. Nucleic Acids Res (2015) 0.91

Adjusting scoring matrices to correct overextended alignments. Bioinformatics (2013) 0.91

Scoring protein relationships in functional interaction networks predicted from sequence data. PLoS One (2011) 0.88

Amino acid "little Big Bang": representing amino acid substitution matrices as dot products of Euclidian vectors. BMC Bioinformatics (2010) 0.87

Universal oligonucleotide microarray for sub-typing of Influenza A virus. PLoS One (2011) 0.87

A strategy to retrieve the whole set of protein modules in microbial proteomes. Genome Res (2002) 0.87

Protein sequence randomness and sequence/structure correlations. Biophys J (1995) 0.86

Where does the alignment score distribution shape come from? Evol Bioinform Online (2010) 0.86

Albusin B, a bacteriocin from the ruminal bacterium Ruminococcus albus 7 that inhibits growth of Ruminococcus flavefaciens. Appl Environ Microbiol (2004) 0.86

Genes optimized by evolution for accurate and fast translation encode in Archaea and Bacteria a broad and characteristic spectrum of protein functions. BMC Genomics (2010) 0.86

Gentle masking of low-complexity sequences improves homology search. PLoS One (2011) 0.86

Probing the substrate binding site of Candida tenuis xylose reductase (AKR2B5) with site-directed mutagenesis. Biochem J (2006) 0.85

Protemot: prediction of protein binding sites with automatically extracted geometrical templates. Nucleic Acids Res (2006) 0.85

A genome alignment algorithm based on compression. BMC Bioinformatics (2010) 0.85

An assessment of substitution scores for protein profile-profile comparison. Bioinformatics (2011) 0.85

Genomic analysis of a 1 Mb region near the telomere of Hessian fly chromosome X2 and avirulence gene vH13. BMC Genomics (2006) 0.84

A gp63 based vaccine candidate against Visceral Leishmaniasis. Bioinformation (2011) 0.84

Evolution of biological sequences implies an extreme value distribution of type I for both global and local pairwise alignment scores. BMC Bioinformatics (2008) 0.84

Selecting the Right Similarity-Scoring Matrix. Curr Protoc Bioinformatics (2013) 0.83

Sequence alignment as hypothesis testing. J Comput Biol (2011) 0.83

Use of residue pairs in protein sequence-sequence and sequence-structure alignments. Protein Sci (2000) 0.82

A statistical physics perspective on alignment-independent protein sequence comparison. Bioinformatics (2015) 0.81

Zar1 represses translation in Xenopus oocytes and binds to the TCS in maternal mRNAs with different characteristics than Zar2. Biochim Biophys Acta (2013) 0.81

Comparative analysis of transcription start sites using mutual information. Genomics Proteomics Bioinformatics (2006) 0.80

Panning for genes--A visual strategy for identifying novel gene orthologs and paralogs. Genome Res (1999) 0.80

Gyrodactylus species (Monogenea: Gyrodactylidae) on the cichlid fishes of Senegal, with the description of Gyrodactylus ergensi n. sp. from Mango tilapia, Sarotherodon galilaeus L. (Teleostei: Cichilidae). Parasitol Res (2009) 0.80

Exploring the evolutionary relationship of insulin receptor substrate family using computational biology. PLoS One (2011) 0.79

Similarity searches in genome-wide numerical data sets. Biol Direct (2006) 0.79

Articles by these authors

Basic local alignment search tool. J Mol Biol (1990) 659.07

A protein alignment scoring system sensitive at all evolutionary distances. J Mol Evol (1993) 10.53

Protein database searches for multiple alignments. Proc Natl Acad Sci U S A (1990) 5.52