InterPro in 2011: new developments in the family and domain prediction database.

PubWeight™: 13.45 | Rank: Top 0.1% | All-Time Top 10000

Article: PMC 3245097

Published in Nucleic Acids Res on November 16, 2011

Authors

Sarah Hunter (1), Philip Jones, Alex Mitchell, Rolf Apweiler, Teresa K Attwood, Alex Bateman, Thomas Bernard, David Binns, Peer Bork, Sarah Burge, Edouard de Castro, Penny Coggill, Matthew Corbett, Ujjwal Das, Louise Daugherty, Lauranne Duquenne, Robert D Finn, Matthew Fraser, Julian Gough, Daniel Haft, Nicolas Hulo, Daniel Kahn, Elizabeth Kelly, Ivica Letunic, David Lonsdale, Rodrigo Lopez, Martin Madera, John Maslen, Craig McAnulla, Jennifer McDowall, Conor McMenamin, Huaiyu Mi, Prudence Mutowo-Muellenet, Nicola Mulder, Darren Natale, Christine Orengo, Sebastien Pesseat, Marco Punta, Antony F Quinn, Catherine Rivoire, Amaia Sangrador-Vegas, Jeremy D Selengut, Christian J A Sigrist, Maxim Scheremetjew, John Tate, Manjulapramila Thimmajanarthanan, Paul D Thomas, Cathy H Wu, Corin Yeats, Siew-Yit Yong

Author Affiliations

1: EMBL Outstation European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge, UK. hunter@ebi.ac.uk

Articles citing this

(truncated to the top 100)

UniProt: a hub for protein information. Nucleic Acids Res (2014) 16.72

The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013. Nucleic Acids Res (2012) 13.14

Activities at the Universal Protein Resource (UniProt). Nucleic Acids Res (2013) 11.67

The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res (2013) 10.24

Update on activities at the Universal Protein Resource (UniProt) in 2013. Nucleic Acids Res (2012) 9.65

Comparative protein structure modeling using Modeller. Curr Protoc Bioinformatics (2006) 8.72

InterProScan 5: genome-scale protein function classification. Bioinformatics (2014) 8.52

The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res (2014) 6.00

MicroScope--an integrated microbial resource for the curation and comparative analysis of genomic and metabolic data. Nucleic Acids Res (2012) 5.12

dbNSFP v2.0: a database of human non-synonymous SNVs and their functional predictions and annotations. Hum Mutat (2013) 5.11

New and continuing developments at PROSITE. Nucleic Acids Res (2012) 4.41

SMART: recent updates, new developments and status in 2015. Nucleic Acids Res (2014) 4.19

The BioGRID interaction database: 2015 update. Nucleic Acids Res (2014) 4.06

Genenames.org: the HGNC resources in 2013. Nucleic Acids Res (2012) 3.69

Genenames.org: the HGNC resources in 2015. Nucleic Acids Res (2014) 3.54

Comparative genomics of plant-associated Pseudomonas spp.: insights into diversity and inheritance of traits involved in multitrophic interactions. PLoS Genet (2012) 3.46

TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res (2012) 3.45

GeneCodis3: a non-redundant and modular enrichment analysis tool for functional genomics. Nucleic Acids Res (2012) 3.29

Ensembl Genomes 2013: scaling up access to genome-wide data. Nucleic Acids Res (2013) 3.08

GeneMANIA prediction server 2013 update. Nucleic Acids Res (2013) 2.74

PlantTFDB 3.0: a portal for the functional and evolutionary study of plant transcription factors. Nucleic Acids Res (2013) 2.71

OrthoDB: a hierarchical catalog of animal, fungal and bacterial orthologs. Nucleic Acids Res (2012) 2.68

SIFTS: Structure Integration with Function, Taxonomy and Sequences resource. Nucleic Acids Res (2012) 2.61

Transcriptome-wide mapping reveals widespread dynamic-regulated pseudouridylation of ncRNA and mRNA. Cell (2014) 2.58

The landscape of kinase fusions in cancer. Nat Commun (2014) 2.48

A survey of best practices for RNA-seq data analysis. Genome Biol (2016) 2.37

Expanded microbial genome coverage and improved protein family annotation in the COG database. Nucleic Acids Res (2014) 2.37

DGIdb: mining the druggable genome. Nat Methods (2013) 2.26

HAMAP in 2013, new developments in the protein family classification and annotation system. Nucleic Acids Res (2012) 2.16

The 2012 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection. Nucleic Acids Res (2011) 2.16

PDBe: Protein Data Bank in Europe. Nucleic Acids Res (2013) 2.16

Genomic mining of prokaryotic repressors for orthogonal logic gates. Nat Chem Biol (2013) 1.98

The genome of the recently domesticated crop plant sugar beet (Beta vulgaris). Nature (2013) 1.95

ESTHER, the database of the α/β-hydrolase fold superfamily of proteins: tools to explore diversity of functions. Nucleic Acids Res (2012) 1.88

HAMAP in 2015: updates to the protein family classification and annotation system. Nucleic Acids Res (2014) 1.88

The PRINTS database: a fine-grained protein sequence annotation and analysis resource--its status in 2012. Database (Oxford) (2012) 1.80

The mouse Gene Expression Database (GXD): 2014 update. Nucleic Acids Res (2013) 1.78

Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently. Chem Soc Rev (2015) 1.74

EBI metagenomics--a new resource for the analysis and archiving of metagenomic data. Nucleic Acids Res (2013) 1.74

OrthoDB v8: update of the hierarchical catalog of orthologs and the underlying free software. Nucleic Acids Res (2014) 1.73

IUPHAR-DB: updated database content and new features. Nucleic Acids Res (2012) 1.66

AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae. PLoS One (2015) 1.65

Genomic determinants of sporulation in Bacilli and Clostridia: towards the minimal set of sporulation-specific genes. Environ Microbiol (2012) 1.59

Quantitative temporal viromics: an approach to investigate host-pathogen interaction. Cell (2014) 1.58

Standardized description of scientific evidence using the Evidence Ontology (ECO). Database (Oxford) (2014) 1.57

UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics (2014) 1.56

Manual GO annotation of predictive protein signatures: the InterPro approach to GO curation. Database (Oxford) (2012) 1.56

The genome and life-stage specific transcriptomes of Globodera pallida elucidate key aspects of plant parasitism by a cyst nematode. Genome Biol (2014) 1.51

The Structure-Function Linkage Database. Nucleic Acids Res (2013) 1.50

DECIPHER: web-based, community resource for clinical interpretation of rare variants in developmental disorders. Hum Mol Genet (2012) 1.50

Improving microbial genome annotations in an integrated database context. PLoS One (2013) 1.48

Finding the missing honey bee genes: lessons learned from a genome upgrade. BMC Genomics (2014) 1.46

PLAZA 3.0: an access point for plant comparative genomics. Nucleic Acids Res (2014) 1.44

The Aspergillus Genome Database: multispecies curation and incorporation of RNA-Seq data to improve structural gene annotations. Nucleic Acids Res (2013) 1.44

The Capsaspora genome reveals a complex unicellular prehistory of animals. Nat Commun (2013) 1.39

An evaluation of the accuracy and speed of metagenome analysis tools. Sci Rep (2016) 1.36

INDIGO - INtegrated data warehouse of microbial genomes with examples from the red sea extremophiles. PLoS One (2013) 1.36

Integrating pathways of Parkinson's disease in a molecular interaction map. Mol Neurobiol (2013) 1.34

Unique features of the loblolly pine (Pinus taeda L.) megagenome revealed through sequence annotation. Genetics (2014) 1.32

The challenge of increasing Pfam coverage of the human proteome. Database (Oxford) (2013) 1.30

The Genome Database for Rosaceae (GDR): year 10 update. Nucleic Acids Res (2013) 1.28

Screening of candidate regulators for cellulase and hemicellulase production in Trichoderma reesei and identification of a factor essential for cellulase production. Biotechnol Biofuels (2014) 1.26

CAMP: Collection of sequences and structures of antimicrobial peptides. Nucleic Acids Res (2013) 1.24

Phylogenetic patterns of emergence of new genes support a model of frequent de novo evolution. BMC Genomics (2013) 1.23

TFClass: an expandable hierarchical classification of human transcription factors. Nucleic Acids Res (2012) 1.22

The genome of the alga-associated marine flavobacterium Formosa agariphila KMM 3901T reveals a broad potential for degradation of algal polysaccharides. Appl Environ Microbiol (2013) 1.22

The genomes of two key bumblebee species with primitive eusocial organization. Genome Biol (2015) 1.21

Genome of the human hookworm Necator americanus. Nat Genet (2014) 1.18

A chaperome subnetwork safeguards proteostasis in aging and neurodegenerative disease. Cell Rep (2014) 1.17

A genome scale metabolic network for rice and accompanying analysis of tryptophan, auxin and serotonin biosynthesis regulation under biotic stress. Rice (N Y) (2013) 1.16

Giving structure to the biofilm matrix: an overview of individual strategies and emerging common themes. FEMS Microbiol Rev (2015) 1.16

The Plasmodiophora brassicae genome reveals insights in its life cycle and ancestry of chitin synthases. Sci Rep (2015) 1.15

Re-annotation of the CAZy genes of Trichoderma reesei and transcription in the presence of lignocellulosic substrates. Microb Cell Fact (2012) 1.14

A method for WD40 repeat detection and secondary structure prediction. PLoS One (2013) 1.14

How we feel: ion channel partnerships that detect mechanical inputs and give rise to touch and pain perception. Neuron (2012) 1.14

Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models. BMC Bioinformatics (2014) 1.14

Functional assignment of metagenomic data: challenges and applications. Brief Bioinform (2012) 1.13

Genomic and secretomic analyses reveal unique features of the lignocellulolytic enzyme system of Penicillium decumbens. PLoS One (2013) 1.12

Single-particle EM reveals extensive conformational variability of the Ltn1 E3 ligase. Proc Natl Acad Sci U S A (2013) 1.11

A First Insight into the Genome of the Filter-Feeder Mussel Mytilus galloprovincialis. PLoS One (2016) 1.11

SWI/SNF complex in disorder: SWItching from malignancies to intellectual disability. Epigenetics (2012) 1.10

CellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources. Nucleic Acids Res (2012) 1.09

Iron-binding haemerythrin RING ubiquitin ligases regulate plant iron responses and accumulation. Nat Commun (2013) 1.09

CaPSID: a bioinformatics platform for computational pathogen sequence identification in human genomes and transcriptomes. BMC Bioinformatics (2012) 1.08

OrtholugeDB: a bacterial and archaeal orthology resource for improved comparative genomic analysis. Nucleic Acids Res (2012) 1.08

Chemical pulldown reveals dynamic pseudouridylation of the mammalian transcriptome. Nat Chem Biol (2015) 1.07

Protein function prediction by massive integration of evolutionary analyses and multiple data sources. BMC Bioinformatics (2013) 1.07

Exceptionally widespread nanomachines composed of type IV pilins: the prokaryotic Swiss Army knives. FEMS Microbiol Rev (2014) 1.05

Genome analysis of a major urban malaria vector mosquito, Anopheles stephensi. Genome Biol (2014) 1.05

Mechanistic diversity of radical S-adenosylmethionine (SAM)-dependent methylation. J Biol Chem (2014) 1.05

The human phosphatase interactome: An intricate family portrait. FEBS Lett (2012) 1.04

Identification of the Hevea brasiliensis AP2/ERF superfamily by RNA sequencing. BMC Genomics (2013) 1.04

Comparative analyses of nonpathogenic, opportunistic, and totally pathogenic mycobacteria reveal genomic and biochemical variabilities and highlight the survival attributes of Mycobacterium tuberculosis. MBio (2014) 1.04

The MG-RAST metagenomics database and portal in 2015. Nucleic Acids Res (2015) 1.03

Discovery of a new ATP-binding motif involved in peptidic azoline biosynthesis. Nat Chem Biol (2014) 1.03

Simultaneous transcriptome analysis of Sorghum and Bipolaris sorghicola by using RNA-seq in combination with de novo transcriptome assembly. PLoS One (2013) 1.03

Engineering ecosystems and synthetic ecologies. Mol Biosyst (2012) 1.02

RhesusBase: a knowledgebase for the monkey research community. Nucleic Acids Res (2012) 1.02

MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data. Nucleic Acids Res (2014) 1.01

SS18 together with animal-specific factors defines human BAF-type SWI/SNF complexes. PLoS One (2012) 1.01

Articles cited by this

Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet (2000) 336.52

The Pfam protein families database. Nucleic Acids Res (2009) 37.98

KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Res (2009) 28.60

The ENZYME database in 2000. Nucleic Acids Res (2000) 23.85

InterProScan: protein domains identifier. Nucleic Acids Res (2005) 18.82

The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res (2006) 13.44

A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids Res (2010) 11.42

Ongoing and future developments at the Universal Protein Resource. Nucleic Acids Res (2010) 11.32

PRINTS and its automatic supplement, prePRINTS. Nucleic Acids Res (2003) 10.01

The IntAct molecular interaction database in 2010. Nucleic Acids Res (2009) 9.85

SMART 6: recent updates and new developments. Nucleic Acids Res (2008) 9.80

PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium. Nucleic Acids Res (2009) 9.14

TIGRFAMs and Genome Properties: tools for the assignment of molecular function and biological process in prokaryotic genomes. Nucleic Acids Res (2006) 8.89

The ProDom database of protein domain families: more emphasis on 3D. Nucleic Acids Res (2005) 7.66

PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Res (2009) 7.47

The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res (2009) 7.33

Integrating biological data--the Distributed Annotation System. BMC Bioinformatics (2008) 6.56

MEROPS: the peptidase database. Nucleic Acids Res (2009) 5.33

Enzyme-specific profiles for genome annotation: PRIAM. Nucleic Acids Res (2003) 5.10

SUPERFAMILY 1.75 including a domain-centric gene ontology method. Nucleic Acids Res (2010) 3.34

HAMAP: a database of completely sequenced microbial proteome sets and manually curated microbial protein families in UniProtKB/Swiss-Prot. Nucleic Acids Res (2008) 3.29

Formalization of taxon-based constraints to detect inconsistencies in annotation and ontology development. BMC Bioinformatics (2010) 3.10

PIRSF family classification system for protein functional and evolutionary analysis. Evol Bioinform Online (2007) 3.03

UniPathway: a resource for the exploration and annotation of metabolic pathways. Nucleic Acids Res (2011) 2.45

Gene3D: merging structure and function for a Thousand genomes. Nucleic Acids Res (2009) 2.34

Reactome knowledgebase of human biological pathways and processes. Methods Mol Biol (2011) 1.91

The InterPro BioMart: federated query and web service access to the InterPro Resource. Database (Oxford) (2011) 1.75

Dasty3, a WEB framework for DAS. Bioinformatics (2011) 1.67

Articles by these authors

Initial sequencing and comparative analysis of the mouse genome. Nature (2002) 96.15

A method and server for predicting damaging missense mutations. Nat Methods (2010) 78.53

The Pfam protein families database. Nucleic Acids Res (2004) 56.46

The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res (2003) 52.80

Human non-synonymous SNPs: server and survey. Nucleic Acids Res (2002) 50.45

Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature (2002) 45.19

A human gut microbial gene catalogue established by metagenomic sequencing. Nature (2010) 43.63

miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res (2006) 39.25

Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res (2003) 38.75

The Pfam protein families database. Nucleic Acids Res (2009) 37.98

Genome sequence of the human malaria parasite Plasmodium falciparum. Nature (2002) 37.89

Pfam: clans, web tools and services. Nucleic Acids Res (2006) 34.83

The Pfam protein families database. Nucleic Acids Res (2011) 33.46

The Pfam protein families database. Nucleic Acids Res (2007) 30.53

UniProt: the Universal Protein knowledgebase. Nucleic Acids Res (2004) 29.05

Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol (2011) 28.61

The Carbohydrate-Active EnZymes database (CAZy): an expert resource for Glycogenomics. Nucleic Acids Res (2008) 27.83

Comparative metagenomics of microbial communities. Science (2005) 25.88

Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res (2005) 25.49

InterPro: the integrative protein signature database. Nucleic Acids Res (2008) 25.07

Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res (2002) 25.06

The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res (2003) 24.72

Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature (2004) 24.40

Enterotypes of the human gut microbiome. Nature (2011) 24.36

Comparative assessment of large-scale data sets of protein-protein interactions. Nature (2002) 24.25

The Universal Protein Resource (UniProt). Nucleic Acids Res (2005) 23.66

Rfam: an RNA family database. Nucleic Acids Res (2003) 22.93

The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res (2006) 22.70

Pfam: the protein families database. Nucleic Acids Res (2013) 22.48

PANTHER: a library of protein families and subfamilies indexed by function. Genome Res (2003) 21.64

Proteome survey reveals modularity of the yeast cell machinery. Nature (2006) 20.77

STRING 8--a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res (2008) 20.62

The genome sequence of the malaria mosquito Anopheles gambiae. Science (2002) 20.36

SMART 4.0: towards genomic data integration. Nucleic Acids Res (2004) 19.37

The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res (2004) 18.75

The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res (2010) 18.73

STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res (2012) 18.26

InterPro, progress and status in 2005. Nucleic Acids Res (2005) 17.53