InterProScan 5: genome-scale protein function classification.

PubWeight™: 8.52 | Rank: Top 0.1%

Full text: PMC 3998142

Published in Bioinformatics on January 21, 2014

Authors

Philip Jones (1), David Binns, Hsin-Yu Chang, Matthew Fraser, Weizhong Li, Craig McAnulla, Hamish McWilliam, John Maslen, Alex Mitchell, Gift Nuka, Sebastien Pesseat, Antony F Quinn, Amaia Sangrador-Vegas, Maxim Scheremetjew, Siew-Yit Yong, Rodrigo Lopez, Sarah Hunter

Author Affiliations

1: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton CB10 1SD and Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK.

Articles citing this

(truncated to the top 100)

The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res (2014) 6.00

The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4). Stand Genomic Sci (2015) 2.55

Evolution of Darwin's finches and their beaks revealed by genome sequencing. Nature (2015) 2.50

The EMBL-EBI bioinformatics web and programmatic tools framework. Nucleic Acids Res (2015) 2.46

Transcriptional diversity during lineage commitment of human blood progenitors. Science (2014) 1.93

Structural genomic changes underlie alternative reproductive strategies in the ruff (Philomachus pugnax). Nat Genet (2015) 1.89

HAMAP in 2015: updates to the protein family classification and annotation system. Nucleic Acids Res (2014) 1.88

The genome sequences of Arachis duranensis and Arachis ipaensis, the diploid ancestors of cultivated peanut. Nat Genet (2016) 1.52

Massive expansion of Ubiquitination-related gene families within the Chlamydiae. Mol Biol Evol (2014) 1.47

WormBase 2016: expanding to enable helminth genomic research. Nucleic Acids Res (2015) 1.42

Genomic innovations, transcriptional plasticity and gene loss underlying the evolution and divergence of two highly polyphagous and invasive Helicoverpa pest species. BMC Biol (2017) 1.39

Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database. Nucleic Acids Res (2015) 1.35

Whole genome de novo assemblies of three divergent strains of rice, Oryza sativa, document novel gene space of aus and indica. Genome Biol (2014) 1.24

Improved Ribosome-Footprint and mRNA Measurements Provide Insights into Dynamics and Regulation of Yeast Translation. Cell Rep (2016) 1.22

The genomic basis of parasitism in the Strongyloides clade of nematodes. Nat Genet (2016) 1.09

InterPro in 2017-beyond protein family and domain annotations. Nucleic Acids Res (2016) 1.08

The Chlamydomonas genome project: a decade on. Trends Plant Sci (2014) 1.08

IMG/M: integrated genome and metagenome comparative data analysis system. Nucleic Acids Res (2016) 1.06

Genome of Rhodnius prolixus, an insect vector of Chagas disease, reveals unique adaptations to hematophagy and parasite infection. Proc Natl Acad Sci U S A (2015) 1.05

Whole-Genome Resequencing Reveals Extensive Natural Variation in the Model Green Alga Chlamydomonas reinhardtii. Plant Cell (2015) 1.04

Resources for Genetic and Genomic Analysis of Emerging Pathogen Acinetobacter baumannii. J Bacteriol (2015) 1.04

The genome of the yellow potato cyst nematode, Globodera rostochiensis, reveals insights into the basis of parasitism and virulence. Genome Biol (2016) 1.02

Saccharina genomes provide novel insight into kelp biology. Nat Commun (2015) 1.02

Melanisation of Aspergillus terreus-Is Butyrolactone I Involved in the Regulation of Both DOPA and DHN Types of Pigments in Submerged Culture? Microorganisms (2017) 0.99

Plasmodium knowlesi genome sequences from clinical isolates reveal extensive genomic dimorphism. PLoS One (2015) 0.98

Reconstructing a comprehensive transcriptome assembly of a white-pupal translocated strain of the pest fruit fly Bactrocera cucurbitae. Gigascience (2015) 0.97

Asgard archaea illuminate the origin of eukaryotic cellular complexity. Nature (2017) 0.97

The genome sequence of Pseudoplusia includens single nucleopolyhedrovirus and an analysis of p26 gene evolution in the baculoviruses. BMC Genomics (2015) 0.95

A less-biased analysis of metalloproteins reveals novel zinc coordination geometries. Proteins (2015) 0.95

A high-quality carrot genome assembly provides new insights into carotenoid accumulation and asterid genome evolution. Nat Genet (2016) 0.94

Genomic legacy of the African cheetah, Acinonyx jubatus. Genome Biol (2015) 0.93

Complete genome sequence and comparative genomic analysis of Mycobacterium massiliense JCM 15300 in the Mycobacterium abscessus group reveal a conserved genomic island MmGI-1 related to putative lipid metabolism. PLoS One (2014) 0.93

Three novel virophage genomes discovered from Yellowstone Lake metagenomes. J Virol (2014) 0.93

Triticeae resources in Ensembl Plants. Plant Cell Physiol (2014) 0.92

Genome assembly and geospatial phylogenomics of the bed bug Cimex lectularius. Nat Commun (2016) 0.92

The de novo genome assembly and annotation of a female domestic dromedary of North African origin. Mol Ecol Resour (2015) 0.92

ANISEED 2015: a digital framework for the comparative developmental biology of ascidians. Nucleic Acids Res (2015) 0.92

Stem cells and fluid flow drive cyst formation in an invertebrate excretory organ. Elife (2015) 0.91

The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI. Nucleic Acids Res (2015) 0.91

Helminth.net: expansions to Nematode.net and an introduction to Trematode.net. Nucleic Acids Res (2014) 0.91

Conservation analysis of the CydX protein yields insights into small protein identification and evolution. BMC Genomics (2014) 0.90

PlanMine - a mineable resource of planarian biology and biodiversity. Nucleic Acids Res (2015) 0.90

Sequencing of the complete genome of an araphid pennate diatom Synedra acus subsp. radians from Lake Baikal. Dokl Biochem Biophys (2015) 0.89

Genome sequence of Anoxybacillus ayderensis AB04(T) isolated from the Ayder hot spring in Turkey. Stand Genomic Sci (2015) 0.88

Functional and phylogenetic characterization of proteins detected in various nematode intestinal compartments. Mol Cell Proteomics (2015) 0.88

Rerouting Cellular Electron Flux To Increase the Rate of Biological Methane Production. Appl Environ Microbiol (2015) 0.88

Major bacterial lineages are essentially devoid of CRISPR-Cas viral defence systems. Nat Commun (2016) 0.88

Comparative transcriptomic analysis uncovers the complex genetic network for resistance to Sclerotinia sclerotiorum in Brassica napus. Sci Rep (2016) 0.87

MeSH ORA framework: R/Bioconductor packages to support MeSH over-representation analysis. BMC Bioinformatics (2015) 0.87

Predicting protein functions using incomplete hierarchical labels. BMC Bioinformatics (2015) 0.87

Complete genomes of Hairstreak butterflies, their speciation, and nucleo-mitochondrial incongruence. Sci Rep (2016) 0.87

Metagenomic resolution of microbial functions in deep-sea hydrothermal plumes across the Eastern Lau Spreading Center. ISME J (2015) 0.86

Schistosoma mansoni Egg, Adult Male and Female Comparative Gene Expression Analysis and Identification of Novel Genes by RNA-Seq. PLoS Negl Trop Dis (2015) 0.86

The crown-of-thorns starfish genome as a guide for biocontrol of this coral reef pest. Nature (2017) 0.86

The genome of the sparganosis tapeworm Spirometra erinaceieuropaei isolated from the biopsy of a migrating brain lesion. Genome Biol (2014) 0.86

InsectBase: a resource for insect genomes and transcriptomes. Nucleic Acids Res (2015) 0.86

Speciation in Cloudless Sulphurs Gleaned from Complete Genomes. Genome Biol Evol (2016) 0.86

Improvement of barley genome annotations by deciphering the Haruna Nijo genome. DNA Res (2015) 0.86

The genome sequence of the outbreeding globe artichoke constructed de novo incorporating a phase-aware low-pass sequencing strategy of F1 progeny. Sci Rep (2016) 0.86

United in diversity: mechanosensitive ion channels in plants. Annu Rev Plant Biol (2014) 0.86

The power of single molecule real-time sequencing technology in the de novo assembly of a eukaryotic genome. Sci Rep (2015) 0.86

PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res (2016) 0.86

Divergent and Convergent Evolution of Fungal Pathogenicity. Genome Biol Evol (2016) 0.85

Transcriptomic Complexity of Aspergillus terreus Velvet Gene Family under the Influence of Butyrolactone I. Microorganisms (2017) 0.85

Transcriptomic analysis reveals the roles of microtubule-related genes and transcription factors in fruit length regulation in cucumber (Cucumis sativus L.). Sci Rep (2015) 0.85

Legume information system (LegumeInfo.org): a key component of a set of federated data resources for the legume family. Nucleic Acids Res (2015) 0.85

IMA Genome-F 4: Draft genome sequences of Chrysoporthe austroafricana, Diplodia scrobiculata, Fusarium nygamai, Leptographium lundbergii, Limonomyces culmigenus, Stagonosporopsis tanaceti, and Thielaviopsis punctulata. IMA Fungus (2015) 0.85

Comparative genomics and metabolic profiling of the genus Lysobacter. BMC Genomics (2015) 0.85

WormBase ParaSite - a comprehensive resource for helminth genomics. Mol Biochem Parasitol (2016) 0.85

Repeated replacement of an intrabacterial symbiont in the tripartite nested mealybug symbiosis. Proc Natl Acad Sci U S A (2016) 0.85

Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics. F1000Res (2016) 0.84

Draft genome of the Arabidopsis thaliana phyllosphere bacterium, Williamsia sp. ARP1. Stand Genomic Sci (2016) 0.84

The genome of Leishmania panamensis: insights into genomics of the L. (Viannia) subgenus. Sci Rep (2015) 0.84

Negative example selection for protein function prediction: the NoGO database. PLoS Comput Biol (2014) 0.84

Comparative exomics of Phalaris cultivars under salt stress. BMC Genomics (2014) 0.84

Origin and evolution of lysyl oxidases. Sci Rep (2015) 0.84

A hybrid non-ribosomal peptide/polyketide synthetase containing fatty-acyl ligase (FAAL) synthesizes the β-amino fatty acid lipopeptides puwainaphycins in the Cyanobacterium Cylindrospermum alatosporum. PLoS One (2014) 0.84

Analysis of the role of the LH92_11085 gene of a biofilm hyper-producing Acinetobacter baumannii strain on biofilm formation and attachment to eukaryotic cells. Virulence (2016) 0.84

Whole genome sequence and manual annotation of Clostridium autoethanogenum, an industrially relevant bacterium. BMC Genomics (2015) 0.84

Degeneration of the nonrecombining regions in the mating-type chromosomes of the anther-smut fungi. Mol Biol Evol (2014) 0.84

Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.). DNA Res (2014) 0.84

Revisiting Vitis vinifera Subtilase Gene Family: A Possible Role in Grapevine Resistance against Plasmopara viticola. Front Plant Sci (2016) 0.84

Co-localisation of the blackleg resistance genes Rlm2 and LepR3 on Brassica napus chromosome A10. BMC Plant Biol (2014) 0.84

Polymyxin Resistance in Acinetobacter baumannii: Genetic Mutations and Transcriptomic Changes in Response to Clinically Relevant Dosage Regimens. Sci Rep (2016) 0.83

The genome of Onchocerca volvulus, agent of river blindness. Nat Microbiol (2016) 0.83

Gene family expansions and contractions are associated with host range in plant pathogens of the genus Colletotrichum. BMC Genomics (2016) 0.83

The genetic basis for ecological adaptation of the Atlantic herring revealed by genome sequencing. Elife (2016) 0.83

Whole-genome sequences of 13 endophytic bacteria isolated from shrub willow (Salix) grown in Geneva, New York. Genome Announc (2014) 0.83

Skipper genome sheds light on unique phenotypic traits and phylogeny. BMC Genomics (2015) 0.83

A New Chicken Genome Assembly Provides Insight into Avian Genome Structure. G3 (Bethesda) (2016) 0.82

The multicellularity genes of dictyostelid social amoebas. Nat Commun (2016) 0.82

An improved genome assembly uncovers prolific tandem repeats in Atlantic cod. BMC Genomics (2017) 0.82

HPMCD: the database of human microbial communities from metagenomic datasets and microbial reference genomes. Nucleic Acids Res (2015) 0.82

Defining the gene repertoire and spatiotemporal expression profiles of adhesion G protein-coupled receptors in zebrafish. BMC Genomics (2015) 0.82

Ebolavirus comparative genomics. FEMS Microbiol Rev (2015) 0.82

Why Close a Bacterial Genome? The Plasmid of Alteromonas Macleodii HOT1A3 is a Vector for Inter-Specific Transfer of a Flexible Genomic Island. Front Microbiol (2016) 0.82

Longitudinal genomic surveillance of Plasmodium falciparum malaria parasites reveals complex genomic architecture of emerging artemisinin resistance. Genome Biol (2017) 0.82

Genetic Diversity of Clostridium sporogenes PA 3679 Isolates Obtained from Different Sources as Resolved by Pulsed-Field Gel Electrophoresis and High-Throughput Sequencing. Appl Environ Microbiol (2015) 0.82

The Psp system of Mycobacterium tuberculosis integrates envelope stress-sensing and envelope-preserving functions. Mol Microbiol (2015) 0.82

Red clover (Trifolium pratense L.) draft genome provides a platform for trait improvement. Sci Rep (2015) 0.82

Articles cited by this

Basic local alignment search tool. J Mol Biol (1990) 659.07

Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet (2000) 336.52

Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol (2001) 66.87

SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods (2011) 33.90

The Pfam protein families database. Nucleic Acids Res (2011) 33.46

The ENZYME database in 2000. Nucleic Acids Res (2000) 23.85

PIRSF: family classification system at the Protein Information Resource. Nucleic Acids Res (2004) 19.62

InterProScan: protein domains identifier. Nucleic Acids Res (2005) 18.82

Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res (2011) 18.50

A combined transmembrane topology and signal peptide prediction method. J Mol Biol (2004) 15.77

InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res (2011) 13.45

A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids Res (2010) 11.42

SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Res (2011) 9.15

A new generation of homology search tools based on probabilistic inference. Genome Inform (2009) 8.93

PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res (2012) 8.54

The ProDom database of protein domain families: more emphasis on 3D. Nucleic Acids Res (2005) 7.66

The genome of woodland strawberry (Fragaria vesca). Nat Genet (2010) 5.86

New and continuing developments at PROSITE. Nucleic Acids Res (2012) 4.41

TIGRFAMs and Genome Properties in 2013. Nucleic Acids Res (2012) 3.45

SUPERFAMILY 1.75 including a domain-centric gene ontology method. Nucleic Acids Res (2010) 3.34

Sequence analysis of the genome of an oil-bearing tree, Jatropha curcas L. DNA Res (2010) 2.19

The genome sequence of the leaf-cutter ant Atta cephalotes reveals insights into its obligate symbiotic lifestyle. PLoS Genet (2011) 2.16

HAMAP in 2013, new developments in the protein family classification and annotation system. Nucleic Acids Res (2012) 2.16

The PRINTS database: a fine-grained protein sequence annotation and analysis resource--its status in 2012. Database (Oxford) (2012) 1.80

Gene3D: a domain-based resource for comparative genomics, functional annotation and protein network analysis. Nucleic Acids Res (2011) 1.65

Molecular network analysis of diseases and drugs in KEGG. Methods Mol Biol (2013) 1.27

Articles by these authors

Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res (2003) 38.75

UniProt: the Universal Protein knowledgebase. Nucleic Acids Res (2004) 29.05

Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol (2011) 28.61

InterPro: the integrative protein signature database. Nucleic Acids Res (2008) 25.07

The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res (2003) 24.72

The Universal Protein Resource (UniProt). Nucleic Acids Res (2005) 23.66

The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res (2006) 22.70

The Gene Ontology Annotation (GOA) Database: sharing knowledge in Uniprot with Gene Ontology. Nucleic Acids Res (2004) 18.75

InterPro, progress and status in 2005. Nucleic Acids Res (2005) 17.53

The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol (2007) 13.99

InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res (2011) 13.45

The Gene Ontology Annotation (GOA) project: implementation of GO in SWISS-PROT, TrEMBL, and InterPro. Genome Res (2003) 12.81

New developments in the InterPro database. Nucleic Acids Res (2007) 12.49

The EMBL Nucleotide Sequence Database. Nucleic Acids Res (2002) 12.05

UniProt archive. Bioinformatics (2004) 11.92

A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids Res (2010) 11.42

The GOA database in 2009--an integrated Gene Ontology Annotation resource. Nucleic Acids Res (2008) 10.21

EMBL Nucleotide Sequence Database in 2006. Nucleic Acids Res (2006) 9.72

CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics (2012) 8.32

Broadening the horizon--level 2.5 of the HUPO-PSI format for molecular interactions. BMC Biol (2007) 8.03

The EMBL Nucleotide Sequence Database. Nucleic Acids Res (2005) 7.18

The EMBL Nucleotide Sequence Database: major new developments. Nucleic Acids Res (2003) 7.17

The EMBL Nucleotide Sequence Database. Nucleic Acids Res (2004) 6.72

CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics (2010) 6.68

An evaluation of GO annotation retrieval for BioCreAtIvE and GOA. BMC Bioinformatics (2005) 6.58

The EBI SRS server-new features. Bioinformatics (2002) 6.38

Analysis Tool Web Services from the EMBL-EBI. Nucleic Acids Res (2013) 5.29

Petabyte-scale innovations at the European Nucleotide Archive. Nucleic Acids Res (2008) 5.21

EMBL Nucleotide Sequence Database: developments in 2005. Nucleic Acids Res (2006) 5.13

Improvements to services at the European Nucleotide Archive. Nucleic Acids Res (2009) 5.00

FFAS03: a server for profile--profile sequence alignments. Nucleic Acids Res (2005) 4.99

Artificial and natural duplicates in pyrosequencing reads of metagenomic data. BMC Bioinformatics (2010) 4.89

BioCatalogue: a universal catalogue of web services for the life sciences. Nucleic Acids Res (2010) 4.67

Web services at the European bioinformatics institute. Nucleic Acids Res (2007) 4.48

PRIDE: new developments and new datasets. Nucleic Acids Res (2007) 4.42

Community cyberinfrastructure for Advanced Microbial Ecology Research and Analysis: the CAMERA resource. Nucleic Acids Res (2010) 4.39

Fast and efficient searching of biological data resources--using EB-eye. Brief Bioinform (2010) 4.19

PRIDE: a public repository of protein and peptide identifications for the proteomics community. Nucleic Acids Res (2006) 4.19

Priorities for nucleotide trace, sequence and annotation data capture at the Ensembl Trace Archive and the EMBL Nucleotide Sequence Database. Nucleic Acids Res (2007) 3.84

Detection of large numbers of novel sequences in the metatranscriptomes of complex marine microbial communities. PLoS One (2008) 3.83

Web services at the European Bioinformatics Institute-2009. Nucleic Acids Res (2009) 3.78

The IMGT/HLA database. Nucleic Acids Res (2012) 3.61

WebMGA: a customizable web server for fast metagenomic sequence analysis. BMC Genomics (2011) 3.39

The European Bioinformatics Institute's data resources. Nucleic Acids Res (2003) 3.34

Dasty2, an Ajax protein DAS client. Bioinformatics (2008) 3.26

The EBI SRS server--recent developments. Bioinformatics (2002) 3.04

The IMGT/HLA database. Nucleic Acids Res (2010) 2.89

Bioinformatics Training Network (BTN): a community resource for bioinformatics trainers. Brief Bioinform (2011) 2.79

Prognostic significance of [18F]-misonidazole positron emission tomography-detected tumor hypoxia in patients with advanced head and neck cancer randomly assigned to chemoradiation with or without tirapazamine: a substudy of Trans-Tasman Radiation Oncology Group Study 98.02. J Clin Oncol (2006) 2.68

Facing growth in the European Nucleotide Archive. Nucleic Acids Res (2012) 2.67

Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering. BMC Bioinformatics (2008) 2.67

InterPro: an integrated documentation resource for protein families, domains and functional sites. Brief Bioinform (2002) 2.66

The IMGT/HLA database. Nucleic Acids Res (2008) 2.22

Identification of ribosomal RNA genes in metagenomic fragments. Bioinformatics (2009) 2.21