count: 11 list: - accessibility: Open access additionDate: '2022-09-05T20:29:10.393972Z' biotoolsCURIE: biotools:netsurfp-3.0 biotoolsID: netsurfp-3.0 collectionID: [] community: null confidence_flag: tool cost: Free of charge credit: - email: paolo.marcatili@gmail.com fundrefid: null gridid: null name: Paolo Marcatili note: null orcidid: https://orcid.org/0000-0003-2615-5695 rorid: null typeEntity: Person typeRole: [] url: null - email: null fundrefid: null gridid: null name: Jeppe Hallgren note: null orcidid: https://orcid.org/0000-0001-5377-2643 rorid: null typeEntity: null typeRole: [] url: null - email: null fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: https://orcid.org/0000-0002-9412-9643 rorid: null typeEntity: null typeRole: [] url: null - email: null fundrefid: null gridid: null name: Ole Winther note: null orcidid: https://orcid.org/0000-0002-1966-3205 rorid: null typeEntity: null typeRole: [] url: null - email: null fundrefid: null gridid: null name: Morten Nielsen note: null orcidid: https://orcid.org/0000-0001-7885-4311 rorid: null typeEntity: null typeRole: [] url: null - email: null fundrefid: null gridid: null name: Bent Petersen note: null orcidid: https://orcid.org/0000-0002-2472-8317 rorid: null typeEntity: null typeRole: [] url: null - email: null fundrefid: null gridid: null name: Magnus Haraldson Høie note: null orcidid: https://orcid.org/0000-0003-2567-758X rorid: null typeEntity: null typeRole: [] url: null - email: null fundrefid: null gridid: null name: Erik Nicolas Kiehl note: null orcidid: null rorid: null typeEntity: null typeRole: [] url: null description: Accurate and fast prediction of protein structural features by protein language models and deep learning. 
documentation: [] download: [] editPermission: authors: [] type: private elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Protein sequence uri: http://edamontology.org/data_2976 format: - term: FASTA uri: http://edamontology.org/format_1929 note: null operation: - term: Protein secondary structure comparison uri: http://edamontology.org/operation_2488 - term: Protein secondary structure prediction uri: http://edamontology.org/operation_0267 - term: Protein geometry calculation uri: http://edamontology.org/operation_0249 output: [] homepage: https://services.healthtech.dtu.dk/service.php?NetSurfP-3.0 homepage_status: 0 language: [] lastUpdate: '2022-09-05T20:29:10.396930Z' license: null link: - note: null type: - Other url: https://dtu.biolib.com/nsp3 maturity: null name: NetSurfP-3.0 operatingSystem: - Mac - Linux - Windows otherID: [] owner: Jennifer publication: - doi: 10.1093/NAR/GKAC439 metadata: abstract: © 2022 The Author(s). Published by Oxford University Press on behalf of Nucleic Acids Research.Recent advances in machine learning and natural language processing have made it possible to profoundly advance our ability to accurately predict protein structures and their functions. While such improvements are significantly impacting the fields of biology and biotechnology at large, such methods have the downside of high demands in terms of computing power and runtime, hampering their applicability to large datasets. Here, we present NetSurfP-3.0, a tool for predicting solvent accessibility, secondary structure, structural disorder and backbone dihedral angles for each residue of an amino acid sequence. This NetSurfP update exploits recent advances in pre-trained protein language models to drastically improve the runtime of its predecessor by two orders of magnitude, while displaying similar prediction performance. 
We assessed the accuracy of NetSurfP-3.0 on several independent test datasets and found it to consistently produce state-of-the-art predictions for each of its output features, with a runtime that is up to 600 times faster than the most commonly available methods performing the same tasks. The tool is freely available as a web server with a user-friendly interface to navigate the results, as well as a standalone downloadable package. authors: - name: Hoie M.H. - name: Kiehl E.N. - name: Petersen B. - name: Nielsen M. - name: Winther O. - name: Nielsen H. - name: Hallgren J. - name: Marcatili P. citationCount: 0 date: '2022-07-05T00:00:00Z' journal: Nucleic Acids Research title: 'NetSurfP-3.0: accurate and fast prediction of protein structural features by protein language models and deep learning' note: null pmcid: PMC9252760 pmid: '35648435' type: [] version: null relation: [] toolType: - Web application topic: - term: Protein secondary structure uri: http://edamontology.org/topic_3542 - term: Small molecules uri: http://edamontology.org/topic_0154 - term: Biotechnology uri: http://edamontology.org/topic_3297 - term: Protein structural motifs and surfaces uri: http://edamontology.org/topic_0166 - term: Structure prediction uri: http://edamontology.org/topic_0082 validated: 0 version: [] - accessibility: Open access additionDate: '2022-05-19T21:53:05.338767Z' biotoolsCURIE: biotools:netsolp biotoolsID: netsolp collectionID: [] community: null confidence_flag: tool cost: Free of charge credit: - email: null fundrefid: null gridid: null name: Alexander Rosenberg Johansen note: null orcidid: https://orcid.org/0000-0002-4993-7916 rorid: null typeEntity: null typeRole: [] url: null - email: null fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: https://orcid.org/0000-0002-9412-9643 rorid: null typeEntity: null typeRole: [] url: null - email: null fundrefid: null gridid: null name: Jose J Almagro Armenteros note: null orcidid: 
https://orcid.org/0000-0003-0111-1362 rorid: null typeEntity: null typeRole: [] url: null description: Prediction of solubility and usability of proteins expressed in E. coli documentation: [] download: [] editPermission: authors: [] type: private elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Protein sequence uri: http://edamontology.org/data_2976 format: - term: FASTA uri: http://edamontology.org/format_1929 note: null operation: - term: Protein solubility prediction uri: http://edamontology.org/operation_0409 output: [] homepage: https://services.healthtech.dtu.dk/service.php?NetSolP homepage_status: 0 language: [] lastUpdate: '2022-05-19T21:53:05.341257Z' license: null link: [] maturity: null name: NetSolP operatingSystem: - Mac - Linux - Windows otherID: [] owner: Jennifer publication: - doi: 10.1093/BIOINFORMATICS/BTAB801 metadata: abstract: '© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.MOTIVATION: Solubility and expression levels of proteins can be a limiting factor for large-scale studies and industrial production. By determining the solubility and expression directly from the protein sequence, the success rate of wet-lab experiments can be increased. RESULTS: In this study, we focus on predicting the solubility and usability for purification of proteins expressed in Escherichia coli directly from the sequence. Our model NetSolP is based on deep learning protein language models called transformers and we show that it achieves state-of-the-art performance and improves extrapolation across datasets. As we find current methods are built on biased datasets, we curate existing datasets by using strict sequence-identity partitioning and ensure that there is minimal bias in the sequences. 
AVAILABILITY AND IMPLEMENTATION: The predictor and data are available at https://services.healthtech.dtu.dk/service.php?NetSolP and the open-sourced code is available at https://github.com/tvinet/NetSolP-1.0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.' authors: - name: Thumuluri V. - name: Martiny H.-M. - name: Almagro Armenteros J.J. - name: Salomon J. - name: Nielsen H. - name: Johansen A.R. citationCount: 0 date: '2022-01-27T00:00:00Z' journal: Bioinformatics (Oxford, England) title: 'NetSolP: predicting protein solubility in Escherichia coli using language models' note: null pmcid: null pmid: '34849581' type: [] version: null relation: [] toolType: - Web application topic: - term: Gene expression uri: http://edamontology.org/topic_0203 - term: Protein expression uri: http://edamontology.org/topic_0108 - term: Literature and language uri: http://edamontology.org/topic_3068 validated: 0 version: [] - accessibility: null additionDate: '2015-12-17T14:23:00Z' biotoolsCURIE: biotools:signalp biotoolsID: signalp collectionID: - CBS community: null confidence_flag: null cost: Free of charge (with restrictions) credit: - email: null fundrefid: null gridid: null name: TN Petersen note: null orcidid: null rorid: null typeEntity: Person typeRole: - Developer url: null - email: null fundrefid: null gridid: null name: CBS note: null orcidid: null rorid: null typeEntity: Institute typeRole: - Provider url: null - email: hnielsen@cbs.dtu.dk fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: http://orcid.org/0000-0002-9412-9643 rorid: null typeEntity: null typeRole: - Developer url: null - email: hnielsen@cbs.dtu.dk fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: http://orcid.org/0000-0002-9412-9643 rorid: null typeEntity: Person typeRole: - Primary contact url: null description: Prediction of the presence and location of signal peptide cleavage sites in amino acid sequences from different 
organisms. documentation: - note: null type: - General url: http://www.cbs.dtu.dk/services/SignalP download: - note: null type: Source code url: http://www.cbs.dtu.dk/cgi-bin/sw_request?signalp version: null - note: null type: Binaries url: http://www.cbs.dtu.dk/cgi-bin/sw_request?signalp version: null editPermission: authors: - CBS type: group elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Sequence uri: http://edamontology.org/data_2044 format: - term: FASTA uri: http://edamontology.org/format_1929 note: predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms operation: - term: Protein signal peptide detection uri: http://edamontology.org/operation_0418 - term: Protein cleavage site prediction uri: http://edamontology.org/operation_0422 output: - data: term: Protein features uri: http://edamontology.org/data_1277 format: - term: GFF uri: http://edamontology.org/format_2305 - data: term: Sequence report uri: http://edamontology.org/data_2955 format: [] homepage: http://cbs.dtu.dk/services/SignalP/ homepage_status: 0 language: [] lastUpdate: '2019-11-11T14:45:04Z' license: Other link: - note: null type: - Repository url: http://www.cbs.dtu.dk/cgi-bin/sw_request?signalp maturity: Mature name: SignalP operatingSystem: - Linux - Mac otherID: - type: rrid value: rrid:SCR_015644 version: null owner: cbs_admin publication: - doi: 10.1038/nmeth.1701 metadata: abstract: '' authors: - name: Petersen T.N. - name: Brunak S. - name: Von Heijne G. - name: Nielsen H. 
citationCount: 6571 date: '2011-10-01T00:00:00Z' journal: Nature Methods title: 'SignalP 4.0: Discriminating signal peptides from transmembrane regions' note: null pmcid: null pmid: '21959131' type: - Primary version: null relation: [] toolType: - Command-line tool - Web application topic: - term: Protein sites, features and motifs uri: http://edamontology.org/topic_3510 validated: 1 version: - '4.1' - accessibility: null additionDate: '2015-01-21T13:29:25Z' biotoolsCURIE: biotools:tatp biotoolsID: tatp collectionID: [] community: null confidence_flag: null cost: Free of charge (with restrictions) credit: - email: null fundrefid: null gridid: null name: CBS note: null orcidid: null rorid: null typeEntity: Institute typeRole: - Provider url: null - email: hnielsen@cbs.dtu.dk fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: http://orcid.org/0000-0002-9412-9643 rorid: null typeEntity: Person typeRole: - Primary contact url: null description: Prediction of the presence and location of Twin-arginine signal peptide cleavage sites in bacteria. documentation: - note: null type: - General url: http://www.cbs.dtu.dk/services/TatP/instructions.php download: [] editPermission: authors: [] type: private elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Sequence uri: http://edamontology.org/data_2044 format: - term: FASTA uri: http://edamontology.org/format_1929 note: predicts the presence and location of Twin-arginine signal peptide cleavage sites in bacteria. 
Signal peptide/non-signal peptide prediction based on a combination of two artificial neural networks operation: - term: Protein signal peptide detection uri: http://edamontology.org/operation_0418 - term: Protein cleavage site prediction uri: http://edamontology.org/operation_0422 output: - data: term: Sequence report uri: http://edamontology.org/data_2955 format: - term: Binary format uri: http://edamontology.org/format_2333 homepage: http://cbs.dtu.dk/services/TatP/ homepage_status: 0 language: [] lastUpdate: '2018-12-16T14:15:31Z' license: Other link: - note: null type: - Software catalogue url: http://cbs.dtu.dk/services maturity: Emerging name: TatP operatingSystem: - Linux otherID: [] owner: CBS publication: - doi: 10.1186/1471-2105-6-167 metadata: abstract: 'Background: Proteins carrying twin-arginine (Tat) signal peptides are exported into the periplasmic compartment or extracellular environment independently of the classical Sec-dependent translocation pathway. To complement other methods for classical signal peptide prediction we here present a publicly available method, TatP, for prediction of bacterial Tat signal peptides. Results: We have retrieved sequence data for Tat substrates in order to train a computational method for discrimination of Sec and Tat signal peptides. The TatP method is able to positively classify 91% of 35 known Tat signal peptides and 84% of the annotated cleavage sites of these Tat signal peptides were correctly predicted. This method generates far less false positive predictions on various datasets than using simple pattern matching. Moreover, on the same datasets TatP generates less false positive predictions than a complementary rule based prediction method. Conclusion: The method developed here is able to discriminate Tat signal peptides from cytoplasmic proteins carrying a similar motif, as well as from Sec signal peptides, with high accuracy. 
The method allows filtering of input sequences based on Perl syntax regular expressions, whereas hydrophobicity discrimination of Tat- and Sec-signal peptides is carried out by an artificial neural network. A potential cleavage site of the predicted Tat signal peptide is also reported. The TatP prediction server is available as a public web server at http://www.cbs.dtu.dk/services/TatP/. © 2005 Bendtsen et al; licensee BioMed Central Ltd.' authors: - name: Bendtsen J.D. - name: Nielsen H. - name: Widdick D. - name: Palmer T. - name: Brunak S. citationCount: 377 date: '2005-07-02T00:00:00Z' journal: BMC Bioinformatics title: Prediction of twin-arginine signal peptides note: null pmcid: PMC1182353 pmid: '15992409' type: - Primary version: null relation: [] toolType: - Web application - Web service topic: - term: Sequence sites, features and motifs uri: http://edamontology.org/topic_0160 validated: 1 version: - '1.0' - accessibility: null additionDate: '2015-01-21T13:29:15Z' biotoolsCURIE: biotools:netacet biotoolsID: netacet collectionID: [] community: null confidence_flag: null cost: Free of charge (with restrictions) credit: - email: null fundrefid: null gridid: null name: CBS note: null orcidid: null rorid: null typeEntity: Institute typeRole: - Provider url: null - email: hnielsen@cbs.dtu.dk fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: null rorid: null typeEntity: Person typeRole: - Primary contact url: null - email: nblom@kt.dtu.dk fundrefid: null gridid: null name: Nicolaj Sorgenfrei Blom note: null orcidid: http://orcid.org/0000-0001-7787-7853 rorid: null typeEntity: Person typeRole: [] url: null description: Prediction of substrates of N-acetyltransferase A (NatA). 
documentation: - note: null type: - General url: http://www.cbs.dtu.dk/services/NetAcet/ download: [] editPermission: authors: [] type: private elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Sequence uri: http://edamontology.org/data_2044 format: - term: FASTA uri: http://edamontology.org/format_1929 note: predicts substrates of N-acetyltransferase A (NatA) operation: - term: Prediction and recognition uri: http://edamontology.org/operation_2423 output: - data: term: Protein features uri: http://edamontology.org/data_1277 format: - term: Textual format uri: http://edamontology.org/format_2330 homepage: http://cbs.dtu.dk/services/NetAcet/ homepage_status: 0 language: [] lastUpdate: '2018-12-16T13:48:08Z' license: Other link: - note: null type: - Software catalogue url: http://cbs.dtu.dk/services maturity: Emerging name: NetAcet operatingSystem: - Linux otherID: [] owner: CBS publication: - doi: 10.1093/bioinformatics/bti130 metadata: abstract: 'Summary: We present here a neural network based method for prediction of N-terminal acetylation - by far the most abundant post-translational modification in eukaryotes. The method was developed on a yeast dataset for N-acetyltransferase A (NatA) acetylation, which is the type of N-acetylation for which most examples are known and for which orthologs have been found in several eukaryotes. We obtain correlation coefficients close to 0.7 on yeast data and a sensitivity up to 74% on mammalian data, suggesting that the method is valid for eukaryotic NatA orthologs. © The Author 2004. Published by Oxford University Press. All rights reserved.' authors: - name: Kiemer L. - name: Bendtsen J.D. - name: Blom N. 
citationCount: 103 date: '2005-04-01T00:00:00Z' journal: Bioinformatics title: 'NetAcet: Prediction of N-terminal acetylation sites' note: null pmcid: null pmid: '15539450' type: - Primary version: null relation: [] toolType: - Web application topic: - term: Sequence analysis uri: http://edamontology.org/topic_0080 validated: 1 version: - '1.0' - accessibility: null additionDate: '2015-01-21T13:29:24Z' biotoolsCURIE: biotools:secretomep biotoolsID: secretomep collectionID: [] community: null confidence_flag: null cost: Free of charge (with restrictions) credit: - email: null fundrefid: null gridid: null name: CBS note: null orcidid: null rorid: null typeEntity: Institute typeRole: - Provider url: null - email: hnielsen@cbs.dtu.dk fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: null rorid: null typeEntity: Person typeRole: - Primary contact url: null - email: nblom@kt.dtu.dk fundrefid: null gridid: null name: Nikolaj Sorgenfrei Blom note: null orcidid: http://orcid.org/0000-0001-7787-7853 rorid: null typeEntity: Person typeRole: [] url: null description: Predictions of non-classical (i.e. not signal peptide triggered) protein secretion. 
documentation: - note: null type: - General url: http://www.cbs.dtu.dk/services/SecretomeP/instructions.php download: [] editPermission: authors: [] type: private elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Sequence uri: http://edamontology.org/data_2044 format: - term: FASTA uri: http://edamontology.org/format_1929 note: predicts non-classical protein secretion operation: - term: Protein subcellular localisation prediction uri: http://edamontology.org/operation_2489 output: - data: term: Protein features uri: http://edamontology.org/data_1277 format: - term: Textual format uri: http://edamontology.org/format_2330 homepage: http://cbs.dtu.dk/services/SecretomeP/ homepage_status: 0 language: [] lastUpdate: '2018-12-16T13:24:34Z' license: Other link: - note: null type: - Software catalogue url: http://cbs.dtu.dk/services maturity: Emerging name: SecretomeP operatingSystem: - Linux otherID: [] owner: CBS publication: - doi: 10.1093/protein/gzh037 metadata: abstract: We present a sequence-based method, SecretomeP, for the prediction of mammalian secretory proteins targeted to the non-classical secretory pathway, i.e. proteins without an N-terminal signal peptide. So far only a limited number of proteins have been shown experimentally to enter the non-classical secretory pathway. These are mainly fibroblast growth factors, interleukins and galectins found in the extracellular matrix. We have discovered that certain pathway-independent features are shared among secreted proteins. The method presented here is also capable of predicting (signal peptide-containing) secretory proteins where only the mature part of the protein has been annotated or cases where the signal peptide remains uncleaved. By scanning the entire human proteome we identified new proteins potentially undergoing non-classical secretion. Predictions can be made at http://www.cbs.dtu.dk/services/SecretomeP. 
authors: - name: Bendtsen J.D. - name: Jensen L.J. - name: Blom N. - name: Von Heijne G. - name: Brunak S. citationCount: 863 date: '2004-04-01T00:00:00Z' journal: Protein Engineering, Design and Selection title: Feature-based prediction of non-classical and leaderless protein secretion note: null pmcid: null pmid: '15115854' type: - Primary version: null - doi: null metadata: abstract: 'Background: We present an overview of bacterial non-classical secretion and a prediction method for identification of proteins following signal peptide independent secretion pathways. We have compiled a list of proteins found extracellularly despite the absence of a signal peptide. Some of these proteins also have known roles in the cytoplasm, which means they could be so-called "moon-lightning" proteins having more than one function. Results: A thorough literature search was conducted to compile a list of currently known bacterial non-classically secreted proteins. Pattern finding methods were applied to the sequences in order to identify putative signal sequences or motifs responsible for their secretion. We have found no signal or motif characteristic to any majority of the proteins in the compiled list of non-classically secreted proteins, and conclude that these proteins, indeed, seem to be secreted in a novel fashion. However, we also show that the apparently non-classically secreted proteins are still distinguished from cellular proteins by properties such as amino acid composition, secondary structure and disordered regions. Specifically, prediction of disorder reveals that bacterial secretory proteins are more structurally disordered than their cytoplasmic counterparts. Finally, artificial neural networks were used to construct protein feature based methods for identification of non-classically secreted proteins in both Gram-positive and Gram-negative bacteria. 
Conclusion: We present a publicly available prediction method capable of discriminating between this group of proteins and other proteins, thus allowing for the identification of novel non-classically secreted proteins. We suggest candidates for non-classically secreted proteins in Escherichia coli and Bacillus subtilis. The prediction method is available online. © 2005 Bendtsen et al; licensee BioMed Central Ltd.' authors: - name: Bendtsen J.D. - name: Kiemer L. - name: Fausboll A. - name: Brunak S. citationCount: 454 date: '2005-10-07T00:00:00Z' journal: BMC Microbiology title: Non-classical protein secretion in bacteria note: null pmcid: null pmid: '16212653' type: - Other version: null relation: [] toolType: - Command-line tool - Web application topic: - term: Sequence analysis uri: http://edamontology.org/topic_0080 validated: 1 version: - '2.0' - accessibility: null additionDate: '2015-06-29T10:27:52Z' biotoolsCURIE: biotools:targetp biotoolsID: targetp collectionID: [] community: null confidence_flag: null cost: Free of charge (with restrictions) credit: - email: null fundrefid: null gridid: null name: CBS note: null orcidid: null rorid: null typeEntity: Institute typeRole: - Provider url: null - email: hnielsen@cbs.dtu.dk fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: http://orcid.org/0000-0002-9412-9643 rorid: null typeEntity: Person typeRole: - Primary contact url: null description: Prediction of the subcellular location of eukaryotic proteins. 
documentation: - note: null type: - General url: http://www.cbs.dtu.dk/services/TargetP/instructions.php download: [] editPermission: authors: [] type: private elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Sequence uri: http://edamontology.org/data_2044 format: - term: FASTA uri: http://edamontology.org/format_1929 note: Predicts the subcellular location of eukaryotic proteins operation: - term: Protein cleavage site prediction uri: http://edamontology.org/operation_0422 output: - data: term: Protein features uri: http://edamontology.org/data_1277 format: - term: Textual format uri: http://edamontology.org/format_2330 homepage: http://cbs.dtu.dk/services/TargetP/ homepage_status: 0 language: [] lastUpdate: '2018-12-14T09:42:48Z' license: Other link: - note: null type: - Software catalogue url: http://cbs.dtu.dk/services maturity: Emerging name: TargetP operatingSystem: - Linux otherID: [] owner: CBS publication: - doi: 10.1006/jmbi.2000.3903 metadata: abstract: A neural network-based tool, TargetP, for large-scale subcellular location prediction of newly identified proteins has been developed. Using N-terminal sequence information only, it discriminates between proteins destined for the mitochondrion, the chloroplast, the secretory pathway, and 'other' localizations with a success rate of 85% (plant) or 90% (non-plant) on redundancy-reduced test sets. From a TargetP analysis of the recently sequenced Arabidopsis thaliana chromosomes 2 and 4 and the Ensembl Homo sapiens protein set, we estimate that 10% of all plant proteins are mitochondrial and 14% chloroplastic, and that the abundance of secretory proteins, in both Arabidopsis and Homo, is around 10%. TargetP also predicts cleavage sites with levels of correctly predicted sites ranging from approximately 40% to 50% (chloroplastic and mitochondrial presequences) to above 70% (secretory signal peptides). 
TargetP is available as a web-server at http://www.cbs.dtu.dk/services/TargetP/. (C) 2000 Academic Press. authors: - name: Emanuelsson O. - name: Nielsen H. - name: Brunak S. - name: Von Heijne G. citationCount: 3470 date: '2000-07-21T00:00:00Z' journal: Journal of Molecular Biology title: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence note: null pmcid: null pmid: '10891285' type: - Primary version: null relation: [] toolType: - Command-line tool - Web application - Web service topic: - term: Protein sites, features and motifs uri: http://edamontology.org/topic_3510 validated: 1 version: - '1.1' - accessibility: null additionDate: '2015-09-12T12:50:16Z' biotoolsCURIE: biotools:chlorop biotoolsID: chlorop collectionID: [] community: null confidence_flag: null cost: Free of charge (with restrictions) credit: - email: null fundrefid: null gridid: null name: CBS note: null orcidid: null rorid: null typeEntity: Institute typeRole: - Provider url: null - email: null fundrefid: null gridid: null name: null note: null orcidid: null rorid: null typeEntity: Person typeRole: - Primary contact url: http://www.bioinformatics.dtu.dk/english/Service/Contact - email: hnielsen@bioinformatics.dtu.dk fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: http://orcid.org/0000-0002-9412-9643 rorid: null typeEntity: null typeRole: [] url: null description: Prediction of presence of chloroplast transit peptides and their cleavage sites in plant proteins. 
documentation: - note: null type: - Citation instructions url: http://www.cbs.dtu.dk/services/ChloroP - note: null type: - General url: http://www.cbs.dtu.dk/services/ChloroP/pages/instr.php download: - note: null type: Source code url: http://www.cbs.dtu.dk/services/doc/chlorop-1.1.readme version: null - note: null type: Binaries url: http://www.cbs.dtu.dk/services/doc/chlorop-1.1.readme version: null editPermission: authors: [] type: private elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Sequence uri: http://edamontology.org/data_2044 format: - term: FASTA uri: http://edamontology.org/format_1929 note: predicts the presence of chloroplast transit peptides (cTP) operation: - term: Protein cleavage site prediction uri: http://edamontology.org/operation_0422 output: - data: term: Protein features uri: http://edamontology.org/data_1277 format: - term: Textual format uri: http://edamontology.org/format_2330 homepage: http://cbs.dtu.dk/services/ChloroP/ homepage_status: 0 language: [] lastUpdate: '2018-12-14T09:40:07Z' license: Other link: - note: null type: - Repository url: http://www.cbs.dtu.dk/services/doc/chlorop-1.1.readme maturity: Mature name: ChloroP operatingSystem: - Linux - Windows - Mac otherID: [] owner: CBS publication: - doi: 10.1110/ps.8.5.978 metadata: abstract: We present a neural network based method (ChloroP) for identifying chloroplast transit peptides and their cleavage sites. Using cross- validation, 88% of the sequences in our homology reduced training set were correctly classified as transit peptides or nontransit peptides. This performance level is well above that of the publicly available chloroplast localization predictor PSORT. Cleavage sites are predicted using a scoring matrix derived by an automatic motif-finding algorithm. 
Approximately 60% of the known cleavage sites in our sequence collection were predicted to within ±2 residues from the cleavage sites given in SWISS-PROT. An analysis of 715 Arabidopsis thaliana sequences from SWISS-PROT suggests that the ChloroP method should be useful for the identification of putative transit peptides in genome-wide sequence data. The ChloroP predictor is available as a web- server at http://www.cbs.dtu.dk/services/ChloroP/. authors: - name: Emanuelsson O. - name: Nielsen H. - name: Von Heijne G. citationCount: 1467 date: '1999-01-01T00:00:00Z' journal: Protein Science title: ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites note: null pmcid: PMC2144330 pmid: '10338008' type: - Primary version: null - doi: 10.1110/ps.8.5.978 metadata: abstract: We present a neural network based method (ChloroP) for identifying chloroplast transit peptides and their cleavage sites. Using cross- validation, 88% of the sequences in our homology reduced training set were correctly classified as transit peptides or nontransit peptides. This performance level is well above that of the publicly available chloroplast localization predictor PSORT. Cleavage sites are predicted using a scoring matrix derived by an automatic motif-finding algorithm. Approximately 60% of the known cleavage sites in our sequence collection were predicted to within ±2 residues from the cleavage sites given in SWISS-PROT. An analysis of 715 Arabidopsis thaliana sequences from SWISS-PROT suggests that the ChloroP method should be useful for the identification of putative transit peptides in genome-wide sequence data. The ChloroP predictor is available as a web- server at http://www.cbs.dtu.dk/services/ChloroP/. authors: - name: Emanuelsson O. - name: Nielsen H. - name: Von Heijne G. 
citationCount: 1467 date: '1999-01-01T00:00:00Z' journal: Protein Science title: ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites note: null pmcid: PMC2144330 pmid: '10338008' type: - Other version: null relation: [] toolType: - Command-line tool - Web application topic: - term: Protein sites, features and motifs uri: http://edamontology.org/topic_3510 validated: 1 version: - '1.1' - accessibility: null additionDate: '2017-08-25T13:20:00Z' biotoolsCURIE: biotools:deeploc biotoolsID: deeploc collectionID: [] community: null confidence_flag: null cost: null credit: - email: hnielsen@cbs.dtu.dk fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: null rorid: null typeEntity: Person typeRole: - Primary contact url: null description: Prediction of eukaryotic protein subcellular localization using deep learning. documentation: - note: null type: - User manual url: http://www.cbs.dtu.dk/services/DeepLoc/instructions.php download: [] editPermission: authors: [] type: private elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Sequence set (protein) uri: http://edamontology.org/data_1233 format: - term: FASTA uri: http://edamontology.org/format_1929 note: null operation: - term: Protein subcellular localisation prediction uri: http://edamontology.org/operation_2489 output: - data: term: Protein features uri: http://edamontology.org/data_1277 format: - term: HTML uri: http://edamontology.org/format_2331 - data: term: Plot uri: http://edamontology.org/data_2884 format: - term: PNG uri: http://edamontology.org/format_3603 homepage: http://www.cbs.dtu.dk/services/DeepLoc/ homepage_status: 0 language: [] lastUpdate: '2018-12-10T12:58:53Z' license: Other link: [] maturity: Mature name: DeepLoc operatingSystem: - Linux - Windows - Mac otherID: [] owner: hnielsen@cbs.dtu.dk publication: - doi: 10.1093/bioinformatics/btx431 metadata: abstract: '© 
The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com Motivation: The prediction of eukaryotic protein subcellular localization is a well-studied topic in bioinformatics due to its relevance in proteomics research. Many machine learning methods have been successfully applied in this task, but in most of them, predictions rely on annotation of homologues from knowledge databases. For novel proteins where no annotated homologues exist, and for predicting the effects of sequence variants, it is desirable to have methods for predicting protein properties from sequence information only. Results: Here, we present a prediction algorithm using deep neural networks to predict protein subcellular localization relying only on sequence information. At its core, the prediction model uses a recurrent neural network that processes the entire protein sequence and an attention mechanism identifying protein regions important for the subcellular localization. The model was trained and tested on a protein dataset extracted from one of the latest UniProt releases, in which experimentally annotated proteins follow more stringent criteria than previously. We demonstrate that our model achieves a good accuracy (78% for 10 categories; 92% for membrane-bound or soluble), outperforming current state-of-the-art algorithms, including those relying on homology information. Availability and implementation: The method is available as a web server at http://www.cbs.dtu.dk/services/DeepLoc. Example code is available at https://github.com/JJAlmagro/subcellular_localization. The dataset is available at http://www.cbs.dtu.dk/services/DeepLoc/data.php. Contact: jjalma@dtu.dk.' authors: - name: Almagro Armenteros J.J. - name: Sonderby C.K. - name: Sonderby S.K. - name: Nielsen H. - name: Winther O. 
citationCount: 311 date: '2017-11-01T00:00:00Z' journal: Bioinformatics (Oxford, England) title: 'DeepLoc: prediction of protein subcellular localization using deep learning' note: null pmcid: null pmid: null type: [] version: null relation: [] toolType: - Web service topic: - term: Sequence analysis uri: http://edamontology.org/topic_0080 - term: Protein properties uri: http://edamontology.org/topic_0123 validated: 1 version: - '1.0' - accessibility: null additionDate: '2015-10-01T21:33:09Z' biotoolsCURIE: biotools:loctree3 biotoolsID: loctree3 collectionID: - Rostlab tools - PredictProtein community: null confidence_flag: null cost: Free of charge credit: - email: null fundrefid: null gridid: null name: Maximilian Hecht note: null orcidid: null rorid: null typeEntity: Person typeRole: - Developer url: null - email: null fundrefid: null gridid: null name: Guy Yachdav note: null orcidid: null rorid: null typeEntity: Person typeRole: - Developer url: null - email: null fundrefid: null gridid: null name: Timothy Karl note: null orcidid: null rorid: null typeEntity: Person typeRole: - Developer url: null - email: null fundrefid: null gridid: null name: Tobias Hamp note: null orcidid: null rorid: null typeEntity: Person typeRole: - Developer url: null - email: null fundrefid: null gridid: null name: Tatyana Goldberg note: null orcidid: null rorid: null typeEntity: Person typeRole: - Developer url: null - email: null fundrefid: null gridid: null name: Burkhard Rost note: null orcidid: null rorid: null typeEntity: null typeRole: - Contributor url: null - email: null fundrefid: null gridid: null name: Henrik Nielsen note: null orcidid: null rorid: null typeEntity: null typeRole: - Contributor url: null - email: null fundrefid: null gridid: null name: Technische Universität München note: null orcidid: null rorid: null typeEntity: Institute typeRole: - Provider url: null - email: null fundrefid: null gridid: null name: Alexander von Humboldt Foundation through German 
Federal Ministry for Education and Research note: null orcidid: null rorid: null typeEntity: Funding agency typeRole: - Contributor url: null - email: null fundrefid: null gridid: null name: Ernst Ludwig Ehrlich Studienwerk note: null orcidid: null rorid: null typeEntity: Funding agency typeRole: - Contributor url: null - email: null fundrefid: null gridid: null name: RostLab note: null orcidid: null rorid: null typeEntity: Institute typeRole: - Provider url: null - email: localization@rostlab.org fundrefid: null gridid: null name: null note: null orcidid: null rorid: null typeEntity: Person typeRole: - Primary contact url: https://rostlab.org/services/loctree3/ description: Prediction of protein subcellular localization in 18 classes for eukaryota, 6 for bacteria and 3 for archaea. documentation: - note: null type: - General url: https://rostlab.org/owiki/index.php/Loctree3 download: - note: null type: Source code url: https://rostlab.org/owiki/index.php/Packages version: null - note: null type: Binaries url: https://rostlab.org/owiki/index.php/Packages version: null editPermission: authors: [] type: private elixirCommunity: [] elixirNode: [] elixirPlatform: [] elixir_badge: 0 function: - cmd: null input: - data: term: Protein sequence record uri: http://edamontology.org/data_2886 format: - term: FASTA uri: http://edamontology.org/format_1929 note: 'Prediction of protein subcellular localization in 18 classes for eukaryota, 6 classes for bacteria and 3 for archaea using homology searches (PSI-BLAST) and machine learning (SVM). Users can provide one or more sequences in FASTA format. The prediction output contains: prediction score, from 1 (weak prediction) to 100 (strong prediction); one of 18 localization classes for eukaryota, 6 for bacteria and 3 for archaea; GO identifier; GO term; prediction source (PSI-BLAST or SVM).' operation: - term: Protein subcellular localisation prediction uri: http://edamontology.org/operation_2489 output: - data: term: Protein report 
uri: http://edamontology.org/data_0896 format: - term: TSV uri: http://edamontology.org/format_3475 homepage: https://rostlab.org/services/loctree3/ homepage_status: 0 language: - PHP - Java - Perl - JavaScript lastUpdate: '2018-12-10T12:58:51Z' license: GPL-3.0 link: - note: null type: - Repository url: https://rostlab.org/owiki/index.php/Packages maturity: Mature name: LocTree3 operatingSystem: - Linux otherID: [] owner: RostLab publication: - doi: 10.1093/nar/gku396 metadata: abstract: The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy Q18 = 80 ± 3% for eukaryotes and a six-state accuracy Q6 = 89 ± 4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. Response time of the unloaded server is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome not considering the generation of the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3. © 2014 The Author(s). authors: - name: Goldberg T. - name: Hecht M. - name: Hamp T. - name: Karl T. - name: Yachdav G. - name: Ahmed N. - name: Altermann U. - name: Angerer P. - name: Ansorge S. - name: Balasz K. 
- name: Bernhofer M. - name: Betz A. - name: Cizmadija L. - name: Do K.T. - name: Gerke J. - name: Greil R. - name: Joerdens V. - name: Hastreiter M. - name: Hembach K. - name: Herzog M. - name: Kalemanov M. - name: Kluge M. - name: Meier A. - name: Nasir H. - name: Neumaier U. - name: Prade V. - name: Reeb J. - name: Sorokoumov A. - name: Troshani I. - name: Vorberg S. - name: Waldraff S. - name: Zierer J. - name: Nielsen H. - name: Rost B. citationCount: 158 date: '2014-07-01T00:00:00Z' journal: Nucleic Acids Research title: LocTree3 prediction of localization note: null pmcid: null pmid: null type: - Primary version: null - doi: 10.1093/bioinformatics/bts390 metadata: abstract: 'Motivation: Subcellular localization is one aspect of protein function. Despite advances in high-throughput imaging, localization maps remain incomplete. Several methods accurately predict localization, but many challenges remain to be tackled. Results: In this study, we introduced a framework to predict localization in life''s three domains, including globular and membrane proteins (3 classes for archaea; 6 for bacteria and 18 for eukaryota). The resulting method, LocTree2, works well even for protein fragments. It uses a hierarchical system of support vector machines that imitates the cascading mechanism of cellular sorting. The method reaches high levels of sustained performance (eukaryota: Q18=65%, bacteria: Q6=84%). LocTree2 also accurately distinguishes membrane and non-membrane proteins. In our hands, it compared favorably with top methods when tested on new data. © The Author(s) 2012. Published by Oxford University Press.' authors: - name: Goldberg T. - name: Hamp T. - name: Rost B. 
citationCount: 75 date: '2012-09-01T00:00:00Z' journal: Bioinformatics title: LocTree2 predicts localization for all domains of life note: null pmcid: null pmid: null type: - Other version: null relation: [] toolType: - Command-line tool - Web application topic: - term: Sequence analysis uri: http://edamontology.org/topic_0080 - term: Protein properties uri: http://edamontology.org/topic_0123 - term: Cell biology uri: http://edamontology.org/topic_2229 validated: 1 version: - 1.0.8 next: ?page=2 previous: null