Aspergillosis is an umbrella term for a wide array of infections caused by Aspergillus species. Although aspergillosis is caused by multiple Aspergillus species, the majority of reported cases originate from ten species: Aspergillus fumigatus, Aspergillus flavus, Aspergillus niger, Aspergillus terreus, Aspergillus versicolor, Aspergillus lentulus, Aspergillus nidulans, Aspergillus glaucus, Aspergillus oryzae and Aspergillus ustus.

Figure: Conidia (phialoconidia) of Aspergillus fumigatus (image: PHIL 300).
A. fumigatus alone is responsible for over 90% of reported aspergillosis cases. The airborne conidia released by A. fumigatus are ubiquitous in the environment and are constantly inhaled by humans. In healthy individuals, the conidia are quickly eliminated by the immune system. However, in immunocompromised patients, Aspergillus can become an opportunistic pathogen. The mortality rate associated with aspergillosis and the emergence of strains resistant to current drugs reveal an urgent need to identify new targets for developing novel drugs and vaccines.

Fungal proteins secreted to the extracellular matrix and cell membrane are known to play an important role in host-parasite interactions and the establishment of infection. In this work, we have designed a computational pipeline that integrates data from high-throughput proteomic experiments with bioinformatic predictions to identify proteins secreted by fungi to the extracellular matrix and cell membrane (Figure 1). We have used this pipeline to predict secreted proteins and cell membrane proteins in ten Aspergillus species that cause aspergillosis. In addition, we have identified small secreted proteins (SSPs) and effector-like proteins (similar to agents of fungal plant pathogenesis) within the secretomes of these Aspergillus species.

A comparison of the Aspergillus secretomes with the human proteome revealed that at least 70% of each secretome shares no sequence similarity with human proteins; these proteins could become future candidates for fungal-specific drug targets. Moreover, an analysis of the antigenic qualities of the Aspergillus proteins revealed that proteins secreted to the extracellular matrix were significantly more antigenic than the cell membrane proteins or the complete proteome.


Computational pipeline for predicting fungal secretome

Figure 1: Schematic overview of the fungal secretome prediction pipeline.
Citation:

If you use our dataset, please cite:
Comparative systems analysis of the secretome of the opportunistic pathogen Aspergillus fumigatus and other Aspergillus species,
R.P. Vivek-Ananth, Karthikeyan Mohanraj, Muralidharan Vandanashree, Anupam Jhingran, James P. Craig and Areejit Samal*, Scientific Reports 8: 6617 (2018).
Supplementary tables of our manuscript include spreadsheets containing data hosted on this page.

Acknowledgements:

We are thankful to the developers of the following software and databases, which were used to construct the Aspertome page.

The proteomes of Aspergillus species were obtained from the following databases:
AspGD
FungiDB
Ensembl Genomes

The following bioinformatic prediction tools were used in our computational pipeline:
SignalP 4.1
Phobius
PredGPI
big-PI
TMHMM 2.0
PS SCAN
WoLF PSORT 0.2
TargetP 1.1
ProtComp 6

The following software was used to predict effector-like small secreted proteins (SSPs):
EffectorP

The following database was used to identify sequence homologs of Aspergillus proteins among known drug target proteins:
DrugBank

The following databases and tools were used to functionally annotate the secreted proteins:
Pfam
CAZy
dbCAN v5.0
HMMER3
TIGRFAM
SFLD
SMART
CDD
SUPERFAMILY
PRINTS
PANTHER
COILS
MobiDB-lite
FungiFun2
InterPro version 64.0

The following resources were employed to develop the web interface:
Drupal
MySQL
PHP
Apache

Our computational pipeline starts from the complete proteome of each Aspergillus species considered here.

In the initial phase of the pipeline, functional annotation on the subcellular localization of proteins with published experimental evidence was gathered from the UniProt database. First, proteins annotated in UniProt as intracellular with experimental evidence were filtered out and excluded from subsequent steps in the pipeline (Figure 1). Second, the remaining proteins, which have no experimental evidence for intracellular localization, were classified into two mutually exclusive categories (Figure 1). The first category contains proteins that are secreted to the extracellular matrix or localized to the cell membrane according to UniProt annotation with experimental evidence or compiled lists from high-throughput proteomic studies. The second category contains proteins with no experimental evidence, from UniProt or high-throughput proteomic studies, of being either secreted to the extracellular matrix or localized to the cell membrane.
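The initial triage described above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the `Evidence` record, its field names and the location labels are hypothetical stand-ins for the UniProt annotations and the compiled proteomic lists.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Triage(Enum):
    EXCLUDED_INTRACELLULAR = "excluded: experimentally intracellular"
    CATEGORY_1 = "experimental evidence: secreted or cell membrane"
    CATEGORY_2 = "no experimental evidence"


@dataclass
class Evidence:
    # Hypothetical record; field names are illustrative assumptions.
    # Experimentally supported UniProt location, or None if absent.
    uniprot_experimental_location: Optional[str] = None
    # True if the protein appears in a compiled high-throughput
    # proteomic list of secreted or cell membrane proteins.
    in_proteomic_lists: bool = False


def triage(ev: Evidence) -> Triage:
    # Step 1: experimentally intracellular proteins are removed first.
    if ev.uniprot_experimental_location == "intracellular":
        return Triage.EXCLUDED_INTRACELLULAR
    # Step 2: the remainder splits into two mutually exclusive categories.
    if (ev.uniprot_experimental_location in ("extracellular", "cell membrane")
            or ev.in_proteomic_lists):
        return Triage.CATEGORY_1
    return Triage.CATEGORY_2
```

Because the intracellular filter runs first, a protein can never land in both an excluded set and one of the two categories.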

The first category of experimentally verified proteins was subsequently checked for a signal peptide (using SignalP 4.1, Phobius and UniProt annotation with experimental evidence), a glycosylphosphatidylinositol (GPI) anchor (using PredGPI, big-PI and UniProt annotation with experimental evidence) or a transmembrane (TM) domain (using TMHMM 2.0, Phobius and UniProt annotation with experimental evidence), any of which confirms passage through the classical secretory pathway (Branch A in Figure 1). Classical secretory pathway proteins were then filtered by their predicted GPI anchors or TM domains to separate cell membrane proteins from extracellular proteins (Branch A in Figure 1). Proteins without a predicted signal peptide, GPI anchor or TM domain, but with experimental evidence from UniProt annotation or high-throughput proteomic studies of being secreted, were classified as extracellular proteins secreted through a non-classical secretion pathway (Branch A in Figure 1).
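As a rough sketch of the Branch A logic, the decision can be written as a small function over three boolean features. The function name, the output labels and the `secretion_evidence` flag are assumptions made for illustration; the text does not state what happens to a protein with cell membrane evidence only and no predicted features, so that case is marked unresolved here.

```python
def classify_branch_a(has_signal_peptide: bool,
                      has_gpi_anchor: bool,
                      has_tm_domain: bool,
                      secretion_evidence: bool = True) -> str:
    """Classify a category-1 (experimentally supported) protein."""
    # Any one of the three features indicates the classical pathway.
    classical = has_signal_peptide or has_gpi_anchor or has_tm_domain
    if classical:
        # A GPI anchor or TM domain marks a cell membrane protein.
        if has_gpi_anchor or has_tm_domain:
            return "cell membrane"
        return "extracellular (classical)"
    # No feature, but experimental evidence of secretion.
    if secretion_evidence:
        return "extracellular (non-classical)"
    # Assumption: this case is not specified in the description.
    return "unresolved"
```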

The second category of proteins, those without experimental evidence from UniProt annotation or high-throughput proteomic studies, was subsequently screened for localization using computational prediction tools as follows. First, these proteins were screened for a signal peptide, GPI anchor or TM domain, any of which suggests translocation into the endoplasmic reticulum (ER) and sorting via the classical secretion pathway (Figure 1). Next, proteins with a signal peptide, GPI anchor or TM domain that also carry an ER retention signal (determined using PS SCAN with PROSITE pattern PS00014) were excluded from later analysis (Branch B in Figure 1). Finally, proteins predicted to have a GPI anchor or TM domain and with a predicted subcellular localization of cell membrane (using WoLF PSORT 0.2, TargetP 1.1, ProtComp 6 and UniProt annotation with experimental evidence) were classified as cell membrane proteins, and proteins predicted to have neither a GPI anchor nor a TM domain and with a predicted subcellular localization of extracellular were classified as extracellular proteins sorted by the classical secretion pathway (Branch B in Figure 1).
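The Branch B screen can be sketched in the same style. Again this is only an illustration under stated assumptions: the function name and labels are hypothetical, and the final "unclassified" outcome for proteins whose features and predicted localization disagree is an assumption, since the description does not say how such proteins are handled.

```python
def classify_branch_b(has_signal_peptide: bool,
                      has_gpi_anchor: bool,
                      has_tm_domain: bool,
                      has_er_retention_signal: bool,
                      predicted_location: str) -> str:
    """Classify a category-2 (no experimental evidence) protein."""
    membrane_feature = has_gpi_anchor or has_tm_domain
    if not (has_signal_peptide or membrane_feature):
        # No classical-pathway feature: handled by the orthology branch.
        return "defer to branch C"
    if has_er_retention_signal:
        # ER residents are excluded from later analysis.
        return "excluded (ER retention signal)"
    if membrane_feature and predicted_location == "cell membrane":
        return "cell membrane"
    if not membrane_feature and predicted_location == "extracellular":
        return "extracellular (classical)"
    # Assumption: features and predicted localization disagree.
    return "unclassified"
```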

Lastly, the subset of second-category proteins that lack experimental evidence from UniProt annotation or high-throughput proteomic studies and also lack a signal peptide, GPI anchor and TM domain was checked for orthology to known secreted proteins from other fungal species using OrthoMCL (Branch C in Figure 1). Proteins in this subset that are orthologs of experimentally identified secreted proteins in other fungi were then assessed for an ER retention signal and for their predicted subcellular localization (Branch C in Figure 1). Proteins without an ER retention signal and with a predicted subcellular localization of extracellular were classified as extracellular proteins secreted through a non-classical secretion pathway (Branch C in Figure 1). Note that, in this work, we chose a method based on orthology to experimentally verified secreted proteins in all fungi to predict proteins passing via the non-classical pathway.
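A minimal sketch of the Branch C idea follows, assuming ortholog groups are already available in memory as a mapping from a group identifier to its member protein identifiers. The identifiers, the function names and the data layout are hypothetical and are not OrthoMCL's own interface.

```python
from typing import Dict, Set


def has_secreted_ortholog(protein_id: str,
                          ortholog_groups: Dict[str, Set[str]],
                          known_secreted: Set[str]) -> bool:
    """True if the protein shares a group with a known secreted protein."""
    for members in ortholog_groups.values():
        if protein_id in members and (members - {protein_id}) & known_secreted:
            return True
    return False


def classify_branch_c(protein_id: str,
                      ortholog_groups: Dict[str, Set[str]],
                      known_secreted: Set[str],
                      has_er_retention_signal: bool,
                      predicted_location: str) -> str:
    """Classify a protein lacking signal peptide, GPI anchor and TM domain."""
    if not has_secreted_ortholog(protein_id, ortholog_groups, known_secreted):
        return "not predicted as secreted"
    if has_er_retention_signal or predicted_location != "extracellular":
        return "not predicted as secreted"
    return "extracellular (non-classical)"


# Toy data with made-up identifiers.
groups = {"OG1": {"afu|P1", "cal|S1"}, "OG2": {"afu|P2", "cal|X9"}}
secreted_in_other_fungi = {"cal|S1"}
label = classify_branch_c("afu|P1", groups, secreted_in_other_fungi,
                          False, "extracellular")
```

In the toy data, only `afu|P1` shares a group with a known secreted protein, so only it can reach the non-classical label.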

In our computational pipeline for secretome prediction (Figure 1):
(a) Prediction of signal peptides at the N-terminus of protein sequences is based on SignalP 4.1 predictions, Phobius predictions and UniProt annotations with published experimental evidence, if available.
(b) Prediction of GPI anchors in the protein sequences is based on PredGPI predictions, big-PI predictions and UniProt annotations with published experimental evidence, if available.
(c) Prediction of TM domains in the protein sequences is based on TMHMM 2.0 predictions, Phobius predictions and UniProt annotations with published experimental evidence, if available.
(d) ER resident proteins were identified based on ER retention signal predictions by PS SCAN.
(e) Prediction of subcellular localization of proteins is based on WoLF PSORT 0.2 predictions, TargetP 1.1 predictions, ProtComp 6 predictions and UniProt annotations with published experimental evidence, if available.
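To illustrate item (d), an ER retention signal check can be approximated with a regular expression. This sketch assumes that PROSITE pattern PS00014 reads [KRHQSA]-[DENQ]-E-L> (a KDEL-like motif anchored at the C-terminus); that pattern should be verified against the PROSITE entry, and the pipeline itself uses PS SCAN rather than this code.

```python
import re

# Assumed regex form of PROSITE PS00014: [KRHQSA]-[DENQ]-E-L>
# The trailing ">" in PROSITE syntax anchors the match at the C-terminus.
ER_RETENTION = re.compile(r"[KRHQSA][DENQ]EL$")


def has_er_retention_signal(sequence: str) -> bool:
    """Return True if the sequence ends with a KDEL-like motif."""
    return ER_RETENTION.search(sequence.strip().upper()) is not None
```

A sequence ending in KDEL or HDEL matches, whereas the same motif in the middle of a sequence does not, because the pattern is anchored to the end.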

Information from the different prediction tools and from UniProt annotations with published experimental evidence is integrated as follows. To decide on the presence of a signal peptide, GPI anchor or TM domain in a protein sequence, UniProt annotation with published experimental evidence is used whenever it is available and overrides the tool predictions; otherwise, a consensus decision is made based on the tool predictions. To decide on the subcellular localization of a protein, UniProt annotation with published experimental evidence is likewise used whenever it is available and overrides the tool predictions; otherwise, the decision is made by a majority rule over the tool predictions.
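The two integration rules can be sketched as below. Two points are assumptions rather than statements from the text: "consensus" is interpreted as all tools agreeing that the feature is present, and a localization without a strict majority is returned as undecided. The function names are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def feature_present(tool_calls: Sequence[bool],
                    uniprot_experimental: Optional[bool] = None) -> bool:
    """Decide on a signal peptide, GPI anchor or TM domain."""
    # Experimental UniProt annotation overrides all tool predictions.
    if uniprot_experimental is not None:
        return uniprot_experimental
    # Assumption: consensus means every tool predicts the feature.
    return bool(tool_calls) and all(tool_calls)


def localization(tool_calls: Sequence[str],
                 uniprot_experimental: Optional[str] = None) -> Optional[str]:
    """Decide on subcellular localization by majority rule."""
    if uniprot_experimental is not None:
        return uniprot_experimental
    if not tool_calls:
        return None
    location, votes = Counter(tool_calls).most_common(1)[0]
    # Assumption: require a strict majority, otherwise undecided.
    return location if 2 * votes > len(tool_calls) else None
```

For example, with three localization tools, two matching predictions are enough to decide, while three different predictions leave the protein undecided under this sketch.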

Funding:

Research in the group of Areejit Samal at The Institute of Mathematical Sciences (IMSc), Chennai is financially supported by the Department of Science and Technology (DST), Government of India through the award of a start-up grant (YSS/2015/000060) and a Ramanujan fellowship (SB/S2/RJN-006/2014), by the Max Planck Society, Germany through the award of a Max Planck Partner Group, and by intramural funds from the Department of Atomic Energy (DAE), Government of India. The funders have no role in study design, prediction, analysis or decision to publish this work.

Contact:

If you have queries regarding our pipeline or datasets, please contact R. P. Vivek-Ananth.