domain annotation of Trimeric Autotransporter Adhesins

About daTAA

Domain Annotation in Trimeric Autotransporter Adhesins (daTAA) is a server that provides detailed information on the modular structure of this distinct class of adhesins. Because of this modularity, classical annotation approaches often fail: the building blocks of these proteins are often too small to be annotated correctly, or their sequences have diverged so far that they appear unrelated. The daTAA server addresses these issues with a knowledge-based approach and manually curated profiles.

 

Organization of trimeric autotransporter adhesins

Fig. 1.A shows the general head-stalk-anchor organization of TAAs (trimeric autotransporter adhesins), where the head mediates binding, the stalk projects the head above the membrane, and the anchor secretes the other two segments. Bacteria have evolved a battery of domains that constitute the head and the stalk. The structural differences between these two segments (the head is globular, whereas the stalk forms a coiled-coil fiber) require an additional class of domains called connectors. Domains of this class mediate the transition from the head to the stalk and vice versa, since many longer TAAs have interlaced head and stalk regions. Possible domain architectures are shown in Fig. 1.B.

 

 


Fig. 1.A. A general organization of TAAs. B. Possible connections between TAA domains.

 

Browsing the database

The "browse" option in the menu leads to two tables. The first contains all domains we identified in TAAs, color-coded according to their structural function: heads in red, stalks in yellow, connectors in green, the membrane anchor in grey and the autotransporter signal peptide in blue. A brief description is given next to each domain name. The second table contains so-called "beta" domains. These are domains for which we do not yet have enough information to be sure that our definition is correct, because they occur in very few species or proteins.
Each domain page consists of a description, a plot showing the average hydrophobicity and side-chain size derived from the alignment, and a species distribution. The species distribution lists the names of species that have at least one trimeric autotransporter adhesin with the membrane anchor. Where possible, we included a small picture of the domain's quaternary structure.
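As a rough illustration of how a per-column hydrophobicity profile can be derived from an alignment, the Python sketch below averages hydropathy values over each alignment column. It is not the code used by daTAA: the Kyte-Doolittle scale, the function name column_hydrophobicity and the way gaps are skipped are all assumptions made for this example.

```python
# Kyte-Doolittle hydropathy values. The choice of scale is an assumption;
# the scale used for the daTAA plots is not stated.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def column_hydrophobicity(alignment):
    """Return the mean hydropathy of every alignment column.

    `alignment` is a list of equal-length aligned sequences.
    Gaps and unknown residues are ignored; a column without any
    scorable residue yields None.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    averages = []
    for column in zip(*alignment):
        values = [KD[c.upper()] for c in column if c.upper() in KD]
        averages.append(sum(values) / len(values) if values else None)
    return averages
```

A side-chain size profile could be computed the same way by swapping the lookup table for one of residue volumes.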

 

Annotation of submitted sequence

On the "search" page, the user is required to submit a protein sequence in RAW format (FASTA format is also supported, but we do not encourage it because the header is deleted anyway). For the convenience of new users, an example sequence is also provided.
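Header removal can be pictured with the minimal Python sketch below. It only illustrates that RAW and FASTA input reduce to the same residue string. The function name to_raw_sequence is hypothetical, a single sequence per submission is assumed, and the server's actual input handling may differ.

```python
def to_raw_sequence(text):
    """Reduce RAW or single-record FASTA input to a bare residue string.

    Lines starting with '>' (FASTA headers) are dropped, and all
    whitespace inside the sequence is removed.
    """
    lines = (line.strip() for line in text.splitlines())
    body = "".join(line for line in lines if line and not line.startswith(">"))
    # Remove any remaining internal spaces or tabs and normalize case.
    return "".join(body.split()).upper()
```

Both to_raw_sequence(">example header\nMKKV\nLLAG") and to_raw_sequence("mkkv llag") give the same string, "MKKVLLAG".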
The results page is divided into several sections. The first section contains an overview of the domain structure of the annotated protein. This section requires a Java plugin to be installed in the user's web browser. Domains are colored according to their structural function (as in the "browse" view), and occurrences of the same domain are connected with arcs.
The second section contains a detailed annotation of the submitted sequence. The left side is occupied by a three-panel image containing the domain annotation, the coiled-coil prediction (from Marcoil) and a schematic representation of the domain classes (narrow lines represent stalks, wide lines represent heads, and angled lines, which usually join the two, represent connectors). Holding the mouse pointer over a domain in the image displays a sticky tooltip containing the alignment of the query and the hit. To the right of the image is a table containing the sequence ranges and E-values of the domains found. Coiled coils do not have E-values by default, and assignments derived from the knowledge-based rules have the word "Rules" in the E-value field.
The last section contains a small form for submitting the first 100 residues of the submitted sequence directly to the SignalP server of the Center for Biological Sequence Analysis at the Technical University of Denmark. We added this form because many TAA sequences available in the databases are frameshifted (owing either to real genetic events or to misannotation). Confirming the presence of a signal peptide can help to identify these cases.
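The fragment passed on to SignalP is simply the N-terminal part of the query. A minimal Python sketch follows, assuming the sequence is already a plain residue string. The function name signal_peptide_query is hypothetical, and the sketch does not contact the SignalP server.

```python
def signal_peptide_query(sequence, length=100):
    """Return the N-terminal fragment used for signal peptide prediction.

    Sequences shorter than `length` are returned whole.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    return sequence[:length]
```

If this fragment yields no predicted signal peptide, the database entry may be worth checking for a frameshift or a misannotated start.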

 

daTAA uses

The daTAA server was developed using the following tools: HMMER, Marcoil, BioPerl, Processing, ImageMagick and SQLite.