proteins. All peptides are then revised manually, removing those
whose score is equal to the forward-reverse scores. A protein
was considered identified with a minimum of two different
peptides and a score above 20 in the SMPW search
engine (Fig. 3).
3.13 Functional Annotation by Blast2GO Software
The list of proteins identified in a proteomic experiment can be
submitted to Blast2GO (B2G) software [43] to obtain
information, mainly the protein description of the identified
sequences and their functional annotation based on Gene Ontology
(GO) vocabulary. This functional annotation may predict a putative
function for uncharacterized proteins. Basically, B2G uses
Blast searches to find similar sequences for up to several hundred input
sequences at once. Then, the program extracts the GO terms associated
with each of the obtained protein hits and returns an evaluated
annotation for the query sequences. B2G is also a useful
tool for visualizing GO annotations as a reconstructed
structure showing the relationships within Kyoto Encyclopedia of
Genes and Genomes (KEGG) maps.
1. Export the protein summary top hits for the identified proteins to an
Excel file.
2. Create a comma-separated value text file (.csv file) with the
accession numbers (ANs) of the identified proteins in one row.
Edit the file in a text processor to make sure that the ANs are
separated by ','; otherwise, replace ';' with ',' and
substitute any carriage returns with ',' so that all
AN codes are on a single line and separated by commas. Save this
file as plain text.
3. Retrieve the sequences from NCBInr using the Batch ENTREZ tool
(http://www.ncbi.nlm.nih.gov/sites/batchentrez) in FASTA
format using the created file.
4. Add the *.fasta extension to the retrieved file before starting the B2G
procedure.
5. Run Blast2GO via Java Web Start
(http://www.blast2go.com/b2glaunch/startblast2go) using 500 MB of memory
(see tutorial http://www.blast2go.com/b2glaunch).
6. Load each FASTA file (.fasta) by selecting protein sequence
file. The unique sequences in the list will be read by the
software, which adds information about the sequence length.
7. Run the Blast step with default features (e-value cutoff 1 × 10⁻⁵⁰,
number of Blast hits: 20), selecting a Blastp search in the NCBInr
database (see Note 34).
8. Once the Blast step is finished, run the GO-mapping step. The B2G tool
automatically retrieves GO terms for the loaded sequences.
9. In the next step, annotation is performed automatically using
default parameters: e-value Hit-Filter of 1 × 10⁻⁶, an Hsp-Hit
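The clean-up of the accession number list described in step 2 can also be scripted rather than done by hand in a text processor. The following is only a minimal sketch in Python, not part of the original protocol: the function name normalize_accessions and the example accession strings are hypothetical placeholders, and it assumes that the ANs themselves contain no commas, semicolons, or whitespace.

```python
import re


def normalize_accessions(raw: str) -> str:
    """Return the accession numbers in `raw` as one comma-separated line.

    Assumption (not from the protocol): accession numbers never contain
    commas, semicolons, or whitespace, so all of these act as separators.
    """
    # Split on any run of commas, semicolons, spaces, tabs, CR, or LF.
    tokens = re.split(r"[,;\s]+", raw)
    # Drop the empty strings produced by leading or trailing separators.
    return ",".join(token for token in tokens if token)


if __name__ == "__main__":
    # Hypothetical accession numbers mixing ';', ',' and line breaks.
    example = "P12345;Q67890\r\nA1B2C3, P12345\n"
    print(normalize_accessions(example))
```

The resulting single line can be saved as plain text and used as the input file for Batch ENTREZ in step 3. Duplicate ANs are deliberately left in place, since the sketch only changes the separators.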