Summary

Tag

eugene

Owner

VIB GENT

Input

Sequence, extrinsic data (EST, BlastX, TBlastX, repeats), and ab initio predictions from other gene finders

Output

EuGene predictions in GFF3 format; protein, CDS, and cDNA sequences in FASTA format

This analysis runs EuGene gene prediction on BAC sequences, using extrinsic data and ab initio predictions from other gene finders. It produces the predictions as GFF3 output, and writes the protein, CDS, and cDNA sequences of each predicted gene to separate FASTA files.

Input

BlastX

TBlastX

Protein Mapping

EuGene

Output

Filenames

Gene Descriptions

The gene name and description line follow the Guidelines Doc, except that the functional description is not included.
We have also modified the 'Evidence Code' by expanding the single-letter code used in Medicago into a multi-letter tag and adding some extra information. This 'structural tag' is still being discussed within our group, and changes may occur in the future.
For the moment, the tag has the form XXF()H()E()I()L(), e.g. 08F0H1E0IEGL1, with the fields defined as follows:

XX

Two digits to describe the year the tag was assigned - e.g. '08'

F

Whether expressed sequences covering the region from translation start to translation stop (a FL-cDNA or a combination of multiple ESTs) were used to derive the gene model or not - F0 or F1

H

Whether protein-similarity information was used to derive the gene model or not - H0 or H1

E

Whether similarity to expressed sequences was used to derive the CDS of the gene model or not - E0 or E1

I

Two letters describing the program used to generate the gene model, in this case EG for EuGene - IEG

L

A single digit (0-9) describing the length class of the CDS in nucleotides - 0 (0-150), 1 (151-300), 2 (301-600), 3 (601-1200), 4 (1201-1800), 5 (1801-2400), 6 (2401-4500), 7 (4501-15000), 8 (15001-30000), 9 (30001 and above)
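
The field definitions above can be illustrated with a short Python sketch that builds and parses a structural tag. This is not part of the EuGene pipeline: the names `length_class`, `build_tag` and `parse_tag` are hypothetical, and the sketch assumes that 0 means the evidence was not used and 1 means it was. It also assumes that a CDS of exactly 30000 nt falls in class 8 and that class 9 starts at 30001 nt.

```python
import re
from bisect import bisect_left

# Upper bounds (nt) of length classes 0..8; anything longer is class 9.
# Assumption: a CDS of exactly 30000 nt belongs to class 8.
LENGTH_UPPER_BOUNDS = [150, 300, 600, 1200, 1800, 2400, 4500, 15000, 30000]

# XXF()H()E()I()L(): two-digit year, three 0/1 flags,
# two-letter program code, one-digit length class.
TAG_PATTERN = re.compile(
    r"^(?P<year>\d{2})F(?P<full_length>[01])H(?P<protein>[01])"
    r"E(?P<expressed>[01])I(?P<program>[A-Z]{2})L(?P<length_class>\d)$"
)


def length_class(cds_length):
    """Return the single-digit length class (0-9) for a CDS length in nt."""
    if cds_length < 0:
        raise ValueError("CDS length must be non-negative")
    # Index of the first upper bound that is >= cds_length.
    return bisect_left(LENGTH_UPPER_BOUNDS, cds_length)


def build_tag(year, full_length, protein, expressed, cds_length, program="EG"):
    """Assemble a structural tag such as 08F0H1E0IEGL1."""
    return "{:02d}F{:d}H{:d}E{:d}I{}L{:d}".format(
        year % 100,          # keep only the last two digits of the year
        int(full_length),    # F: full-length expressed sequence used
        int(protein),        # H: protein similarity used
        int(expressed),      # E: expressed-sequence similarity used for the CDS
        program,             # I: program code, EG for EuGene
        length_class(cds_length),
    )


def parse_tag(tag):
    """Split a structural tag into its fields; raise ValueError if malformed."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError("not a valid structural tag: %r" % tag)
    return {
        "year": match.group("year"),
        "full_length": match.group("full_length") == "1",
        "protein": match.group("protein") == "1",
        "expressed": match.group("expressed") == "1",
        "program": match.group("program"),
        "length_class": int(match.group("length_class")),
    }


if __name__ == "__main__":
    print(build_tag(2008, False, True, False, 200))
    print(parse_tag("08F0H1E0IEGL1"))
```

Because the tag format is still under discussion, the regular expression would need to be adjusted if fields are added or renamed.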

AnEugene000 (last edited 2008-05-23 17:29:45 by psbpc059)