[ensembl-dev] pseudogenes in ensembl bacteria
Dan Staines
dstaines at ebi.ac.uk
Wed Aug 21 15:57:46 BST 2013
On 08/21/2013 03:42 PM, Adam Witney wrote:
>>
>
> Actually it looks like they are generally within the
> $simple_feature->display_label() field, but in some cases this seems to
> have been truncated? e.g. gene_feature:1633
If you look at the FASTA header generated by this script, it's of this form:
><id> <location string> <description>
id is the gene stable ID for those genes loaded as "proper" Ensembl
genes; this is the same as the locus_tag from the INSDC entry. For the
subset of pseudogenes which are not loaded as full genes, but as
simple_features, we don't have the locus_tags stored anywhere in the
database. In this case, id is gene_feature:<dbid> where <dbid> is the
internal surrogate key (simple_feature.simple_feature_id) used by
Ensembl. As I said, you can modify this format easily enough.
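
If it helps downstream, here is a rough Python sketch of how the two
kinds of id could be told apart when reading the FASTA output back in.
It is not part of the dump script. The function name parse_header and
the FastaHeader type are made up for illustration, and it assumes the
id and the location string contain no whitespace, so that everything
after the second field is the description.

```python
import re
from typing import NamedTuple, Optional

# Surrogate IDs have the form gene_feature:<dbid>, as described above.
SURROGATE_ID = re.compile(r"^gene_feature:(\d+)$")


class FastaHeader(NamedTuple):
    """Hypothetical container for the three header fields."""
    id: str
    location: str
    description: str
    # simple_feature.simple_feature_id, set only for gene_feature:<dbid> IDs
    simple_feature_id: Optional[int]


def parse_header(line: str) -> FastaHeader:
    """Split a '>id location description' header into its fields.

    Assumption: id and location contain no whitespace.
    """
    if not line.startswith(">"):
        raise ValueError("not a FASTA header: %r" % line)
    # Split on whitespace at most twice; the remainder is the description.
    parts = line[1:].rstrip("\n").split(None, 2)
    parts += [""] * (3 - len(parts))  # pad missing trailing fields
    seq_id, location, description = parts
    match = SURROGATE_ID.match(seq_id)
    dbid = int(match.group(1)) if match else None
    return FastaHeader(seq_id, location, description, dbid)
```

A header whose simple_feature_id comes back as None carries a gene
stable ID (the locus_tag); otherwise the integer can be used to look
the row up in the simple_feature table.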
Dan.
--
Dan Staines, PhD
Technical Coordinator, Ensembl Genomes
European Bioinformatics Institute (EMBL-EBI)
http://www.ebi.ac.uk/
http://www.ensemblgenomes.org/