[ensembl-dev] multiple vega/havana entries

Magali mr6 at ebi.ac.uk
Fri Oct 4 09:34:12 BST 2013


Hi Daniel,

These cross-references come from the Vega database, which contains the
Havana annotation.
They are added to our Ensembl database during the merge process.

The duplication comes from the difference in display_ids.
The first xref stores the actual Vega identifier (the OTT ID) as its
display_id, while the second stores the display name used by Havana
(MAPK14-001 in your output).

Each stored xref gets its own internal id, which is why the two entries
have different dbIDs.
These dbIDs are for internal storage only and should not be used directly.

If you want unique entries, you can use an alternative source called
'OTTT' (for OTT transcript).
With the same code as below, replacing 'Vega_transcript' with 'OTTT',
you should be able to get the results you are looking for.
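
As a rough sketch of that substitution: the loop below is the one from
the original message with only the source name changed. The Registry
connection to the public server (host, user, species alias) is an
assumption added here so that the snippet stands on its own; it is not
part of the original code.

```perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

# Assumption: anonymous access to the public Ensembl MySQL server.
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

# Fetch the transcript used as the example in this thread.
my $transcript_adaptor =
    $registry->get_adaptor('Human', 'Core', 'Transcript');
my $transcript =
    $transcript_adaptor->fetch_by_stable_id('ENST00000229795');

# Same call as before, with 'OTTT' in place of 'Vega_transcript'.
foreach my $dbEntry (@{ $transcript->get_all_DBEntries('OTTT') }) {
    printf "dbname: %s  primary_id: %s  display_id: %s\n",
        $dbEntry->dbname(),
        $dbEntry->primary_id(),
        $dbEntry->display_id();
}
```

Whether 'OTTT' is populated depends on the release you connect to, so
it is worth checking the output against the 'Vega_transcript' results
for a few transcripts first.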


Hope that helps,
Magali

On 03/10/13 22:36, Caffrey, Daniel wrote:
> Dear Ensembl developers,
>
> I want to use the API to retrieve Havana IDs for Ensembl transcripts. I have been using the get_all_DBEntries('Vega_transcript') method. However, I get two entries for a single transcript that are essentially identical (they both refer to OTTHUMT00000357449). There are subtle differences (the dbIDs and the display_ids differ; see output below).
>
> My questions: 
> 1) Does anyone know why transcripts are cross-referenced to two havana/vega databases?
> 2) What is the difference between the two databases (dbID: 669630 and dbID: 669633), and is one preferable to the other?
>
> Thanks for your help!
> Daniel 
>
> Relevant code snippets:
>
> API Version 73
>
> my @dbEntries = @{ $transcript->get_all_DBEntries('Vega_transcript') };
> foreach my $dbEntry (@dbEntries) {
>     my $db            = $dbEntry->dbname();          # e.g. Vega_transcript
>     my $id            = $dbEntry->display_id();      # e.g. OTTHUMT00000357450 or MAPK14-001
>     my $primaryId     = $dbEntry->primary_id();      # e.g. OTTHUMT00000357450 or OTTHUMT00000357449
>     my $dbType        = $dbEntry->type();            # e.g. MISC (this is set to MISC for havana)
>     my $dbDisplayName = $dbEntry->db_display_name(); # e.g. Vega transcript
>     my $desc          = $dbEntry->description();
>     my $dbId          = $dbEntry->dbID();            # internal identifier of the stored xref
>     print OUT "get_all_DBEntries dbID: $dbId db_display_name: $dbDisplayName  dbname: $db  primary_id  $primaryId displayId: $id type $dbType\n";
> }
>
>
>
> Output for ENST00000229795 ENSG00000112062
>
> get_all_DBEntries dbID: 669630 db_display_name: Vega transcript  dbname: Vega_transcript  primary_id  OTTHUMT00000357449 displayId: OTTHUMT00000357449 type MISC
> get_all_DBEntries dbID: 669633 db_display_name: Vega transcript  dbname: Vega_transcript  primary_id  OTTHUMT00000357449 displayId: MAPK14-001 type MISC
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/




