[ensembl-dev] VEP identified refseq transcript and associated protein id

Mon Feb 12 20:26:57 GMT 2018

Hi Devs!

I am trying to determine refseq protein ids from VEP identified refseq transcript ids (given by --xref_refseq). 
To do this I thought I would try to query the "translation" and "transcript" tables of the "otherfeatures" database and merge on transcript_id.
While this works well for nearly all ids, I have seen some refseq transcript ids given by VEP that are missing from the "transcript" table, namely: NM_020973.4 and NM_001288705.1.
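
For reference, the join I have in mind looks roughly like the sketch below. It runs against an in-memory SQLite toy database rather than the real MySQL server, and the column names (transcript.stable_id, transcript.version, translation.transcript_id and so on) are my assumptions about the schema, not something I have verified for release 91. The NM_TOY/NR_TOY/NP_TOY rows are invented placeholders; only NM_020973.4 is a real id, and it is absent from the toy table on purpose to mimic what I am seeing.

```python
import sqlite3


def build_toy_db():
    """Create a tiny stand-in for the otherfeatures database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE transcript (
            transcript_id INTEGER PRIMARY KEY,
            stable_id     TEXT,
            version       INTEGER
        );
        CREATE TABLE translation (
            translation_id INTEGER PRIMARY KEY,
            transcript_id  INTEGER,
            stable_id      TEXT,
            version        INTEGER
        );
        -- Placeholder rows, not real RefSeq records.
        INSERT INTO transcript VALUES (1, 'NM_TOY0001', 1), (2, 'NR_TOY0002', 3);
        INSERT INTO translation VALUES (10, 1, 'NP_TOY0001', 1);
        """
    )
    return conn


def protein_ids(conn, refseq_ids):
    """Map versioned transcript ids (e.g. 'NM_TOY0001.1') to a protein id or a status."""
    query = """
        SELECT tl.stable_id, tl.version
        FROM transcript t
        LEFT JOIN translation tl ON tl.transcript_id = t.transcript_id
        WHERE t.stable_id = ? AND t.version = ?
    """
    result = {}
    for rid in refseq_ids:
        stable_id, _, version_text = rid.partition(".")
        # Ids without a numeric version will simply not match any row.
        version = int(version_text) if version_text.isdigit() else None
        row = conn.execute(query, (stable_id, version)).fetchone()
        if row is None:
            result[rid] = "transcript not found"
        elif row[0] is None:
            result[rid] = "no translation"
        else:
            result[rid] = f"{row[0]}.{row[1]}"
    return result


if __name__ == "__main__":
    ids = ["NM_TOY0001.1", "NR_TOY0002.3", "NM_020973.4"]
    for rid, pid in protein_ids(build_toy_db(), ids).items():
        print(rid, "->", pid)
```

The LEFT JOIN is there so that a transcript with no translation (for example a non-coding one) can be told apart from a transcript id that is missing from the "transcript" table altogether, which is the case I am asking about.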

I am wondering:
1. Is this a reasonable strategy?
2. Is there a table that holds all the refseq transcript ids known to a particular VEP cache?

I am using the standard (not "_refseq" or "_merged") homo_sapiens cache version 91 for GRCh37 and the homo_sapiens_otherfeatures_91_37 database.

Thanks!
Joey