[ensembl-dev] VEP workings

Venugopal Valmeekam vvalmeekam at yahoo.com
Wed May 8 14:51:16 BST 2013


Hi,
We are using VEP in our organization to build a comprehensive database of human mutation consequences.  I am using the RefSeq and Ensembl cache files to run VEP.  I would really appreciate it if you could answer the following questions:
1. Does VEP use the reference genome (rather than the mRNA sequence) to derive the amino acid sequence for a particular transcript? I see several RefSeq proteins where the amino acid sequence from the VEP interpretation differs from the RefSeq protein sequence.
2. There are several RefSeq proteins with special amino acids such as selenocysteine ("U"), which is encoded by the stop codon "UGA".  I see that VEP makes accurate calls at these positions.  Is VEP somehow using the protein sequence to make these calls?
3. For some RefSeq proteins, e.g. NM_020469 (NP_065202), the VEP interpretation has premature stop codons.  These seem to correspond to indels in the reference genome.  However, I do not see such instances in the Ensembl collection.  Could you please let me know if VEP uses a different approach for these two collections?
4. What cutoffs does VEP use to classify variants as "upstream" or "downstream"?  For example, is it 2 kb upstream and 2 kb downstream?
5. Since the mitochondrial genetic code assigns several codons differently from the nuclear code, does VEP use the mitochondrial codon table to translate mitochondrial transcripts?
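To make question 5 concrete, here is a minimal Python sketch of the difference I mean. This is not VEP code and says nothing about how VEP works internally; the names STANDARD, VERTEBRATE_MITO and translate are my own, chosen only for illustration. It translates the same RNA sequence with the standard nuclear table and with the vertebrate mitochondrial table (NCBI translation table 2):

```python
from itertools import product

# Standard genetic code in the usual U, C, A, G ordering ("*" marks a stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Vertebrate mitochondrial code: identical except for four codons.
VERTEBRATE_MITO = {**STANDARD, "UGA": "W", "AGA": "*", "AGG": "*", "AUA": "M"}


def translate(rna, table):
    """Translate an RNA string codon by codon, ignoring a trailing partial codon."""
    usable = len(rna) - len(rna) % 3
    return "".join(table[rna[i:i + 3]] for i in range(0, usable, 3))


rna = "AUGUGAAUAAGA"  # AUG UGA AUA AGA
print(translate(rna, STANDARD))         # M*IR
print(translate(rna, VERTEBRATE_MITO))  # MWM*
```

With the nuclear table the second codon is read as a stop, whereas with the mitochondrial table it is tryptophan and the final codon becomes the stop, so the choice of table changes the predicted consequence.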
Thanks for providing such a wonderful resource.
Venu