[ensembl-dev] from hgvs to vcf: Gene set information (genePred, RefFlat)

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Mon Dec 23 16:35:19 GMT 2013


Dear all,

I'm trying to translate the HGVS notation I get when annotating with 
the Ensembl database into VCF format, so that I can assign a VCF 
alternative allele to each Ensembl-annotated consequence. Please consider 
the following example:

chr1    154164465    .    C A,G    0.04    SNP_AF 
AC=1,1;AF=0.125,0.125;AN=8;BaseQRankSum=-0.769;DP=34;Dels=0.0;FS=0.0;HaplotypeScore=0.0;MLEAC=1,1;MLEAF=0.125,0.125;MQ=60.0;MQ0=0;MQRankSum=-0.329;QD=0.0;ReadPosRankSum=0.256;SDP=11;SFREQ=0.111;set=FilteredInAll;CSQ=TPM3|ENSG00000143549|tropomyosin_3|ENST00000515609.1:c.30G>T||2/3|ENSP00000426306.1:p.Gln10His|missense_variant|||||||||||Transcript|ENST00000515609||||3.270|24|deleterious(0.798)|deleterious(0)|possibly_damaging(0.896)|Coiled-coils_(Ncoils):ncoils|||||TCAGCTTGCTCTGCCCGATCCAGAGCATTCTCCTTGTCTAACTTCAGCAT[C/A&G]TGCATCTTTTTCTTGATGGCCTCCATCATGAGCAGTGGCTGTTGGTAGGC 
GT:AD:DP:FREQ:GQ:PL    0/2:8,0,1:9:0,0.111:10:10,34,307,0,273,270 
0/0:4,0,0:4:0,0:12:0,12,150,12,150,150 
0/1:9,1,0:10:0.1,0:11:11,0,287,36,290,326 
0/0:11,0,0:11:0,0:30:0,30,398,30,398,398
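
The HGVS string sits in one of the pipe-separated columns of the CSQ 
entry in the INFO field. A minimal sketch of pulling it out, assuming 
that CSQ annotations are comma-separated, that the HGVS column can be 
recognised by pattern rather than by a fixed index, and that only simple 
coding substitutions are of interest. The names `HGVS_C` and 
`hgvs_from_info` are made up for illustration, and the INFO string is 
shortened from the record above:

```python
import re

# Shortened INFO column from the example record (most keys dropped).
info = (
    "AC=1,1;AF=0.125,0.125;AN=8;"
    "CSQ=TPM3|ENSG00000143549|tropomyosin_3|ENST00000515609.1:c.30G>T||2/3|"
    "ENSP00000426306.1:p.Gln10His|missense_variant"
)

# Assumption: only simple coding substitutions of the form
# <transcript>.<version>:c.<position><ref>><alt> are handled here.
HGVS_C = re.compile(r"(ENST\d+)\.(\d+):c\.(\d+)([ACGT])>([ACGT])")


def hgvs_from_info(info_field):
    """Return (transcript, version, cds_pos, ref, alt) for each HGVS hit in CSQ."""
    # Split INFO into key=value pairs; flag-only keys (no '=') are skipped.
    pairs = dict(kv.split("=", 1) for kv in info_field.split(";") if "=" in kv)
    hits = []
    for annotation in pairs.get("CSQ", "").split(","):
        for column in annotation.split("|"):
            match = HGVS_C.fullmatch(column)
            if match:
                tx, version, pos, ref, alt = match.groups()
                hits.append((tx, int(version), int(pos), ref, alt))
    return hits


print(hgvs_from_info(info))
```

The transcript version is kept as a separate integer because it is the 
piece that has to be matched against the gene set.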

From the HGVS data (*ENST00000515609.1:c.30G>T*) I can use existing 
HGVS code libraries to obtain the corresponding VCF-formatted 
variant (*chr1 154164465 C > A*), and thus select the alternative 
allele (*A*) that the consequence refers to. The problem is that I need 
to retrieve the gene set information, including transcript versions 
(ENST00000515609*.1*), since different versions may yield different 
results when converting transcript coordinates to genomic coordinates. 
The only resource I've found in Ensembl is the gene set in .gtf format 
(ftp://ftp.ensembl.org/pub/release-74/gtf/homo_sapiens/Homo_sapiens.GRCh37.74.gtf.gz), 
but no transcript version information is available in this file. Is 
there any other file containing this information (genePred or RefFlat, 
for example)? Any other hints?
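
To make the allele-selection step concrete, here is a rough sketch of 
matching the HGVS substitution to one of the VCF ALT alleles. It assumes 
the transcript strand is already known (-1 here, inferred only from the 
fact that c.30G>T corresponds to genomic C>A in the example) and it does 
not perform the transcript-to-genome coordinate conversion, which is the 
part that needs the versioned gene set. The function names are 
hypothetical:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_forward_strand(allele, strand):
    """Reverse-complement an allele if the transcript is on the minus strand."""
    if strand == -1:
        return allele.translate(COMPLEMENT)[::-1]
    return allele


def matching_alt_index(vcf_ref, vcf_alts, hgvs_ref, hgvs_alt, strand):
    """Return the 1-based ALT index (as used in GT) matching the HGVS change.

    Returns None when the HGVS alternative is not among the VCF ALT alleles.
    """
    genomic_ref = to_forward_strand(hgvs_ref, strand)
    genomic_alt = to_forward_strand(hgvs_alt, strand)
    if genomic_ref != vcf_ref:
        # A mismatch here suggests a wrong strand or a wrong transcript version.
        raise ValueError(
            f"HGVS reference {genomic_ref} does not match VCF REF {vcf_ref}"
        )
    if genomic_alt not in vcf_alts:
        return None
    return vcf_alts.index(genomic_alt) + 1


# Values taken from the example record: REF=C, ALT=A,G, c.30G>T.
print(matching_alt_index("C", ["A", "G"], "G", "T", strand=-1))
```

The reference check is what would catch a transcript-version mismatch, 
since a different version could place c.30 on a different genomic base.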

Thank you in advance,
-- 
Guillermo Marco Puche
------------------------------------------------------------------------

Guillermo Marco Puche
Bioinformatician, Computer Science Engineer.
Sistemas Genómicos S.L.
Phone: +34 902 364 669
Fax: +34 902 364 670
www.sistemasgenomicos.com

	
