[ensembl-dev] Question regarding MAF frequencies from VEP
Will McLaren
wm2 at ebi.ac.uk
Fri May 29 14:51:49 BST 2015
Hello,
As you've seen, this data can be somewhat confusing.
The GMAF field always reports the frequency of the minor allele, whichever
allele that is; in your example the minor allele is the REF (A:0.0803). The
other frequency fields report the frequency of the ALT (or ALTs if there is
more than one).
Ideally the VEP would report the frequency of the ALT allele that you input
in your VCF, but this raises further problems if the ALT allele you report
does not match either the REF or ALT alleles from the 1000 Genomes VCF. It
is something we are hoping to improve in a future VEP release.
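Until then, one workaround is to derive the ALT frequency from GMAF yourself.
The following is a minimal Python sketch, not part of VEP: the function name
alt_freq_from_gmaf is hypothetical, and it assumes the site is biallelic in
1000 Genomes, so that the REF and ALT frequencies sum to 1.

```python
def alt_freq_from_gmaf(gmaf, ref, alt):
    """Return the ALT allele frequency implied by a GMAF string such as 'A:0.0803'.

    Assumption: the site is biallelic, so freq(REF) + freq(ALT) == 1.
    Returns None if GMAF is empty or names an allele that is neither REF nor ALT.
    """
    if not gmaf:
        return None
    allele, _, value = gmaf.partition(":")
    freq = float(value)
    if allele == alt:
        # The minor allele is the ALT: GMAF is already the ALT frequency.
        return freq
    if allele == ref:
        # The minor allele is the REF: the ALT takes the remainder.
        return round(1.0 - freq, 4)
    # The minor allele matches neither input allele; nothing can be inferred.
    return None


print(alt_freq_from_gmaf("A:0.0803", ref="A", alt="G"))
```

For your variant this gives 0.9197 for G, but only under the biallelic
assumption stated above.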
For your second question, it looks like frequencies have been mistakenly
assigned to the two reported co-located variants (rs3902057 and
RISN_CRB1:c.1410A>G), so the frequencies appear twice. We'll look into a
fix for this.
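In the meantime, the repeated values can be collapsed after annotation. A
minimal Python sketch, assuming the duplicates are exact string copies joined
by '&' as in your output (dedupe_freq_field is a hypothetical helper, not a
VEP function):

```python
def dedupe_freq_field(field):
    """Collapse exact duplicate entries in a frequency field such as 'G:0.7065&G:0.7065'.

    Order is preserved, and distinct entries (for example different ALT
    alleles) are kept untouched.
    """
    seen = []
    for entry in field.split("&"):
        if entry and entry not in seen:
            seen.append(entry)
    return "&".join(seen)


print(dedupe_freq_field("G:0.7065&G:0.7065"))
```

Entries for different alleles, e.g. "A:0.1&G:0.9", pass through unchanged.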
Regards
Will McLaren
Ensembl Variation
On 29 May 2015 at 14:29, Svein Tore Koksrud Seljebotn <
s.t.seljebotn at medisin.uio.no> wrote:
> Hi,
>
> I am trying to figure out some of the output I get from VEP (version 79)
> when annotating vcf files. See end of email for input and command. Please
> note, I am new to this field, so I might misunderstand a few concepts...
>
> For the variant (1 197390368 rs3902057 A G) I get the following
> output:
>
> CSQ=G|upstream_gene_variant|MODIFIER|CRB1|ENSG00000134376|Transcript|ENST00000480086|processed_transcript||||||||||rs3902057&RISN_CRB1:c.1410A>G|1|1573|1|HGNC|2343||||||||A:0.0803|G:0.7065&G:0.7065|G:0.9813&G:0.9813||G:1&G:1|G:0.999&G:0.999|G:1&G:1|G:0.7696&G:0.7696|G:0.9986&G:0.9986|||19339744||||
> {rest of transcripts omitted...}
>
> - This might be a silly question, but why is GMAF given for REF, while the
> subpopulations are given for ALT? In my case I'm interested in the
> frequency for the ALT, not the REF. I assume it's giving the minor allele
> frequency always? But why is there a difference in the allele given for
> GMAF vs e.g. AFR_MAF?
>
> Looking at a later transcript for the same variant, I see the following:
>
>
> G|synonymous_variant|LOW|CRB1|23418|Transcript|NM_001193640.1|protein_coding|4/10||NM_001193640.1:c.1074A>G|NM_001193640.1:c.1074A>G(p.=)|1283|1074|358|L|ctA/ctG|rs3902057&RISN_CRB1:c.1410A>G|1||1|||||NP_001180569.1|rseq_mrna_nonmatch&rseq_cds_mismatch&rseq_ens_match_cds||||A:0.0803|G:0.7065&G:0.7065|G:0.9813&G:0.9813||G:1&G:1|G:0.999&G:0.999|G:1&G:1|G:0.7696&G:0.7696|G:0.9986&G:0.9986|||19339744||||,G|5_prime_UTR_variant|MODIFIER|CRB1|ENSG00000134376|Transcript|ENST00000367397|protein_coding|2/6||ENST00000367397.1:c.-448A>G||411|||||rs3902057&RISN_CRB1:c.1410A>G|1||1|HGNC|2343|||ENSP00000356367|||||A:0.0803|G:0.7065&G:0.7065|G:0.9813&G:0.9813||G:1&G:1|G:0.999&G:0.999|G:1&G:1|G:0.7696&G:0.7696|G:0.9986&G:0.9986|||19339744||||
>
> - Why is the frequency for the subpopulation alleles given twice with the
> same value? Why not always give the frequency for all alleles?
>
>
> Best regards,
> Svein Tore Koksrud Seljebotn
>
>
>
>
> **** Example VCF: *****
>
> ##fileformat=VCFv4.1
> ##INFO=<ID=class,Number=.,Type=String,Description="class">
> ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
> #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT H02
> 1 197390368 rs3902057 A G 7128.77 . AC=2;AF=1.00;AN=2;DB;DP=193;Dels=0.00;FS=0.000;HaplotypeScore=4.6974;MLEAC=2;MLEAF=1.00;MQ=70.00;MQ0=0;QD=29.21 GT:AD:DP:GQ:PL 1/1:0,192:193:99:7157,518,0
>
> ***** Command: *****
> vep --cache --dir_cache=/work/VEP/cache/
> --fasta=/work/human_g1k_v37_decoy.fasta --offline --sift=b --polyphen=b
> --ccds --hgvs --numbers --domains --regulatory --canonical --protein
> --biotype --gmaf --maf_1kg --maf_esp --pubmed --allow_non_variant --fork=4
> --vcf --allele_number --no_escape --failed=1 --no_stats --merged --symbol
> -i testfile.vcf -o testfile.annotated.vcf