[ensembl-dev] VEP vcf annotation
Dietmar Rieder
dietmar.rieder at i-med.ac.at
Fri Jul 2 12:06:07 BST 2021
Thanks Diana,
I completely missed that. Of course it makes sense; sorry for the noise.
Best
Dietmar
On 7/2/21 1:01 PM, Diana Lemos wrote:
> Thanks for sharing the command.
>
> Your commands are not the same: to generate the VCF output you use the
> option --pick, which picks one consequence per variant
> according to the criteria described here:
> https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#pick
>
> This option is not being used in your second command. If you remove
> --pick you should have the same number of consequences in both outputs.
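
For anyone parsing the VCF after dropping --pick: each record then carries several comma-separated CSQ entries, so the SYMBOL has to be collected from all of them rather than from the first one only. Below is a minimal sketch of that, assuming the usual VEP layout where the header line for CSQ lists the pipe-separated sub-field names after "Format:". The function names (csq_fields, symbols_for_record) are hypothetical, and the header and record in the demo are reduced toy values with placeholder consequences and transcript IDs, not real VEP output.

```python
import re


def csq_fields(header_lines):
    """Return the CSQ sub-field names declared in the VCF header."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ"):
            # The sub-field order is listed after "Format: " in the description.
            match = re.search(r'Format: ([^">]+)', line)
            if match:
                return match.group(1).strip().split("|")
    raise ValueError("no CSQ header line found")


def symbols_for_record(info, fields):
    """Collect the distinct SYMBOL values from every CSQ entry of one record."""
    idx = fields.index("SYMBOL")
    for item in info.split(";"):
        if item.startswith("CSQ="):
            symbols = []
            # One comma-separated entry per annotated transcript/feature.
            for entry in item[len("CSQ="):].split(","):
                parts = entry.split("|")
                symbol = parts[idx] if idx < len(parts) else ""
                if symbol and symbol not in symbols:
                    symbols.append(symbol)
            return symbols
    return []  # record has no CSQ annotation


# Toy input (assumption): a shortened CSQ header and one INFO column.
toy_header = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence '
    'annotations from Ensembl VEP. Format: Allele|Consequence|SYMBOL|Feature">',
]
toy_info = "DP=50;CSQ=T|csq_1|APC|tx_1,T|csq_2|AC008575.1|tx_2,T|csq_3|APC|tx_3"

fields = csq_fields(toy_header)
print(symbols_for_record(toy_info, fields))  # prints ['APC', 'AC008575.1']
```

Reading the field order from the header instead of hard-coding a position keeps the sketch independent of which VEP options (e.g. --symbol, --tsl) added columns to the CSQ string.
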
>
>
> Best wishes,
>
> Diana
>
>
> On 02/07/2021 11:45, Dietmar Rieder wrote:
>> Hi,
>>
>> here is the command for the vcf output:
>>
>> vep -i CRC15_CRC15_normal_Somatic.hc.vcf.gz \
>> -o CRC15_CRC15_normal_tumor_vep.vcf \
>> --fork 16 \
>> --stats_file CRC15_CRC15_normal_tumor_vep_summary.html \
>> --species homo_sapiens \
>> --assembly GRCh38 \
>> --offline \
>> --cache \
>> --cache_version 103 \
>> --dir /data/databases/vep_cache \
>> --dir_cache /data/databases/databases/vep_cache \
>> --hgvs \
>> --fasta /data/databases/vep_cache/homo_sapiens/103_GRCh38/Homo_sapiens.GRCh38.dna.toplevel.fa.gz \
>> --pick --plugin Frameshift --plugin Wildtype \
>> --plugin ProteinSeqs,CRC15_CRC15_normal_tumor_reference.fa,CRC15_CRC15_normal_tumor_mutated.fa \
>> --symbol --terms SO --transcript_version --tsl \
>> --vcf 2> vep_errors_1.txt
>>
>>
>>
>> and this is the command for the table output:
>>
>> vep -i CRC15_CRC15_normal_Somatic.hc.vcf.gz \
>> -o CRC15_CRC15_normal_hc_vep.txt \
>> --fork 16 \
>> --stats_file CRC15_CRC15_normal_hc_vep_summary.html \
>> --species homo_sapiens \
>> --assembly GRCh38 \
>> --offline \
>> --dir /data/databases/vep_cache \
>> --cache \
>> --cache_version 103 \
>> --dir_cache /data/databases/vep_cache \
>> --fasta /data/databases/vep_cache/homo_sapiens/103_GRCh38/Homo_sapiens.GRCh38.dna.toplevel.fa.gz \
>> --format "vcf" \
>> --everything \
>> --tab 2> vep_errors.txt
>>
>> Best
>> Dietmar
>>
>> On 7/2/21 12:18 PM, Diana Lemos wrote:
>>> Hi Dietmar,
>>>
>>> I'm unable to reproduce the issue. Could you please send me the VEP
>>> command you are running?
>>>
>>>
>>> Thanks
>>>
>>> Diana
>>>
>>>
>>> On 02/07/2021 10:52, Dietmar Rieder wrote:
>>>> Hi,
>>>>
>>>> we are using VEP (103) to annotate our VCFs and we just stumbled over
>>>> the situation that for the mutation chr5_112838250_C/T
>>>> (chr5:112838250) we get 7 annotated transcript variants in the gene
>>>> with SYMBOL APC and one in the "gene" with SYMBOL AC008575.1 in the
>>>> VEP txt output, which is fine.
>>>>
>>>> BUT
>>>>
>>>> when we use --vcf to get an annotated VCF file, the mutation at
>>>> that position is annotated only with the SYMBOL AC008575.1.
>>>> This is problematic, because the canonical gene here is APC (a known
>>>> driver gene in CRC) and we miss it when parsing the VCF.
>>>>
>>>> Would it be possible to add all gene symbols to the SYMBOL field in
>>>> the CSQ of the vcf?
>>>>
>>>> Thanks
>>>> Dietmar
>>>>
>>>>
>>
>>
--
_________________________________________
D i e t m a r R i e d e r, Mag.Dr.
Head of HPC/Bioinformatics facility
Innsbruck Medical University
Biocenter - Institute of Bioinformatics
Innrain 80, 6020 Innsbruck
Phone: +43 512 9003 71402
Fax: +43 512 9003 73100
Email: dietmar.rieder at i-med.ac.at
Web: http://www.icbi.at