[ensembl-dev] VEP using CADD plugin

Will McLaren wm2 at ebi.ac.uk
Thu Jul 3 14:38:47 BST 2014


Hi Eva,

You should get the data file and its tabix index from this line on the CADD download page:

All possible SNVs of GRCh37/hg19
  file (79G): <http://krishna.gs.washington.edu/download/CADD/v1.0/whole_genome_SNVs.tsv.gz>
  tabix index (2.7M): <http://krishna.gs.washington.edu/download/CADD/v1.0/whole_genome_SNVs.tsv.gz.tbi>
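
As a rough sketch of how the two files might be fetched and handed to the plugin: the URLs are the ones linked above, and the VEP command line mirrors the one quoted further down this thread. The `run` helper, the `RUN` variable and `input.txt` are placeholders of mine rather than anything VEP or CADD provides. Because the data file is 79G, the sketch only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# URLs copied from the download links above.
data_url="http://krishna.gs.washington.edu/download/CADD/v1.0/whole_genome_SNVs.tsv.gz"
index_url="${data_url}.tbi"
data_file="${data_url##*/}"   # strip everything up to the last slash

# Hypothetical helper: dry run by default, execute only when RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Download both files into the current directory. tabix looks for the
# .tbi index next to the .gz file, so keep them together.
run wget -c "$data_url" "$index_url"

# Pass the data file to the CADD plugin (input.txt is a placeholder VCF).
run variant_effect_predictor.pl -i input.txt --format vcf \
    --plugin "CADD,$PWD/$data_file"
```

Run it once as-is to check the printed commands, then again with RUN=1 to actually download and annotate.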

Regards

Will McLaren
Ensembl Variation


On 3 July 2014 14:19, Eva Goncalves Serra <egs at sanger.ac.uk> wrote:

>  Hi,
>
>  I am a bit confused as to which file from
> http://cadd.gs.washington.edu/download I should download to use the
> plugin of CADD scores.
>
>  Any help would be appreciated.
>
>  Thanks!
>
>  Eva
>
>   From: Will McLaren <wm2 at ebi.ac.uk>
> Reply-To: Ensembl developers list <dev at ensembl.org>
> Date: Wednesday, 7 May 2014 16:13
> To: Ensembl developers list <dev at ensembl.org>
> Subject: Re: [ensembl-dev] VEP using CADD plugin
>
>   Hello,
>
>  Correct, the plugin was intended to work with the whole_genome_SNVs.tsv
> file, which only contains data for SNVs.
>
>  I've modified the plugin so that it should be able to cope with indel
> data files such as you have; please do let me know if you have any problems
> as I've only sparingly tested it on made-up data!
>
>  Regards
>
>  Will McLaren
> Ensembl Variation
>
>
> On 7 May 2014 15:37, Genomeo Dev <genomeodev at gmail.com> wrote:
>
>> Hi,
>>
>>  There seems to be a discrepancy between the CADD score calculated using
>> VEP with the CADD.pm plugin and the direct tabix output:
>>
>>  For example using this 1000G variant:
>>
>>  #CHROM POS ID REF ALT QUAL FILTER INFO
>> 7 86214932 rs140931361 TTACTC T . PASS .
>>
>>  variant_effect_predictor.pl -i input.txt --format vcf --plugin
>> CADD,/media/sf_D_DRIVE/Projects/Databases/CADD/v1.0/1000G.tsv.gz
>>  does not return any CADD score
>>
>>  whereas
>>  $ tabix -p vcf 1000G.tsv.gz 7:86214932-86214932
>> 7 86214932 TTACTC T -0.420243 2.040
>>
>>  This seems to affect indels and not SNVs. I could see in the plugin
>> that there is a rule to ignore indels. Any suggestions please how to safely
>> change that?
>>
>>  Also, in the plugin, I assume there is a test to ensure the alleles are
>> identical between the input file and the 1000G.tsv.gz file. Is this correct?
>>
>>  Thanks.
>>
>>  --
>> G.
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>
>>