[ensembl-dev] VEP print summary of cache database

Will McLaren wm2 at ebi.ac.uk
Fri Apr 8 11:22:06 BST 2016


Hi Hans,

This isn't something that has been requested before. It's not a simple task,
as the gzipped files you have observed are serialized Perl objects, and
there are issues such as duplications across coordinate boundaries to
resolve. It's something we can look at including in future versions.
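
To illustrate the duplication issue, here is a minimal sketch in Python. It
does not read real VEP cache files; the region names, the IDs and the
count_unique helper are all hypothetical. It only assumes that a transcript
spanning a region boundary is stored in both region files, so summing the
per-file counts overcounts unless you deduplicate by stable ID.

```python
# Hypothetical illustration only: this does not parse real VEP cache files.
# Each "region file" is modelled as a list of (gene_id, transcript_id) pairs.

def count_unique(region_files):
    """Return (n_genes, n_transcripts) after removing cross-region duplicates."""
    genes = set()
    transcripts = set()
    for records in region_files.values():
        for gene_id, transcript_id in records:
            genes.add(gene_id)
            transcripts.add(transcript_id)
    return len(genes), len(transcripts)


# Toy data: transcript T2 spans the boundary between two adjacent regions,
# so it appears in both region files.
region_files = {
    "1-1000000": [("G1", "T1"), ("G2", "T2")],
    "1000001-2000000": [("G2", "T2"), ("G3", "T3")],
}

# Naive total simply sums the per-file record counts.
naive_total = sum(len(records) for records in region_files.values())
n_genes, n_transcripts = count_unique(region_files)
print(naive_total, n_genes, n_transcripts)
```

The naive total counts T2 twice, while the set-based count does not.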

You should be able to get a good approximation by looking at the counts on
the species summary page:
http://plants.ensembl.org/Triticum_aestivum/Info/Annotation/#assembly (or
look at the archive version of this page as appropriate)

I wouldn't guarantee these would represent the exact same counts as those
in VEP, as there may be slightly different inclusion criteria.

Hope that helps

Will McLaren
Ensembl Variation

On 8 April 2016 at 11:12, Hans Vasquez-Gross <havasquezgross at ucdavis.edu>
wrote:

> I'm using VEP to annotate some VCF files using an offline cache database.
> The summary file lets me know the number of overlapped genes/transcripts.
> However, it doesn't say how many genes/transcripts there are in total in
> the database, which would be useful for some calculations.
>
> To annotate, I use the following command to run VEP:
> ./variant_effect_predictor.pl -species triticum_aestivum -i input.vcf -o
> output.vep.vcf --fork 4 --offline --db_version 22
>
> I've been to the cache directory: ~/.vep/triticum_aestivum/22, and tried
> looking at the storage structure. I saw these are gzipped files within
> directories for each contig.
>
> Is there an easy way to get a list of all transcripts/genes in this
> database? Thank you.
>
> Cheers
> -Hans
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>
>