[ensembl-dev] RegulatoryFeatures in VEP v79?

Konrad Karczewski konradk at broadinstitute.org
Fri May 22 20:55:41 BST 2015


Hi all,

I just figured it out: I was using a RankFilter. Strangely, I had been using the same one in v77 and it wasn't removing them, so perhaps the severity ordering changed or wasn't working before? Anyway, that makes sense. Thanks!
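
For anyone who hits the same thing, here is a rough sketch of the effect. It is not the RankFilter code: the rank numbers, the rank_filter helper, the cutoff, and the transcript ID are all made up for illustration. The only assumption carried over from VEP is that regulatory_region_variant (IMPACT=MODIFIER) sits near the low-severity end of the consequence ordering, so a severity cutoff can silently drop every RegulatoryFeature line.

```python
# Illustrative severity ranks (lower number = more severe).
# These values are assumptions for this sketch, not taken from the Ensembl API.
ASSUMED_RANK = {
    "missense_variant": 12,
    "non_coding_transcript_exon_variant": 20,
    "regulatory_region_variant": 36,
}


def rank_filter(annotations, max_rank):
    """Keep only annotations whose consequence rank is at or below max_rank."""
    return [a for a in annotations if ASSUMED_RANK[a["consequence"]] <= max_rank]


# Two annotations for the same variant; the transcript ID is hypothetical.
annotations = [
    {
        "feature": "ENST_hypothetical",
        "feature_type": "Transcript",
        "consequence": "non_coding_transcript_exon_variant",
    },
    {
        "feature": "ENSR00001576075",
        "feature_type": "RegulatoryFeature",
        "consequence": "regulatory_region_variant",
    },
]

# With a cutoff below the regulatory rank, the RegulatoryFeature line is removed.
kept = rank_filter(annotations, max_rank=30)
print([a["feature_type"] for a in kept])
```

If the ordering or the cutoff semantics differ between releases, the same filter setting could keep these lines in one version and remove all of them in another.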

On Fri, May 22, 2015 at 8:06 AM, Anja Thormann <anja at ebi.ac.uk> wrote:

> Hi Konrad,
> I ran the following command using ensembl-tools (release/79) and human cache files homo_sapiens_vep_79_GRCh37.tar.gz.
> perl variant_effect_predictor.pl --cache --offline -i test_vep_input.txt -o test_vep_output.txt -force_overwrite --no_stats --assembly GRCh37 --regulatory
> where input:
> 1 13372 13372 G/C + testvariant
> and output:
> testvariant 1:13372 C - ENSR00001576075 RegulatoryFeature regulatory_region_variant - - - - - - IMPACT=MODIFIER;BIOTYPE=CTCF_binding_site
> ----------------
> Running without --offline and with --everything also returns an overlap with a regulatory feature:
> testvariant 1:13372 C - ENSR00001576075 RegulatoryFeature regulatory_region_variant - - - - - - IMPACT=MODIFIER;BIOTYPE=CTCF_binding_site
> Could you please try and refresh all your ensembl APIs and try running the analysis again? At the moment I cannot reproduce the problem.
> Thank you,
> Anja
> On 21 May 2015, at 15:17, Konrad Karczewski wrote:
>> Yeah, I was surprised as well. Removing the histone-mark regions makes sense and could account for most of the difference, but the count went from 1916763 to 0(!). Even in a biased set, I would have expected a few by chance.
>> 
>> If you'd like more examples, it's the ExAC dataset (ftp://ftp.broadinstitute.org/pub/ExAC_release/release0.3/ExAC.r0.3.sites.vep.vcf.gz) - this one was annotated with v77 so the ones with RegulatoryFeatures (albeit 2M of them) may give you a starting point.
>> 
>> Thanks for looking into it!
>> 
>> 
>> 
>> On Thu, May 21, 2015 at 10:02 AM, Daniel Zerbino <zerbino at ebi.ac.uk> wrote:
>> 
>> Wait a minute, *none* of them do? That's something else entirely...
>> 
>> Now, ExAC is a somewhat biased set in that it is pulled down from exon sequencing, and one difference between the old and the new build is that regions with the histone marks associated with transcription are no longer annotated as "regulatory". This is what you see at your locus 1:13372 on e75, where there used to be mostly "gene associated" annotations across cell types.
>> 
>> However, 0 hits across 10M sounds very suspicious. For example, the locus you describe happens to be on the edge of a CTCF feature (the actual binding site is a bit farther toward the 5' end), so it should technically have been reported.
>> 
>> We'll investigate...
>> 
>> On 5/21/15 1:56 PM, Konrad Karczewski wrote:
>>> Hi Daniel,
>>> 
>>> Ah ok great thanks! Thing is though: now none of the 10M variants in ExAC overlap RegulatoryFeatures. Is that expected? I would have expected at least a few...
>>> 
>>> -Konrad
>>> 
>>> 
>>> On Thu, May 21, 2015 at 2:17 AM, Daniel Zerbino <zerbino at ebi.ac.uk> wrote:
>>> 
>>> Hello Konrad,
>>> 
>>> This is because in release 79 we replaced the old regulatory build with the newer version (which we had released for GRCh38 in v76).
>>> 
>>> Features will definitely have moved around, as the two builds are computed very differently.
>>> 
>>> Regards,
>>> 
>>> Daniel
>>> 
>>> On 5/21/15 5:28 AM, Konrad Karczewski wrote:
>>>> Hi Will, everyone,
>>>> 
>>>> Are RegulatoryFeature annotations expected to have the same results in VEP v79 (GRCh37) as previous versions (e.g. v77)? When annotating the ExAC VCF, the older versions included many ENSR* annotations, but v79's do not (same command including  both times). For instance, the following variant used to overlap ENSR00000528767 but does not seem to in my most recent version:
>>>> 
>>>> 1       13372   .       G       C
>>>> 
>>>> Any idea why this might be happening? All other annotations seem fine.
>>>> 
>>>> Thanks!
>>>> -Konrad
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Dev mailing list    Dev at ensembl.org
>>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
>>> 
>>> 
>> 
>> 