[ensembl-dev] RegulatoryFeatures in VEP v79?
konradk at broadinstitute.org
Thu May 21 15:17:07 BST 2015
Yeah, I was surprised as well. Removing the histone-mark regions makes sense and could account for most of the difference, but the count went from 1916763 to 0(!). Even in a biased set, I would have expected a few by chance.
If you'd like more examples, it's the ExAC dataset (ftp://ftp.broadinstitute.org/pub/ExAC_release/release0.3/ExAC.r0.3.sites.vep.vcf.gz) - this one was annotated with v77 so the ones with RegulatoryFeatures (albeit 2M of them) may give you a starting point.
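In case it helps with reproducing the counts, here is a rough sketch of how one might tally variants carrying a RegulatoryFeature annotation in a VEP-annotated VCF. It assumes the annotations sit in a CSQ INFO key and that the header's "Format:" string lists a Feature_type sub-field. The function names and the two demo records are made up for illustration (only 1:13372 / ENSR00000528767 comes from this thread), so treat it as a starting point rather than the way the numbers above were produced.

```python
import gzip


def csq_fields(header_line):
    """Return the CSQ sub-field names listed after 'Format: ' in the header."""
    fmt = header_line.split("Format: ", 1)[1]
    return fmt.rstrip().rstrip('">').split("|")


def count_regulatory(lines):
    """Return (total_variants, variants_with_a_RegulatoryFeature_entry)."""
    fields = None
    total = with_reg = 0
    for line in lines:
        if line.startswith("##INFO=<ID=CSQ"):
            fields = csq_fields(line)
            continue
        if line.startswith("#") or not line.strip():
            continue
        if fields is None:
            raise ValueError("no CSQ header line seen before the first record")
        total += 1
        info = line.rstrip("\n").split("\t")[7]
        # Pull out the CSQ value; records without one count as unannotated.
        csq = next((kv[4:] for kv in info.split(";") if kv.startswith("CSQ=")), "")
        idx = fields.index("Feature_type")
        for entry in csq.split(","):
            parts = entry.split("|")
            if len(parts) > idx and parts[idx] == "RegulatoryFeature":
                with_reg += 1
                break
    return total, with_reg


def count_regulatory_in_file(path):
    """Same count for a gzipped VCF on disk (path is supplied by the caller)."""
    with gzip.open(path, "rt") as handle:
        return count_regulatory(handle)


# Synthetic demo input: the header is trimmed to five sub-fields and the
# gene/transcript names and the second record are invented placeholders.
DEMO = [
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence type as '
    'predicted by VEP. Format: Allele|Gene|Feature|Feature_type|Consequence">\n',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t13372\t.\tG\tC\t.\t.\tAC=1;CSQ=C|GENE_X|TRANSCRIPT_X|Transcript|"
    "non_coding_transcript_exon_variant,C||ENSR00000528767|RegulatoryFeature|"
    "regulatory_region_variant\n",
    "1\t13380\t.\tC\tG\t.\t.\tCSQ=G|GENE_X|TRANSCRIPT_X|Transcript|intron_variant\n",
]

if __name__ == "__main__":
    print(count_regulatory(DEMO))
```

Running count_regulatory_in_file() on the v77-annotated file and on a v79 re-annotation would give the two totals to compare. Matching on Feature_type rather than on an "ENSR" prefix is a choice made for this sketch, since it does not depend on the identifier scheme.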
Thanks for looking into it!
On Thu, May 21, 2015 at 10:02 AM, Daniel Zerbino <zerbino at ebi.ac.uk> wrote:
> Wait a minute, *none* of them do? That's something else entirely...
> Now, ExAC is a somewhat biased set in that it is pulled down from exon
> sequencing, and a difference between the old and the new build is that
> regions with the histone marks associated to transcription are no longer
> annotated as "regulatory". This is what you see at your locus 1:13372,
> on e75, where there used to be mostly "gene associated" annotations
> across cell types.
> However, 0 hits across 10M sounds very suspicious. For example, the
> locus you describe happens to be on the edge of a CTCF feature (the
> actual binding site is a bit farther on the 5'), so it should
> technically have been reported.
> We'll investigate...
> On 5/21/15 1:56 PM, Konrad Karczewski wrote:
>> Hi Daniel,
>> Ah ok great thanks! Thing is though: now none of the 10M variants in
>> ExAC overlap RegulatoryFeatures. Is that expected? I would have
>> expected at least a few...
>> On Thu, May 21, 2015 at 2:17 AM, Daniel Zerbino <zerbino at ebi.ac.uk> wrote:
>> Hello Konrad,
>> this is because on release 79 we replaced the old regulatory build
>> with the newer version (which we had released for GRCh38 in v76).
>> There would definitely be some moving around of features as both
>> builds are very different in the way they are computed.
>> On 5/21/15 5:28 AM, Konrad Karczewski wrote:
>>> Hi Will, everyone,
>>> Are RegulatoryFeature annotations expected to have the same
>>> results in VEP v79 (GRCh37) as previous versions (e.g. v77)? When
>>> annotating the ExAC VCF, the older versions included many ENSR*
>>> annotations, but v79's do not (same command including
>>> --everything both times). For instance, the following variant
>>> used to overlap ENSR00000528767 but does not seem to in my most
>>> recent version:
>>> 1 13372 . G C
>>> Any idea why this might be happening? All other annotations seem
>>> Dev mailing list Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/