[ensembl-dev] Phase 3 data from 1000 genomes
wm2 at ebi.ac.uk
Thu Oct 23 16:45:13 BST 2014
Yes; we hope to have the data available in an upcoming Ensembl release. The
latest dbSNP release (142) includes the data, though there is some latency
before it appears downstream in the Ensembl Variation database, as dbSNP's
release dates are not always compatible with our production schedule.
There is also the issue that 1000 genomes issued their data on GRCh37,
while our current assembly for human is GRCh38. We provide a mirror service
on GRCh37, but the data powering that is currently frozen at an older
Ensembl release.
The 1000 genomes browser is Ensembl-powered, and the most recent update
includes the phase 3 data.
You could also use the VEP and its custom annotation feature to incorporate
allele frequency data from the 1000 genomes VCF files, e.g.
perl variant_effect_predictor.pl -i example_GRCh37.vcf -offline -assembly GRCh37 -force -custom
to retrieve the allele frequency (AF) info field for overlapping variants.
Note this example uses tabix's remote functionality to query the file on a
remote server; please spare network resources and download the file to your
local file system if you intend to use it for more than small tests.
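
If you do work from a local copy of a VCF, you can also pull the AF values
out of the INFO column directly. The following is only a rough sketch in
Python, not part of the VEP: the function names (parse_info,
allele_frequencies) are hypothetical, the example records are made up rather
than taken from the 1000 genomes files, and it assumes tab-separated VCF data
lines where AF carries one comma-separated value per ALT allele.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'AC=2;AF=0.01;DB' into a dict.

    Flags without a value (e.g. 'DB') are stored as True.
    """
    info = {}
    if info_field in ("", "."):
        return info
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        info[key] = value if sep else True
    return info


def allele_frequencies(vcf_line):
    """Return (chrom, pos, ref, {alt: AF}) for one VCF data line.

    Assumes AF lists one value per ALT allele, in the same order.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    info = parse_info(fields[7])
    raw = info.get("AF")
    if raw is None or raw is True:
        # No AF key (or AF present without a value): nothing to report.
        return chrom, int(pos), ref, {}
    freqs = {
        allele: float(value)
        for allele, value in zip(alt.split(","), raw.split(","))
        if value != "."  # "." marks a missing value
    }
    return chrom, int(pos), ref, freqs


# Made-up records for illustration only; not real 1000 genomes data.
EXAMPLE_LINES = [
    "1\t1000\trsX\tA\tG\t100\tPASS\tAC=5;AF=0.25;DB",
    "1\t2000\t.\tC\tT,G\t100\tPASS\tAF=0.5,0.125",
    "1\t3000\t.\tG\tA\t100\tPASS\t.",
]

for line in EXAMPLE_LINES:
    print(allele_frequencies(line))
```

Header lines (those starting with "#") would need to be skipped before
calling allele_frequencies on a real file.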
On 23 October 2014 16:19, Genomeo Dev <genomeodev at gmail.com> wrote:
> Hi Will,
> Is there an intent to incorporate phase 3 data from 1000 genomes
> in the near future?