[ensembl-dev] would ENSEMBL kindly host merged and filtered vcf files for gnomad2.1
seh at ebi.ac.uk
Tue May 14 14:32:41 BST 2019
Thanks for your kind words about Ensembl.
Ensembl is not an archive - the data files on our FTP site are used in
our services or created from our databases for each release - so we
would not be best placed to host these files. Have you considered
contacting the gnomAD team about hosting your slimmed down files?
On 13/05/2019 03:44, Sergey Naumenko wrote:
> Dear Ensembl developers!
> Thank you for all your great work!
> gnomAD 2.1 is a major update of the gnomAD database of variation in the
> human population
> (whole exome and whole genome sequencing).
> We are using the Ensembl-hosted gnomAD vcf files in cloudbiolinux and bcbio.
> chapmanb/cloudbiolinux <https://github.com/chapmanb/cloudbiolinux>
> bcbio/bcbio-nextgen <https://github.com/bcbio/bcbio-nextgen>
> Unlike the gnomad2.0.1 files, the gnomad2.1 files are split by
> chromosome:
> Index of
> To use gnomad2.1 in the annotation step of bcbio (we annotate with
> vcfanno), we decided to merge the files
> and remove a number of INFO fields to reduce the file size, see the
> discussion here:
> Using gnomad2.1: request for opinions · Issue #2736 ·
> bcbio/bcbio-nextgen <https://github.com/bcbio/bcbio-nextgen/issues/2736>
> We created recipes in cloudbiolinux to merge gnomad2.1 vcfs for
> grch37, grch38, and hg19.
> However, the long running time makes merging the gnomad vcf files in every
> local installation infeasible.
> We decided to generate the merged files once, and then provide users with
> an easy-to-install recipe.
> Would you kindly agree to host merged vcfs for gnomad exome and genome
> for grch37 and grch38 on the Ensembl FTP server?
> We would be happy to produce the files and upload them.
> The technical steps on how we merge the vcfs are listed in the recipe:
> we sort the variants, keep only PASS variants, keep a pre-defined
> subset of INFO fields, etc.
> We hope that many Ensembl users would benefit
> from the merged and relatively slim gnomad2.1 vcf files,
> and we are happy to share our work with Ensembl.
> Sergey Naumenko