[ensembl-dev] question variation APi
Nathalie Conte
nconte at ebi.ac.uk
Tue Oct 7 15:45:39 BST 2014
Hi Will,
Thanks for your answer.
I would like programmatic access to this data, because it will be saved in a data warehouse alongside data pulled from other databases, and the warehouse will be updated automatically in the future. Retrieving the data manually over FTP may not be sustainable for these reasons. I have tried the FTP dump, but as you predicted I ran into memory issues, so this is clearly not optimal. Would a direct SQL query against the Ensembl database work?
Thanks for any advice
Nathalie
On 7 Oct 2014, at 13:52, Will McLaren <wm2 at ebi.ac.uk> wrote:
> Hi Nathalie,
>
> We wouldn't recommend using the API to retrieve all of the rsIDs; there are >60 million and the API is not optimised for retrieving the whole dataset in this way.
>
> Instead I'd recommend you extract the IDs from one of our dump files; probably VCF or GVF would be the easiest to work with:
>
> curl ftp://ftp.ensembl.org/pub/release-77/variation/vcf/homo_sapiens/Homo_sapiens.vcf.gz | zcat | grep -v '^#' | cut -f 3 | head
>
> (remove the head and redirect to a file to get all of them).
>
> The somatic mutations are in a separate file, ftp://ftp.ensembl.org/pub/release-77/variation/vcf/homo_sapiens/Homo_sapiens_somatic.vcf.gz
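
For the automated, low-memory case, the same extraction could be wrapped in a small shell function that streams the file and never holds it in memory. This is only a minimal sketch, assuming the ID sits in column 3 of each record as in the command above; the function name extract_ids and the output file names are made up for illustration, and records whose ID is "." (no ID) are skipped.

```shell
# Read an uncompressed VCF stream on stdin and print the ID column (column 3).
# Header lines (starting with '#') and records with no ID (".") are skipped.
# extract_ids is a hypothetical helper name, not part of any Ensembl tool.
extract_ids() {
    awk -F'\t' '!/^#/ && $3 != "." { print $3 }'
}

# Possible usage with the two dump files mentioned above
# (output file names are arbitrary):
#   curl -s ftp://ftp.ensembl.org/pub/release-77/variation/vcf/homo_sapiens/Homo_sapiens.vcf.gz \
#       | gunzip -c | extract_ids > germline_ids.txt
#   curl -s ftp://ftp.ensembl.org/pub/release-77/variation/vcf/homo_sapiens/Homo_sapiens_somatic.vcf.gz \
#       | gunzip -c | extract_ids > somatic_ids.txt
```

Because every stage is a pipe, memory use should stay roughly constant regardless of file size, and the commands can be re-run from a scheduled job when a new release appears.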
>
> To answer your question, to fetch somatic mutations use fetch_all_somatic() (see http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1DBSQL_1_1VariationAdaptor.html#a22e69dacdd77542463320a1ef16b151f)
>
> Regards
>
> Will McLaren
> Ensembl Variation
>
> On 7 October 2014 10:38, Nathalie Conte <nconte at ebi.ac.uk> wrote:
> Hi,
> I would like to get all variation IDs (e.g. rs1822893) from Ensembl, and I am using the variation API to do so.
> Is this the best way? The fetch_all method seems to get all germline variations; is there another method for somatic ones?
>
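> use strict;
> use warnings;
> use Bio::EnsEMBL::Registry;
>
> # Connect to the public Ensembl database server
> Bio::EnsEMBL::Registry->load_registry_from_db(-host => 'ensembldb.ensembl.org', -user => 'anonymous');
>
> my $vf_adaptor = Bio::EnsEMBL::Registry->get_adaptor('human', 'variation', 'variationfeature');
> # fetch_all() loads every variation feature into memory at once
> my @vfs = @{ $vf_adaptor->fetch_all() };
> foreach my $vf (@vfs) {
>     next unless $vf;
>     # Fall back to a placeholder when the feature has no variation name
>     my $varID = defined($vf->variation_name) ? $vf->variation_name : 'No_variation';
>     print "$varID\n" if $varID;
> }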
>
>
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/