[ensembl-dev] question variation APi
Nathalie Conte
nconte at ebi.ac.uk
Tue Oct 7 15:55:24 BST 2014
Correction of a typo in my previous message. Where I wrote:
> I have tried using FTP but as you predicted
I meant:
> I have tried using API but as you predicted
On 7 Oct 2014, at 15:45, Nathalie Conte <nconte at ebi.ac.uk> wrote:
> Hi Will,
> Thanks for your answer.
> I would like to have programmatic access to this data, as it will be saved in a data warehouse containing data pulled from other databases, which will be updated automatically in the future. Manually retrieving the data over FTP may not be sustainable for these reasons. I have tried using FTP but as you predicted, I have issues with memory and this is clearly not optimal. Would a direct SQL query to the Ensembl database work?
> Thanks for any advice
> Nathalie
>
>
> On 7 Oct 2014, at 13:52, Will McLaren <wm2 at ebi.ac.uk> wrote:
>
>> Hi Nathalie,
>>
>> We wouldn't recommend using the API to retrieve all of the rsIDs; there are more than 60 million, and the API is not optimised for retrieving the whole dataset in this way.
>>
>> Instead I'd recommend you extract the IDs from one of our dump files; probably VCF or GVF would be the easiest to work with:
>>
>> curl ftp://ftp.ensembl.org/pub/release-77/variation/vcf/homo_sapiens/Homo_sapiens.vcf.gz | zcat | grep -v '^#' | cut -f 3 | head
>>
>> (remove the head and redirect to a file to get all of them).
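
A minimal sketch of the same extraction as a reusable shell function, assuming the dump is tab-delimited VCF text in which the third column holds the ID, several IDs may be joined with ';', and '.' marks a missing ID. The function name extract_variant_ids and the output filename variant_ids.txt are hypothetical, not part of any Ensembl tool.

```shell
#!/bin/sh
# Read uncompressed VCF text on stdin and print one variant ID per line.
# Assumptions: fields are tab-separated, the ID is field 3, multiple IDs
# are joined with ';', and '.' means "no ID".
extract_variant_ids() {
    awk -F '\t' '
        /^#/      { next }   # skip meta-information and header lines
        $3 == "." { next }   # skip records without an ID
        {
            n = split($3, ids, ";")
            for (i = 1; i <= n; i++) print ids[i]
        }
    '
}

# Example use (URL from the message above; output filename is hypothetical):
#   curl -s ftp://ftp.ensembl.org/pub/release-77/variation/vcf/homo_sapiens/Homo_sapiens.vcf.gz \
#     | gunzip -c | extract_variant_ids > variant_ids.txt
```

Because the data is streamed through the pipe line by line, the full set of IDs is never held in memory, which should avoid the memory problems seen with the API approach.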
>>
>> The somatic mutations are in a separate file, ftp://ftp.ensembl.org/pub/release-77/variation/vcf/homo_sapiens/Homo_sapiens_somatic.vcf.gz
>>
>> To answer your question, to fetch somatic mutations use fetch_all_somatic() (see http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1DBSQL_1_1VariationAdaptor.html#a22e69dacdd77542463320a1ef16b151f)
>>
>> Regards
>>
>> Will McLaren
>> Ensembl Variation
>>
>> On 7 October 2014 10:38, Nathalie Conte <nconte at ebi.ac.uk> wrote:
>> Hi,
>> I would like to get all variation IDs (e.g. rs1822893) from Ensembl, and I am using the variation API to do so.
>> Is this the best way? The fetch_all method seems to get all germline variations; is there another method for somatic ones?
>>
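>> use strict;
>> use warnings;
>> use Bio::EnsEMBL::Registry;
>>
>> # Assumes the public Ensembl MySQL server; adjust host/user for a local mirror
>> Bio::EnsEMBL::Registry->load_registry_from_db(-host => 'ensembldb.ensembl.org', -user => 'anonymous');
>>
>> my $vf_adaptor = Bio::EnsEMBL::Registry->get_adaptor('human', 'variation', 'variationfeature');
>> my @vfs = @{ $vf_adaptor->fetch_all() };
>> # Print the name of every variation feature returned by fetch_all()
>> foreach my $vf (@vfs) {
>>     if ($vf) {
>>         my $varID = defined($vf->variation_name) ? $vf->variation_name : 'No_variation';
>>         if ($varID) {
>>             print "$varID\n";
>>         }
>>     }
>> }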
>>
>>
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/