[ensembl-dev] question variation API

Nathalie Conte nconte at ebi.ac.uk
Tue Oct 7 15:45:39 BST 2014


Hi Will,
Thanks for your answer.
I would like programmatic access to this data because it will be stored in a data warehouse alongside data pulled from other databases, and it will need to be updated automatically in the future. Retrieving the data manually over FTP may not be sustainable for these reasons. I have tried using FTP but, as you predicted, I have issues with memory, and this is clearly not optimal. Would a direct SQL query against the Ensembl database work?
Thanks for any advice,
Nathalie
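
For reference, the dump-file route can be scripted so that memory use stays flat. The following is only a minimal sketch, assuming the VCF dump has already been downloaded to a local .vcf.gz file; extract_ids is a hypothetical helper name, not part of any Ensembl tool. Each stage of the pipe handles one line at a time, so the whole file is never held in memory.

```shell
#!/usr/bin/env bash
# Sketch only: stream variant IDs out of a gzipped VCF file.
# extract_ids is a hypothetical helper, not an Ensembl-provided command.

extract_ids() {
    # $1: path to a local .vcf.gz file.
    # gzip -dc decompresses to stdout; awk skips header lines (those
    # starting with '#') and prints column 3, the VCF ID column,
    # unless it holds the VCF missing-value marker '.'.
    gzip -dc "$1" | awk -F'\t' '!/^#/ && $3 != "." { print $3 }'
}

# When called with a file argument, print that file's IDs to stdout.
if [ "$#" -ge 1 ]; then
    extract_ids "$1"
fi
```

Assuming the script is saved under a name such as extract_ids.sh (also hypothetical), running "bash extract_ids.sh Homo_sapiens.vcf.gz > germline_ids.txt" would write the germline IDs to a file, and the same call on Homo_sapiens_somatic.vcf.gz would do so for the somatic ones.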


On 7 Oct 2014, at 13:52, Will McLaren <wm2 at ebi.ac.uk> wrote:

> Hi Nathalie,
> 
> We wouldn't recommend using the API to retrieve all of the rsIDs; there are more than 60 million, and the API is not optimised for retrieving the whole dataset in this way.
> 
> Instead I'd recommend you extract the IDs from one of our dump files; probably VCF or GVF would be the easiest to work with:
> 
> curl ftp://ftp.ensembl.org/pub/release-77/variation/vcf/homo_sapiens/Homo_sapiens.vcf.gz | zcat | grep -v '^#' | cut -f 3 | head
> 
> (remove the head and redirect to a file to get all of them).
> 
> The somatic mutations are in a separate file, ftp://ftp.ensembl.org/pub/release-77/variation/vcf/homo_sapiens/Homo_sapiens_somatic.vcf.gz
> 
> To answer your question, to fetch somatic mutations use fetch_all_somatic() (see http://www.ensembl.org/info/docs/Doxygen/variation-api/classBio_1_1EnsEMBL_1_1Variation_1_1DBSQL_1_1VariationAdaptor.html#a22e69dacdd77542463320a1ef16b151f)
> 
> Regards
> 
> Will McLaren
> Ensembl Variation
> 
> On 7 October 2014 10:38, Nathalie Conte <nconte at ebi.ac.uk> wrote:
> Hi,
> I would like to get all variation IDs (e.g. rs1822893) from Ensembl, and I am using the variation API to do so.
> Is this the best way? The fetch_all method seems to return all germline variations; is there another method for somatic ones?
> 
> my $vf_adaptor = Bio::EnsEMBL::Registry->get_adaptor('human', 'variation', 'variationfeature');
> my @vfs = @{$vf_adaptor->fetch_all()};
> foreach my $vf (@vfs) {
>     if ($vf) {
>         my $varID = defined($vf->variation_name) ? $vf->variation_name : 'No_variation';
>         if ($varID) {
>             print "$varID\n";
>         }
>     }
> }
> 
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
