[ensembl-dev] Changes to COSMIC data storage in ensembl API?

Stefano Giorgetti sgiorgetti at ebi.ac.uk
Mon Feb 13 16:19:03 GMT 2023


Hello Jon,

It's quite strange, indeed!

Do you "hardcode" anything w.r.t. Ensembl release, perhaps?
(I am blundering in the dark ... not sure what's happening here)
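
In case it helps, here is a minimal sketch of the two things discussed in this thread: printing which API release the script is running (as far as I know, this determines which database release the registry loads by default) and restricting the query to COSMIC. The helper name "somatic_variant_names", the script name "cosmic_ids.pl" and the command-line handling are made up for illustration; the region in the usage comment is the chr18 one I tested.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Collect variant names (e.g. COSV IDs) from a slice, restricted to one source.
# $slice is expected to be a Bio::EnsEMBL::Slice; the helper itself is
# hypothetical and only wraps the by_source call mentioned in this thread.
sub somatic_variant_names {
    my ($slice, $source) = @_;
    my $features = $slice->get_all_somatic_VariationFeatures_by_source($source);
    return [ map { $_->name() } @{ $features || [] } ];
}

# Runs only when a region is given on the command line, e.g.
#   perl cosmic_ids.pl 18 49481681 49492479
if (@ARGV == 3) {
    my ($chr, $start, $end) = @ARGV;

    require Bio::EnsEMBL::Registry;
    require Bio::EnsEMBL::ApiVersion;

    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );

    # Release of the locally installed API; worth comparing with any
    # release number hardcoded elsewhere in the pipeline.
    printf "Ensembl API release: %s\n",
        Bio::EnsEMBL::ApiVersion::software_version();

    my $slice_adaptor = $registry->get_adaptor('Human', 'Core', 'Slice');
    my $slice = $slice_adaptor->fetch_by_region('chromosome', $chr, $start, $end);
    die "No slice found for $chr:$start-$end\n" unless defined $slice;

    my $names = somatic_variant_names($slice, 'COSMIC');
    printf "%d COSMIC variants found\n", scalar @{$names};
    print "$_\n" for @{$names};
}
```

If the printed release is not the one you expect, that would point to an installation or hardcoding issue rather than a change in how COSMIC data is stored.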

S


On 13/02/2023 16:13, Williams, Jonathan (RTH) OUH wrote:
>
> Hello Stefano,
>
> Thanks for the speedy reply. I am connecting via “ensembldb.ensembl.org”;
> is this still correct?
>
> I will adapt the code to restrict to COSMIC – the performance increase
> will be useful. I don’t think it’s a region-related issue: the same
> regions returned COSV IDs before January 2023 and no longer do, while a
> separate query to look up gene names using the same declared slice
> still performs quite happily. Very strange!
>
> Jon
>
> *Jonathan Williams DClinSci PhD FRCPath
> Principal Clinical Scientist (HSS Registered) | NGS Core and Cancer 
> Genetics*
>
> *From:* Dev <dev-bounces at ensembl.org> *On Behalf Of* Stefano Giorgetti
> *Sent:* 13 February 2023 16:03
> *To:* dev at ensembl.org
> *Subject:* Re: [ensembl-dev] Changes to COSMIC data storage in ensembl API?
>
> Hello Jon,
>
> The code base hasn't changed significantly transitioning from E!108 to 
> E!109, TBH.
>
> I tried (sort of) your code below, and I could get some IDs back from 
> the API.
>
> Possible differences between our code do not seem enough to explain 
> why you are not retrieving any data:
>
>   * Used "ensembldb.ensembl.org" host
>   * Focussed on region chr18 between 49481681 and 49492479
>   * returned the first 5 IDs
>
> As an additional note:
>
>  1. I would check that you are not connecting to our former US West
>     mirror (which should not be the case)
>  2. you may want to use
>     "$slice->get_all_somatic_VariationFeatures_by_source('COSMIC')"
>     instead of "get_all_somatic_VariationFeatures()".
>     I am getting a 6-fold increase in performance ... if you are
>     interested only in COSMIC variants, of course
>
> Is it maybe a region-specific issue?
> Feel free to share the regions you are interested in, should you want
> me to double check.
>
> Hope this helps
>
> Cheers,
>
> Stefano
>
> On 13/02/2023 14:40, Williams, Jonathan (RTH) OUH wrote:
>
>     Hello All,
>
>     As part of a pipeline, I’ve been using the following code snippet
>     to grab COSMIC variant IDs from specific regions of the human
>     genome – this all worked wonderfully until the beginning of
>     February this year and I’m wondering if the code base has changed
>     with the most recent ensembl release? Any advice on how to do this
>     now, bearing in mind I am definitely a bit of a Bioinformatics
>     amateur?
>
>     my $slice_adaptor = $registry->get_adaptor('Human', 'Core', 'Slice');
>     my $slice = $slice_adaptor->fetch_by_region($coords);
>     my $vs = $slice->get_all_somatic_VariationFeatures();
>     my @variants = @{$vs};
>
>     foreach (@variants) {
>         my $name = $_->name();
>         push(@variant_list, $name);
>     }
>
>     When this was working, the end result was COSV IDs pushed into
>     the final @variant_list array, which I was then able to print
>     against regions in a final report. This array now ends up empty, so
>     either the “->name()” method or something to do with COSMIC
>     data storage in the “get_all_somatic_VariationFeatures”
>     dataset may have been altered…
>
>     Jon Williams
>
>
>
>     _______________________________________________
>
>     Dev mailing list    Dev at ensembl.org
>
>     Posting guidelines and subscribe/unsubscribe info: https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
>
>     Ensembl Blog: http://www.ensembl.info/
>
>

-- 
—
Stefano Giorgetti
Ensembl Infrastructure Team Leader, EMBL-EBI
sgiorgetti at ebi.ac.uk