[ensembl-dev] perl API: how to avoid deprecated identifiers?

Michael Yourshaw myourshaw at ucla.edu
Mon Nov 3 16:45:26 GMT 2014


Thanks, Andy.

I was using version 75 when I did the check. For now we are stuck on GRCh37 for humans, so we have been keeping everything at v75.

The mouse data was created at another lab, and I’m not sure what version they used.

ॐ

Michael Yourshaw, PhD
UCLA Geffen School of Medicine
Department of Pediatrics
695 Charles E Young Drive S
Gonda 5554
Los Angeles CA 90095-8348 USA
myourshaw at ucla.edu
970.691.8299

This message (including any attachments) is intended only for the use of the addressee(s) and may contain information that is PRIVILEGED and CONFIDENTIAL, and/or may constitute ATTORNEY WORK PRODUCT. If you are not an intended recipient, you are hereby notified that any dissemination of this communication is strictly prohibited. If you have received this message in error, please do not read, copy, or forward this message or any attachments. Please permanently delete all copies of the message and any attachments and notify the sender immediately by sending an email to myourshaw at yourshaw.org. Thank you. As part of our commitment to the environment, this message was manufactured with 100% recycled electrons.





> On 3 Nov 2014, at 04:32, Andy Yates <ayates at ebi.ac.uk> wrote:
> 
> Hi there,
> 
> Sorry that your email got lost; we're on it now. With respect to your first problem, I've been unable to replicate the issue. I took your list of IDs, converted it into an SQL IN clause and queried the release 77 mouse core database, but could not retrieve the whole list: only 115 of the 176 identifiers came back. Which version of the API are you using, and which database are you connecting to?
> 
> Even after that, though, 25 MGI symbols are still attached to multiple Ensembl identifiers from your list. We're looking into some of those cases now and should be back in touch soon.
> 
> Andy
> 
> ------------
> Andrew Yates - Ensembl Support Coordinator
> European Molecular Biology Laboratory
> European Bioinformatics Institute
> Wellcome Trust Genome Campus
> Hinxton, Cambridge
> CB10 1SD, United Kingdom
> Tel: +44-(0)1223-492538
> Fax: +44-(0)1223-494468
> Skype: andrewyatz
> http://www.ensembl.org/
> 
> On 17 Oct 2014, at 19:23, Michael Yourshaw <myourshaw at g.ucla.edu> wrote:
> 
>> At least with regard to mouse, GeneAdaptor->fetch_all() returns some genes with deprecated identifiers. For example, the mouse Acer2 gene has both ENSMUSG00000038007 and ENSMUSG00000091609.
>> 
>> Ensembl gene ENSMUSG00000091609 is no longer in the database but it has been mapped to 1 deprecated identifier. Not a Primary Assembly Gene.
>> 
>> Both the current and the deprecated genes have an is_current value of 1.
>> 
>> Although I have not checked them all manually, there appear to be 85 mouse genes with one or two such deprecated stable IDs.
>> 
>> Is there a way in the Perl API to fetch all genes and get only non-deprecated stable IDs, or a method to detect and avoid the deprecated ones after fetching?
>> 
>> Ensembl mouse stable IDs that share a single MGI gene symbol (stable ID, then symbol):
>> 
>> ENSMUSG00000094789	1700040F15Rik
>> ENSMUSG00000095141	1700040F15Rik
>> ENSMUSG00000054165	4922502B01Rik
>> ENSMUSG00000057387	4922502B01Rik
>> ENSMUSG00000050883	4930523C07Rik
>> ENSMUSG00000090394	4930523C07Rik
>> ENSMUSG00000032985	5730522E02Rik
>> ENSMUSG00000073101	5730522E02Rik
>> ENSMUSG00000057715	A830018L16Rik
>> ENSMUSG00000095719	A830018L16Rik
>> ENSMUSG00000038007	Acer2
>> ENSMUSG00000091609	Acer2
>> ENSMUSG00000041748	Ackr4
>> ENSMUSG00000079355	Ackr4
>> ENSMUSG00000000562	Adora3
>> ENSMUSG00000074344	Adora3
>> ENSMUSG00000047383	Als2cr11
>> ENSMUSG00000072295	Als2cr11
>> ENSMUSG00000031731	Ap1g1
>> ENSMUSG00000096262	Ap1g1
>> ENSMUSG00000052414	Atf7
>> ENSMUSG00000071584	Atf7
>> ENSMUSG00000030213	Atf7ip
>> ENSMUSG00000053935	Atf7ip
>> ENSMUSG00000055936	AU015836
>> ENSMUSG00000081044	AU015836
>> ENSMUSG00000029673	Auts2
>> ENSMUSG00000098133	Auts2
>> ENSMUSG00000036948	BC037034
>> ENSMUSG00000091964	BC037034
>> ENSMUSG00000079537	C030048H21Rik
>> ENSMUSG00000090340	C030048H21Rik
>> ENSMUSG00000094121	Ccl21c
>> ENSMUSG00000096271	Ccl21c
>> ENSMUSG00000096873	Ccl21c
>> ENSMUSG00000023235	Ccl25
>> ENSMUSG00000055951	Ccl25
>> ENSMUSG00000026361	Cdc73
>> ENSMUSG00000078284	Cdc73
>> ENSMUSG00000026616	Cr2
>> ENSMUSG00000094924	Cr2
>> ENSMUSG00000022150	Dab2
>> ENSMUSG00000079102	Dab2
>> ENSMUSG00000048915	Efna5
>> ENSMUSG00000090425	Efna5
>> ENSMUSG00000048910	Fam220a
>> ENSMUSG00000083012	Fam220a
>> ENSMUSG00000069808	Fam57a
>> ENSMUSG00000096115	Fam57a
>> ENSMUSG00000051379	Flrt3
>> ENSMUSG00000079021	Flrt3
>> ENSMUSG00000070733	Fryl
>> ENSMUSG00000090491	Fryl
>> ENSMUSG00000061864	Galntl6
>> ENSMUSG00000096914	Galntl6
>> ENSMUSG00000092021	Gbp11
>> ENSMUSG00000098049	Gbp11
>> ENSMUSG00000052942	Glis3
>> ENSMUSG00000091294	Glis3
>> ENSMUSG00000095611	Gm10597
>> ENSMUSG00000096892	Gm10597
>> ENSMUSG00000091594	Gm17067
>> ENSMUSG00000095144	Gm17067
>> ENSMUSG00000072917	Gm1965
>> ENSMUSG00000090254	Gm1965
>> ENSMUSG00000074812	Gm355
>> ENSMUSG00000096886	Gm355
>> ENSMUSG00000090897	Gm5494
>> ENSMUSG00000092043	Gm5494
>> ENSMUSG00000091779	Gm6763
>> ENSMUSG00000097427	Gm6763
>> ENSMUSG00000094474	Gm7792
>> ENSMUSG00000094722	Gm7792
>> ENSMUSG00000095523	Gm7792
>> ENSMUSG00000050347	Gm9844
>> ENSMUSG00000091955	Gm9844
>> ENSMUSG00000034243	Golgb1
>> ENSMUSG00000078096	Golgb1
>> ENSMUSG00000041907	Gpr45
>> ENSMUSG00000096364	Gpr45
>> ENSMUSG00000026313	Hdac4
>> ENSMUSG00000073617	Hdac4
>> ENSMUSG00000028634	Hivep3
>> ENSMUSG00000078582	Hivep3
>> ENSMUSG00000051396	Hspa14
>> ENSMUSG00000079615	Hspa14
>> ENSMUSG00000090498	Kcnb2
>> ENSMUSG00000092083	Kcnb2
>> ENSMUSG00000025762	Larp1b
>> ENSMUSG00000037814	Larp1b
>> ENSMUSG00000004613	Lim2
>> ENSMUSG00000093639	Lim2
>> ENSMUSG00000097437	Lim2
>> ENSMUSG00000040003	Magi2
>> ENSMUSG00000067798	Magi2
>> ENSMUSG00000073174	Magi2
>> ENSMUSG00000014426	Map3k4
>> ENSMUSG00000079716	Map3k4
>> ENSMUSG00000034912	Mdga2
>> ENSMUSG00000079510	Mdga2
>> ENSMUSG00000003178	Mical3
>> ENSMUSG00000051586	Mical3
>> ENSMUSG00000042570	Mier2
>> ENSMUSG00000091854	Mier2
>> ENSMUSG00000031200	Mtcp1
>> ENSMUSG00000090110	Mtcp1
>> ENSMUSG00000025515	Muc2
>> ENSMUSG00000094393	Muc2
>> ENSMUSG00000095400	Muc2
>> ENSMUSG00000009418	Nav1
>> ENSMUSG00000090399	Nav1
>> ENSMUSG00000069670	Nkain2
>> ENSMUSG00000069671	Nkain2
>> ENSMUSG00000028706	Nsun4
>> ENSMUSG00000090697	Nsun4
>> ENSMUSG00000050836	Ntng1
>> ENSMUSG00000059857	Ntng1
>> ENSMUSG00000023826	Park2
>> ENSMUSG00000073465	Park2
>> ENSMUSG00000095795	Park2
>> ENSMUSG00000021699	Pde4d
>> ENSMUSG00000074661	Pde4d
>> ENSMUSG00000032203	Pigb
>> ENSMUSG00000079469	Pigb
>> ENSMUSG00000030228	Pik3c2g
>> ENSMUSG00000096062	Pik3c2g
>> ENSMUSG00000044407	Qk
>> ENSMUSG00000062078	Qk
>> ENSMUSG00000039717	Ralyl
>> ENSMUSG00000096025	Ralyl
>> ENSMUSG00000030259	Rassf8
>> ENSMUSG00000045110	Rassf8
>> ENSMUSG00000045365	Rbm15b
>> ENSMUSG00000074102	Rbm15b
>> ENSMUSG00000023156	Rpp14
>> ENSMUSG00000094130	Rpp14
>> ENSMUSG00000092572	Serpinb10
>> ENSMUSG00000098034	Serpinb10
>> ENSMUSG00000021852	Slc35f4
>> ENSMUSG00000079246	Slc35f4
>> ENSMUSG00000053877	Srcap
>> ENSMUSG00000090663	Srcap
>> ENSMUSG00000027751	Supt20
>> ENSMUSG00000095832	Supt20
>> ENSMUSG00000019769	Syne1
>> ENSMUSG00000096054	Syne1
>> ENSMUSG00000052293	Taf9
>> ENSMUSG00000078941	Taf9
>> ENSMUSG00000079733	Tmem181b-ps
>> ENSMUSG00000096780	Tmem181b-ps
>> ENSMUSG00000041353	Tmem29
>> ENSMUSG00000090483	Tmem29
>> ENSMUSG00000062210	Tnfaip8
>> ENSMUSG00000094040	Tnfaip8
>> ENSMUSG00000010751	Tnfrsf22
>> ENSMUSG00000090852	Tnfrsf22
>> ENSMUSG00000048546	Tob2
>> ENSMUSG00000078960	Tob2
>> ENSMUSG00000021711	Trappc13
>> ENSMUSG00000078936	Trappc13
>> ENSMUSG00000052749	Trim30b
>> ENSMUSG00000091576	Trim30b
>> ENSMUSG00000026558	Uck2
>> ENSMUSG00000053664	Uck2
>> ENSMUSG00000020220	Vps13d
>> ENSMUSG00000073719	Vps13d
>> ENSMUSG00000026115	Vwa3b
>> ENSMUSG00000050122	Vwa3b
>> ENSMUSG00000039951	Wfdc3
>> ENSMUSG00000076434	Wfdc3
>> ENSMUSG00000022708	Zbtb20
>> ENSMUSG00000036279	Zbtb20
>> ENSMUSG00000046556	Zfp319
>> ENSMUSG00000074140	Zfp319
>> ENSMUSG00000074608	Zfp850
>> ENSMUSG00000096916	Zfp850
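
The check behind the list above (one MGI symbol attached to more than one stable ID) can be sketched in a few lines. This is a minimal Python sketch, not the Ensembl Perl API: it assumes only a two-column "stable ID, symbol" listing like the one quoted here, and the function name symbols_with_multiple_ids is hypothetical. It reports which symbols have several IDs; it cannot tell which of those IDs is the deprecated one.

```python
from collections import defaultdict


def symbols_with_multiple_ids(lines):
    """Group stable IDs by symbol and keep symbols with more than one distinct ID."""
    ids_by_symbol = defaultdict(set)
    for line in lines:
        # Drop mail quote markers ("> ", ">> ") before splitting on whitespace.
        fields = line.lstrip("> ").split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        stable_id, symbol = fields
        ids_by_symbol[symbol].add(stable_id)
    return {s: sorted(ids) for s, ids in ids_by_symbol.items() if len(ids) > 1}


# A few rows copied from the quoted list, quote markers included.
sample = [
    ">> ENSMUSG00000038007\tAcer2",
    ">> ENSMUSG00000091609\tAcer2",
    ">> ENSMUSG00000094121\tCcl21c",
    ">> ENSMUSG00000096271\tCcl21c",
    ">> ENSMUSG00000096873\tCcl21c",
]

duplicates = symbols_with_multiple_ids(sample)
for symbol, ids in sorted(duplicates.items()):
    print(symbol, len(ids), ",".join(ids))
```

Run over the full list, the same grouping gives the per-symbol counts that can then be compared against whatever a given release of the core database returns.
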
>> 
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
