[ensembl-dev] perl API: how to avoid deprecated identifiers?
Michael Yourshaw
myourshaw at ucla.edu
Mon Nov 3 16:45:26 GMT 2014
Thanks, Andy.
I was using version 75 when I did the check. For now we are stuck on GRCh37 for humans, so we have been keeping everything at v75.
The mouse data was created at another lab, and I’m not sure what version they used.
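A minimal sketch of the kind of post-fetch check described in the quoted message below: group (stable id, gene symbol) pairs by symbol and report any symbol attached to more than one stable id. It is written in Python for illustration and does not use the Ensembl Perl API. The function name `symbols_with_multiple_ids` and the single-id sample row are hypothetical; the other sample rows are taken from the list in the original post.

```python
from collections import defaultdict


def symbols_with_multiple_ids(pairs):
    """Return {symbol: sorted stable ids} for symbols seen with more than one stable id."""
    by_symbol = defaultdict(set)
    for stable_id, symbol in pairs:
        by_symbol[symbol].add(stable_id)
    return {
        symbol: sorted(ids)
        for symbol, ids in by_symbol.items()
        if len(ids) > 1
    }


# Sample rows copied from the list in the original post, plus one
# hypothetical single-id row (not a real identifier) for contrast.
pairs = [
    ("ENSMUSG00000038007", "Acer2"),
    ("ENSMUSG00000091609", "Acer2"),
    ("ENSMUSG00000094121", "Ccl21c"),
    ("ENSMUSG00000096271", "Ccl21c"),
    ("ENSMUSG00000096873", "Ccl21c"),
    ("EXAMPLE_SINGLE_ID", "ExampleGene"),  # hypothetical
]

duplicates = symbols_with_multiple_ids(pairs)
for symbol, ids in sorted(duplicates.items()):
    print(symbol, ", ".join(ids))
```

This only flags candidates. Deciding which of the ids for a symbol is the deprecated one still needs information from the database itself.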
ॐ
Michael Yourshaw, PhD
UCLA Geffen School of Medicine
Department of Pediatrics
695 Charles E Young Drive S
Gonda 5554
Los Angeles CA 90095-8348 USA
myourshaw at ucla.edu
970.691.8299
This message (including any attachments) is intended only for the use of the addressee(s) and may contain information that is PRIVILEGED and CONFIDENTIAL, and/or may constitute ATTORNEY WORK PRODUCT. If you are not an intended recipient, you are hereby notified that any dissemination of this communication is strictly prohibited. If you have received this message in error, please do not read, copy, or forward this message or any attachments. Please permanently delete all copies of the message and any attachments and notify the sender immediately by sending an email to myourshaw at yourshaw.org. Thank you. As part of our commitment to the environment, this message was manufactured with 100% recycled electrons.
> On 3 Nov 2014, at 04:32, Andy Yates <ayates at ebi.ac.uk> wrote:
>
> Hi there,
>
> Sorry that your email got lost; we're on it now. With respect to your first problem, I've been unable to replicate the issue. I took your list of IDs, converted it into an IN statement, queried the release 77 mouse core database, and was unable to retrieve the whole list: I went from 176 identifiers to 115. What version of the API are you using, and which database are you connecting to?
>
> Even after that though there are 25 MGI symbols attached to multiple Ensembl identifiers from your list. We're having a look into some of those cases now and should be back in touch soon.
>
> Andy
>
> ------------
> Andrew Yates - Ensembl Support Coordinator
> European Molecular Biology Laboratory
> European Bioinformatics Institute
> Wellcome Trust Genome Campus
> Hinxton, Cambridge
> CB10 1SD, United Kingdom
> Tel: +44-(0)1223-492538
> Fax: +44-(0)1223-494468
> Skype: andrewyatz
> http://www.ensembl.org/
>
> On 17 Oct 2014, at 19:23, Michael Yourshaw <myourshaw at g.ucla.edu> wrote:
>
>> At least with regard to mouse, GeneAdaptor->fetch_all() returns some genes with deprecated identifiers. For example, the mouse Acer2 gene has both ENSMUSG00000038007 and ENSMUSG00000091609.
>>
>> Ensembl gene ENSMUSG00000091609 is no longer in the database but it has been mapped to 1 deprecated identifier. Not a Primary Assembly Gene.
>>
>> Both the current and the deprecated genes have an is_current value of 1.
>>
>> Although I have not checked them all manually, there appear to be 85 mouse genes with one or two such deprecated stable ids.
>>
>> Is there a way in the Perl API to fetch all genes and get only non-deprecated stable ids, or a method to detect and avoid the deprecated ones after fetching?
>>
>> List of multiple Ensembl mouse stable ids associated with a single MGI gene symbol.
>>
>> ENSMUSG00000094789 1700040F15Rik
>> ENSMUSG00000095141 1700040F15Rik
>> ENSMUSG00000054165 4922502B01Rik
>> ENSMUSG00000057387 4922502B01Rik
>> ENSMUSG00000050883 4930523C07Rik
>> ENSMUSG00000090394 4930523C07Rik
>> ENSMUSG00000032985 5730522E02Rik
>> ENSMUSG00000073101 5730522E02Rik
>> ENSMUSG00000057715 A830018L16Rik
>> ENSMUSG00000095719 A830018L16Rik
>> ENSMUSG00000038007 Acer2
>> ENSMUSG00000091609 Acer2
>> ENSMUSG00000041748 Ackr4
>> ENSMUSG00000079355 Ackr4
>> ENSMUSG00000000562 Adora3
>> ENSMUSG00000074344 Adora3
>> ENSMUSG00000047383 Als2cr11
>> ENSMUSG00000072295 Als2cr11
>> ENSMUSG00000031731 Ap1g1
>> ENSMUSG00000096262 Ap1g1
>> ENSMUSG00000052414 Atf7
>> ENSMUSG00000071584 Atf7
>> ENSMUSG00000030213 Atf7ip
>> ENSMUSG00000053935 Atf7ip
>> ENSMUSG00000055936 AU015836
>> ENSMUSG00000081044 AU015836
>> ENSMUSG00000029673 Auts2
>> ENSMUSG00000098133 Auts2
>> ENSMUSG00000036948 BC037034
>> ENSMUSG00000091964 BC037034
>> ENSMUSG00000079537 C030048H21Rik
>> ENSMUSG00000090340 C030048H21Rik
>> ENSMUSG00000094121 Ccl21c
>> ENSMUSG00000096271 Ccl21c
>> ENSMUSG00000096873 Ccl21c
>> ENSMUSG00000023235 Ccl25
>> ENSMUSG00000055951 Ccl25
>> ENSMUSG00000026361 Cdc73
>> ENSMUSG00000078284 Cdc73
>> ENSMUSG00000026616 Cr2
>> ENSMUSG00000094924 Cr2
>> ENSMUSG00000022150 Dab2
>> ENSMUSG00000079102 Dab2
>> ENSMUSG00000048915 Efna5
>> ENSMUSG00000090425 Efna5
>> ENSMUSG00000048910 Fam220a
>> ENSMUSG00000083012 Fam220a
>> ENSMUSG00000069808 Fam57a
>> ENSMUSG00000096115 Fam57a
>> ENSMUSG00000051379 Flrt3
>> ENSMUSG00000079021 Flrt3
>> ENSMUSG00000070733 Fryl
>> ENSMUSG00000090491 Fryl
>> ENSMUSG00000061864 Galntl6
>> ENSMUSG00000096914 Galntl6
>> ENSMUSG00000092021 Gbp11
>> ENSMUSG00000098049 Gbp11
>> ENSMUSG00000052942 Glis3
>> ENSMUSG00000091294 Glis3
>> ENSMUSG00000095611 Gm10597
>> ENSMUSG00000096892 Gm10597
>> ENSMUSG00000091594 Gm17067
>> ENSMUSG00000095144 Gm17067
>> ENSMUSG00000072917 Gm1965
>> ENSMUSG00000090254 Gm1965
>> ENSMUSG00000074812 Gm355
>> ENSMUSG00000096886 Gm355
>> ENSMUSG00000090897 Gm5494
>> ENSMUSG00000092043 Gm5494
>> ENSMUSG00000091779 Gm6763
>> ENSMUSG00000097427 Gm6763
>> ENSMUSG00000094474 Gm7792
>> ENSMUSG00000094722 Gm7792
>> ENSMUSG00000095523 Gm7792
>> ENSMUSG00000050347 Gm9844
>> ENSMUSG00000091955 Gm9844
>> ENSMUSG00000034243 Golgb1
>> ENSMUSG00000078096 Golgb1
>> ENSMUSG00000041907 Gpr45
>> ENSMUSG00000096364 Gpr45
>> ENSMUSG00000026313 Hdac4
>> ENSMUSG00000073617 Hdac4
>> ENSMUSG00000028634 Hivep3
>> ENSMUSG00000078582 Hivep3
>> ENSMUSG00000051396 Hspa14
>> ENSMUSG00000079615 Hspa14
>> ENSMUSG00000090498 Kcnb2
>> ENSMUSG00000092083 Kcnb2
>> ENSMUSG00000025762 Larp1b
>> ENSMUSG00000037814 Larp1b
>> ENSMUSG00000004613 Lim2
>> ENSMUSG00000093639 Lim2
>> ENSMUSG00000097437 Lim2
>> ENSMUSG00000040003 Magi2
>> ENSMUSG00000067798 Magi2
>> ENSMUSG00000073174 Magi2
>> ENSMUSG00000014426 Map3k4
>> ENSMUSG00000079716 Map3k4
>> ENSMUSG00000034912 Mdga2
>> ENSMUSG00000079510 Mdga2
>> ENSMUSG00000003178 Mical3
>> ENSMUSG00000051586 Mical3
>> ENSMUSG00000042570 Mier2
>> ENSMUSG00000091854 Mier2
>> ENSMUSG00000031200 Mtcp1
>> ENSMUSG00000090110 Mtcp1
>> ENSMUSG00000025515 Muc2
>> ENSMUSG00000094393 Muc2
>> ENSMUSG00000095400 Muc2
>> ENSMUSG00000009418 Nav1
>> ENSMUSG00000090399 Nav1
>> ENSMUSG00000069670 Nkain2
>> ENSMUSG00000069671 Nkain2
>> ENSMUSG00000028706 Nsun4
>> ENSMUSG00000090697 Nsun4
>> ENSMUSG00000050836 Ntng1
>> ENSMUSG00000059857 Ntng1
>> ENSMUSG00000023826 Park2
>> ENSMUSG00000073465 Park2
>> ENSMUSG00000095795 Park2
>> ENSMUSG00000021699 Pde4d
>> ENSMUSG00000074661 Pde4d
>> ENSMUSG00000032203 Pigb
>> ENSMUSG00000079469 Pigb
>> ENSMUSG00000030228 Pik3c2g
>> ENSMUSG00000096062 Pik3c2g
>> ENSMUSG00000044407 Qk
>> ENSMUSG00000062078 Qk
>> ENSMUSG00000039717 Ralyl
>> ENSMUSG00000096025 Ralyl
>> ENSMUSG00000030259 Rassf8
>> ENSMUSG00000045110 Rassf8
>> ENSMUSG00000045365 Rbm15b
>> ENSMUSG00000074102 Rbm15b
>> ENSMUSG00000023156 Rpp14
>> ENSMUSG00000094130 Rpp14
>> ENSMUSG00000092572 Serpinb10
>> ENSMUSG00000098034 Serpinb10
>> ENSMUSG00000021852 Slc35f4
>> ENSMUSG00000079246 Slc35f4
>> ENSMUSG00000053877 Srcap
>> ENSMUSG00000090663 Srcap
>> ENSMUSG00000027751 Supt20
>> ENSMUSG00000095832 Supt20
>> ENSMUSG00000019769 Syne1
>> ENSMUSG00000096054 Syne1
>> ENSMUSG00000052293 Taf9
>> ENSMUSG00000078941 Taf9
>> ENSMUSG00000079733 Tmem181b-ps
>> ENSMUSG00000096780 Tmem181b-ps
>> ENSMUSG00000041353 Tmem29
>> ENSMUSG00000090483 Tmem29
>> ENSMUSG00000062210 Tnfaip8
>> ENSMUSG00000094040 Tnfaip8
>> ENSMUSG00000010751 Tnfrsf22
>> ENSMUSG00000090852 Tnfrsf22
>> ENSMUSG00000048546 Tob2
>> ENSMUSG00000078960 Tob2
>> ENSMUSG00000021711 Trappc13
>> ENSMUSG00000078936 Trappc13
>> ENSMUSG00000052749 Trim30b
>> ENSMUSG00000091576 Trim30b
>> ENSMUSG00000026558 Uck2
>> ENSMUSG00000053664 Uck2
>> ENSMUSG00000020220 Vps13d
>> ENSMUSG00000073719 Vps13d
>> ENSMUSG00000026115 Vwa3b
>> ENSMUSG00000050122 Vwa3b
>> ENSMUSG00000039951 Wfdc3
>> ENSMUSG00000076434 Wfdc3
>> ENSMUSG00000022708 Zbtb20
>> ENSMUSG00000036279 Zbtb20
>> ENSMUSG00000046556 Zfp319
>> ENSMUSG00000074140 Zfp319
>> ENSMUSG00000074608 Zfp850
>> ENSMUSG00000096916 Zfp850
>>
>>
>> ॐ
>>
>>
>> Michael Yourshaw, PhD
>> UCLA Geffen School of Medicine
>> Department of Pediatrics
>> 695 Charles E Young Drive S
>> Gonda 5554
>> Los Angeles CA 90095-8348 USA
>> myourshaw at ucla.edu
>> 970.691.8299
>>
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/