[ensembl-dev] how to annotate many genes from Ensembl ID to gene name

Mohammad Goodarzi mohammad.godarzi at gmail.com
Tue Jan 23 20:07:08 GMT 2018


Hi Kieron,

I have mapped the Ensembl IDs to gene names for each year from 2012 to 2017.
There is quite a lot of variation across these years, and some IDs are
missing too. Could you please have a look at the attached file and tell me
which one I should use?

Thanks
Mohammad

On Mon, Jan 22, 2018 at 10:49 AM, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:

> Hi Mohammad,
>
> Your request is quite non-specific, and that makes it difficult for me to
> know how to help.
>
> If you know when your data was created, you can pick an archive and use
> the BioMart data from that release to fetch the gene names, just as you
> intended initially.
>
> http://www.ensembl.org/info/website/archives/index.html
>
> I would suggest somewhere around release 81, given the sample IDs you
> provided. You would then get the gene names we assigned to those loci at
> that point in time. For the IDs not to be present in our latest data
> suggests a more recent revision of the underlying sequence has occurred,
> and you should expect those gene names to have changed somewhat.
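As a rough illustration of the archive BioMart route described above, here is a minimal Python sketch that builds a BioMart XML query and posts it to an archive's martservice. Several things are assumptions rather than anything stated in this thread: the dataset name (hsapiens_gene_ensembl), the attribute and filter names (ensembl_gene_id, external_gene_name), and the /biomart/martservice path. Older archives may use different attribute names, so check them in the archive's BioMart web interface first. The host shown is simply the archive host that appears elsewhere in this thread; swap in whichever archive you pick from the archives page.

```python
import urllib.parse
import urllib.request
from xml.sax.saxutils import quoteattr

# Assumption: archive host copied from a link in this thread; replace it with
# the archive that matches the release your data came from.
ARCHIVE_HOST = "http://mar2016.archive.ensembl.org"


def build_biomart_query(gene_ids, dataset="hsapiens_gene_ensembl"):
    """Return a BioMart XML query asking for gene ID and gene name.

    Dataset, filter and attribute names are assumptions; verify them
    against the archive you are querying.
    """
    id_list = ",".join(gene_ids)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="1">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f'<Filter name="ensembl_gene_id" value={quoteattr(id_list)}/>'
        '<Attribute name="ensembl_gene_id"/>'
        '<Attribute name="external_gene_name"/>'
        "</Dataset>"
        "</Query>"
    )


def parse_tsv(text):
    """Turn a two-column TSV response into {gene_id: gene_name}."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split("\t")
        # Skip blank lines and anything that is not exactly two columns.
        if len(parts) == 2 and parts[0]:
            mapping[parts[0]] = parts[1]
    return mapping


def fetch_gene_names(gene_ids, host=ARCHIVE_HOST):
    """POST the query to the archive's martservice and parse the reply."""
    data = urllib.parse.urlencode({"query": build_biomart_query(gene_ids)}).encode()
    with urllib.request.urlopen(f"{host}/biomart/martservice", data=data, timeout=60) as resp:
        return parse_tsv(resp.read().decode())
```

Calling fetch_gene_names(["ENSG00000122718", "ENSG00000130201"]) should return a dictionary of whichever IDs that archive still knows about; IDs missing from the result were not present in that release.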
>
> The scientific purpose of your work will then help you decide whether that
> output is useful or not. It may prove necessary for you to find which
> Ensembl IDs in our current release are closest to those retired Ensembl IDs
> you are working on. For this you can copy your list of IDs into the ID
> history tool (http://www.ensembl.org/Homo_sapiens/Tools/IDMapper?db=core)
> or send them one by one to our REST API (rest.ensembl.org). Our Perl API
> can also achieve the same result if Perl suits you
> (http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1DBSQL_1_1ArchiveStableIdAdaptor.html).
>
> In my opinion, the most straightforward programming approach is to write a
> script that consumes your list and sends requests to our REST API archive
> endpoint. You can consult the training materials from our REST API course
> to get yourself started, although they do not cover the archive endpoint
> explicitly:
>
> http://training.ensembl.org/events/2017/2017-11-27-REST_API_EBI_Nov
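The script suggested above could be sketched in Python roughly as follows. It assumes the REST server also accepts a batch POST to /archive/id with a JSON body of the form {"id": [...]}, and that each returned record carries "id", "is_current" and "release" fields; please confirm both against the REST documentation before relying on them. The batch size of 500 is an arbitrary conservative choice, not a documented limit.

```python
import json
import urllib.request

REST_SERVER = "http://rest.ensembl.org"
BATCH_SIZE = 500  # assumption: conservative guess, check the endpoint docs


def chunks(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_request(ids, server=REST_SERVER):
    """Build a POST request for a batch of stable IDs (assumed endpoint shape)."""
    body = json.dumps({"id": list(ids)}).encode()
    return urllib.request.Request(
        f"{server}/archive/id",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def summarise(records):
    """Reduce archive records to {id: (is_current, last_release)}.

    Field names are assumptions about the response format.
    """
    summary = {}
    for rec in records:
        if not rec:
            continue  # tolerate empty entries for unknown IDs
        summary[rec["id"]] = (str(rec.get("is_current")) == "1", rec.get("release"))
    return summary


def lookup(all_ids):
    """Query the archive endpoint batch by batch and merge the results."""
    result = {}
    for batch in chunks(all_ids, BATCH_SIZE):
        with urllib.request.urlopen(build_request(batch), timeout=60) as resp:
            result.update(summarise(json.load(resp)))
    return result
```

With a plain text file holding one ID per line, lookup([line.strip() for line in open("ids.txt") if line.strip()]) would give a table of which IDs are still current and the last release each was seen in (ids.txt is a hypothetical file name).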
>
> If you wish to understand why we retire stable IDs, you can consult our
> documentation on the topic:
> http://www.ensembl.org/info/genome/stable_ids/index.html
>
> I hope that is sufficient to get you going.
>
> Regards,
>
> Kieron
>
>
> Kieron Taylor PhD.
> Ensembl Developer
>
> EMBL, European Bioinformatics Institute
>
>
> > On 22 Jan 2018, at 13:55, Mohammad Goodarzi <mohammad.godarzi at gmail.com> wrote:
> >
> > Hello,
> >
> > Thank you for your reply.
> > Is it possible to guide me on how to use one of your archives with BioMart
> > or any other programming language?
> > When it comes to 3000 genes, it is very difficult to do them one by one.
> >
> > Thanks
> > Mohammad
> >
> > On Mon, 22 Jan 2018 at 04:21, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:
> > Hi Mohammad.
> >
> > It looks like most of your IDs are now retired. If we take your first
> > example:
> >
> > http://www.ensembl.org/Homo_sapiens/Gene/Idhistory?g=ENSG00000122718
> >
> > Our Gene page for this ID reports that it was retired in release 84. A
> > revision of the sequence there, or validation of our genebuild has caused
> > us to retire the ID as no longer meaning what we thought it did.
> >
> > The BioMart service has no way to report data that is not current.
> > Depending on your needs, you could use one of our archive servers to get
> > the data surrounding your IDs, for example: [1]
> >
> > The easiest thing might be to feed your list of failed IDs into our ID
> > history tool [2]. This will tell you the last Ensembl release in which that
> > ID was seen, and if possible report the ID that replaced it.
> >
> > Another alternative is to cross-check your IDs to see which ones have
> > been retired against our REST archive endpoint: [3]
> >
> > Hopefully one of these methods will suit your needs.
> >
> >
> > Regards,
> >
> > Kieron
> >
> > [1] - http://mar2016.archive.ensembl.org/biomart/martview/3459e207de70960baa9be743908900d2
> > [2] - http://www.ensembl.org/Homo_sapiens/Tools/IDMapper?db=core
> > [3] - http://rest.ensembl.org/archive/id/ENSG00000122718?content-type=application/json
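For a single ID, the cross-check against the URL in [3] might look like the Python sketch below. It assumes the JSON reply contains an "is_current" field that is "1" for live IDs and empty otherwise, plus a "release" field giving the last release the ID was seen in; treat both as assumptions to verify against an actual response.

```python
import json
import urllib.request


def archive_url(stable_id, server="http://rest.ensembl.org"):
    """Build the same kind of URL as reference [3] for any stable ID."""
    return f"{server}/archive/id/{stable_id}?content-type=application/json"


def is_retired(record):
    # Assumption: "is_current" is "1" for live IDs and empty for retired ones.
    return str(record.get("is_current")) != "1"


def check_id(stable_id):
    """Return (last_release, retired) for one stable ID."""
    with urllib.request.urlopen(archive_url(stable_id), timeout=30) as resp:
        record = json.load(resp)
    return record.get("release"), is_retired(record)
```

Calling check_id("ENSG00000122718") makes one request per ID, so for a few thousand IDs it is worth pausing between calls or batching the requests instead.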
> >
> >
> >
> > Kieron Taylor PhD.
> > Ensembl Developer
> >
> > EMBL, European Bioinformatics Institute
> >
> >
> >
> >
> >
> >
> > > On 21 Jan 2018, at 20:53, Mohammad Goodarzi <mohammad.godarzi at gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > I recently tried to annotate a set of gene names. I have over 3000 genes
> > > that I cannot annotate using BioMart, for example.
> > >
> > > I have been searching a lot but could not find a solution. Could you
> > > please comment on how you would do this? I am posting a small set of them
> > > for practice purposes, and I am happy to get any opinion that helps me do
> > > this automatically.
> > >
> > > Please see below
> > >
> > > Thanks
> > >
> > > ENSG00000122718
> > > ENSG00000130201
> > > ENSG00000150076
> > > ENSG00000150526
> > > ENSG00000155640
> > > ENSG00000166748
> > > ENSG00000168260
> > > ENSG00000168787
> > > ENSG00000170590
> > > ENSG00000170803
> > > ENSG00000171484
> > > ENSG00000172381
> > > ENSG00000172774
> > >
> > > _______________________________________________
> > > Dev mailing list    Dev at ensembl.org
> > > Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> > > Ensembl Blog: http://www.ensembl.info/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Genenames.xlsx
Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Size: 4632679 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20180123/df8df264/attachment.xlsx>

