[ensembl-dev] Gene Synonyms through REST or Biomart

mag mr6 at ebi.ac.uk
Thu May 31 17:20:17 BST 2018


Hi Thomas,

Only the display_xref of a gene has synonyms, so you could speed up your 
script by limiting your query to those.

This would replace the following lines:

     my @allsyn;
     my $dber = $gene->get_all_DBEntries($dbname);
     for my $db (@$dber) {
         my $synr = $db->get_all_synonyms();
         push(@allsyn, @$synr);
     }
     ## Find unique synonyms (while maintaining order):
     my %seen;
     my @keep;
     for (0 .. $#allsyn) {
         unless ($seen{$allsyn[$_]}) {
             push(@keep, $_);
             $seen{$allsyn[$_]} = 1;
         }
     }
     print $of $geneid, "\t", join($sep, @allsyn[@keep]), "\n";

with:
    my $db = $gene->display_xref;
    my @syn = $db ? @{$db->get_all_synonyms()} : ();  # display_xref can be undef
    print $of $geneid, "\t", join($sep, @syn), "\n";
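
If it is useful, here is a minimal sketch of the same idea wrapped in a helper sub. The name synonym_line is made up for illustration. It only assumes that $gene provides stable_id and display_xref, as a Bio::EnsEMBL::Gene does, and that the xref provides get_all_synonyms. It guards against genes without a display_xref and, as a precaution, keeps the order-preserving de-duplication from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): build one output line
# consisting of the gene ID, a tab, and the synonyms of the display xref.
# $gene is expected to behave like a Bio::EnsEMBL::Gene.
sub synonym_line {
    my ($gene, $sep) = @_;
    my $xref = $gene->display_xref;                          # may be undef
    my @syn  = $xref ? @{ $xref->get_all_synonyms() } : ();  # array ref -> list
    my %seen;
    @syn = grep { !$seen{$_}++ } @syn;                       # unique, order kept
    return join("\t", $gene->stable_id, join($sep, @syn)) . "\n";
}
```

Inside the loop over genes this would be called as: print $of synonym_line($gene, $sep);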


Hope that helps,
Magali

On 25/05/2018 17:47, Thomas Danhorn wrote:
> Hi Beat,
>
> I do this regularly and have attached my Perl script -- feel free to 
> adapt it for your purposes.  It uses the Ensembl Perl API, so you have 
> to have the appropriate version installed. (I cloned the git repo and 
> get from there whichever version I need; if you run into issues, feel 
> free to ask me.)  The script takes a list of Ensembl gene IDs (with a 
> header) and prints a table with the original IDs and the synonyms.  Be 
> sure to specify the species with the -s option, unless you want the 
> default, 'Mouse'.
>
> One thing to note is that sometime between releases 84 and 90, the 
> location of the synonyms in the databases switched from the 
> "EntrezGene" DB to elsewhere, so if you use a newer release with the 
> default parameters, you will likely find no synonyms -- use the option `-b 
> all' to search through each DB (or specify one that you know has what 
> you need).  This takes a while (days for an entire genome annotation), 
> so if you have long lists, you may want to split them up and 
> parallelize the process.
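
A rough sketch of the splitting step mentioned above, assuming the gene IDs are already in a Perl list. The sub name chunk_ids and the chunk size are made up for illustration; each chunk could then be written to its own input file and run as a separate job.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list of gene IDs into chunks of at most
# $size entries each.
sub chunk_ids {
    my ($size, @ids) = @_;
    die "chunk size must be positive\n" unless $size > 0;
    my @chunks;
    # splice removes up to $size IDs from the front until none are left
    push @chunks, [ splice(@ids, 0, $size) ] while @ids;
    return @chunks;
}

# Example IDs only; a real run would read them from the input list.
my @chunks = chunk_ids(2, qw(ENSG00000157764 ENSG00000139618 ENSG00000141510));
printf "chunk %d: %s\n", $_ + 1, join(" ", @{ $chunks[$_] }) for 0 .. $#chunks;
```
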
>
> Hope this helps,
>
> Thomas
>
>
> On Tue, 22 May 2018, Premanand Achuthan wrote:
>
>> Hi Beat Wolf,
>>
>> The synonyms attribute is not always empty. If the external source has 
>> synonyms, they should be available; for example, look at the HGNC record:
>>
>> {"display_id": "BRCA2",
>>  "primary_id": "HGNC:1101",
>>  "version": "0",
>>  "description": "BRCA2, DNA repair associated",
>>  "dbname": "HGNC",
>>  "synonyms": ["BRCC2","FACD","FAD","FAD1","FANCD","FANCD1","XRCC11"],
>>  "info_text": "Generated via ensembl_manual",
>>  "info_type": "DIRECT",
>>  "db_display_name": "HGNC Symbol"}
>>
>> I am afraid that at the moment you can do it only gene by gene via 
>> the REST endpoint or via the core API.
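
For the gene-by-gene REST route, a minimal sketch of pulling the synonyms out of a response body might look like this. It assumes the body of an /xrefs/id request is a JSON array of records shaped like the HGNC one above. The sub name synonyms_from_xrefs_json is made up, and the second record in the sample is invented to show a record without synonyms. Fetching the body over HTTP is left out.

```perl
use strict;
use warnings;
use JSON::PP;   # core module since Perl 5.14

# Hypothetical helper: collect the unique synonyms (order kept) from a
# JSON array of xref records.
sub synonyms_from_xrefs_json {
    my ($json_text) = @_;
    my $xrefs = decode_json($json_text);
    my (%seen, @syn);
    for my $xref (@$xrefs) {
        # records without a "synonyms" key are simply skipped
        for my $s (@{ $xref->{synonyms} || [] }) {
            push @syn, $s unless $seen{$s}++;
        }
    }
    return @syn;
}

# First record: taken from the HGNC example in this thread (shortened).
# Second record: invented, to show a record that has no synonyms.
my $body = '[{"display_id":"BRCA2","dbname":"HGNC",'
         . '"synonyms":["BRCC2","FACD","FAD","FAD1","FANCD","FANCD1","XRCC11"]},'
         . '{"display_id":"BRCA2","dbname":"Example"}]';
print join(",", synonyms_from_xrefs_json($body)), "\n";
```
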
>>
>> Thanks
>> Prem
>>
>>
>> On 22/05/2018 10:23, Wolf Beat wrote:
>>> Thank you for the quick answer.
>>>
>>>
>>> I did not know about this approach, but I have two issues with it:
>>>
>>> 1) The synonyms attribute seems to be always empty, so something 
>>> seems wrong.
>>>
>>> 2) Sadly, I need all synonyms for all genes for a specific species. 
>>> Downloading them through that endpoint would be too slow. That's why I 
>>> initially looked at BioMart, because all I need is a list of Ensembl 
>>> gene IDs plus all synonyms.
>>>
>>>
>>> Kind regards
>>>
>>>
>>> Beat Wolf
>>>
>>> ________________________________
>>> From: Dev <dev-bounces at ensembl.org> on behalf of Premanand Achuthan 
>>> <prem at ebi.ac.uk>
>>> Sent: Tuesday, May 22, 2018 11:20:09 AM
>>> To: dev at ensembl.org
>>> Subject: Re: [ensembl-dev] Gene Synonyms through REST or Biomart
>>>
>>> Hi Beat Wolf,
>>>
>>> Please have a look at the /xrefs endpoint under "Cross References" and
>>> look for "synonyms" in the attribute list.
>>>
>>> http://rest.ensembl.org/xrefs/name/human/BRCA2?content-type=application/json 
>>>
>>>
>>> http://rest.ensembl.org/xrefs/id/ENSG00000157764?content-type=application/json 
>>>
>>>
>>> Hope it helps,
>>>
>>> Best Regards
>>> Prem
>>>
>>> On 22/05/2018 10:09, Wolf Beat wrote:
>>>> Hello,
>>>>
>>>>
>>>> I'm looking for a complete list of all synonyms of a gene or a set 
>>>> of genes. I cannot find a way to get this information through 
>>>> BioMart or the REST interface. Is there a way to do it?
>>>>
>>>>
>>>> Kind regards
>>>>
>>>>
>>>> Beat Wolf
>>>> _______________________________________________
>>>> Dev mailing list    Dev at ensembl.org
>>>> Posting guidelines and subscribe/unsubscribe info: 
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
>>
>>
>
