[ensembl-dev] Rest API lookup

Kieron Taylor ktaylor at ebi.ac.uk
Tue Nov 29 10:19:36 GMT 2016


Hi David,

The lookup endpoint can fail over to "long lookup" if no lookup DB is available. In your situation it must be overlooking the database. Perhaps your trinomial database name is tripping up a regular expression in our registry? As an experiment you could try renaming it. We do support trinomial names in other contexts, but our database naming convention hasn't been changed for a long time.
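
To illustrate the kind of failure I have in mind, here is a minimal sketch in Python. It is not our registry code (which is Perl); the pattern and the function name parse_core_db_name are hypothetical, made up only to show how a pattern written for binomial names could silently skip a trinomial one.

```python
import re

# Hypothetical pattern, NOT the actual Ensembl registry regex. It assumes a
# binomial species name (genus_species) followed by the group, the release
# number and an assembly version.
BINOMIAL_CORE = re.compile(r"^([a-z]+_[a-z0-9]+)_(core)_(\d+)_(\w+)$")


def parse_core_db_name(db_name):
    """Return (species, release) if db_name fits the pattern, else None."""
    match = BINOMIAL_CORE.match(db_name.lower())
    if match is None:
        # A database that does not match is simply never registered.
        return None
    species, _group, release, _assembly = match.groups()
    return species, int(release)


for name in ("sus_scrofa_core_85_102", "SUS_SCROFA_DOMESTICUS_CORE_85_2012"):
    print(name, "->", parse_core_db_name(name))
```

With a pattern of this shape the binomial name is recognised and the trinomial one is not, so renaming your database to a binomial-style name would be a quick way to test whether something similar is happening on your server.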


Kieron

Kieron Taylor PhD.
Ensembl Developer

EMBL, European Bioinformatics Institute






> On 28 Nov 2016, at 15:58, Herzig, David <david.herzig at roche.com> wrote:
> 
> 
> I have a MySQL server with the following databases:
> [image.png: attached image of the database list, not preserved in the archive]
> All databases are loaded from the Ensembl site (MySQL dumps), except SUS_SCROFA_DOMESTICUS_CORE_85_2012.
> 
> This database is for our custom species (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9825). I set up the schema, and loaded the sequence data and the genes using the Perl scripts provided by Ensembl.
> 
> The REST API is connected to all the version 85 databases, and their contents are displayed in the browser (e.g. information for all species).
> 
> I understand the concept of having a lookup database. But at the moment I do not have such a database, and I am wondering how the lookup currently works, as I cannot find any gene/transcript information via the lookup endpoint. (I have also added genes from NCBI, and they can be found via the lookup without changing anything.)
> 
> 
> ---------- Forwarded message ----------
> From: Kieron Taylor <ktaylor at ebi.ac.uk>
> Date: Fri, Nov 25, 2016 at 12:57 PM
> Subject: Re: [ensembl-dev] Rest API lookup
> To: Ensembl developers list <dev at ensembl.org>
> 
> 
> Hi David,
> 
> The ensembl_stable_ids database has to be populated with respect to your available databases. We do provide MySQL dumps of our stable ID database, but that includes all Ensembl species and would lead to errors if you searched for IDs from databases you do not have:
> 
> ftp://ftp.ensembl.org/pub/release-86/mysql/ensembl_stable_ids_86/
> 
> The database is a multi-species one and thus is not among the species databases you have already downloaded. The ensembl-core API bundle contains the ability to create one for yourself; see again the misc-scripts/stable_id_lookup/ directory I mentioned previously. That should be able to create and populate it for you.
> 
> If you want to only have the slow exhaustive search, you could try to force the issue with a patch. It might help us diagnose what is going on.
> 
> diff --git a/lib/EnsEMBL/REST/Model/Lookup.pm b/lib/EnsEMBL/REST/Model/Lookup.pm
> index 94e8266..ead0440 100755
> --- a/lib/EnsEMBL/REST/Model/Lookup.pm
> +++ b/lib/EnsEMBL/REST/Model/Lookup.pm
> @@ -29,7 +29,7 @@ extends 'Catalyst::Model';
>  with 'Catalyst::Component::InstancePerContext';
>  
>  # Config
> -has 'lookup_model' => ( is => 'ro', isa => 'Str', required => 1, default => 'DatabaseIDLookup' );
> +has 'lookup_model' => ( is => 'ro', isa => 'Str', required => 1, default => 'LongDatabaseIDLookup' );
>  
>  # Per instance variables
>  has 'context' => (is => 'ro', weak_ref => 1);
> 
> 
> That's all I can think of without seeing the way you have your server set up.
> 
> Regards,
> 
> Kieron
> 
> > On 25 Nov 2016, at 11:06, Herzig, David <david.herzig at roche.com> wrote:
> >
> > Hi Kieron
> >
> > Thx for your feedback.
> >
> > As I have only downloaded the MySQL dumps for several species, where is the lookup DB in my case?
> >
> > Is there an easy way to change the lookup so that all databases/tables will be searched?
> >
> > regards,
> > David
> >
> > On Wed, Nov 23, 2016 at 3:00 PM, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:
> > Hi David,
> >
> > The lookup/id endpoint has no species argument to know in which database to look. Our REST servers have access to a database called ensembl_stable_ids which contains all the gene, transcript etc. ids and what species they belong to, which the lookup endpoint searches. Your ID is probably not in this "lookup" database.
> >
> > The script we use to populate that database is found here: https://github.com/Ensembl/ensembl/tree/master/misc-scripts/stable_id_lookup
> > I don't know how well it will work outside of our production environment, but we can try to help.
> >
> > The lookup mechanism can be made to search all tables of all databases to find your ID, but this can be terribly slow. If you try the overlap endpoint and specify the coordinates of stable_id 123456, you should be able to see your feature as normal, because you will have had to specify the feature in the URL.
> >
> >
> >
> > Regards,
> >
> > Kieron
> >
> >
> > > On 23 Nov 2016, at 12:38, Herzig, David <david.herzig at roche.com> wrote:
> > >
> > > Hi Ensembl Team
> > >
> > > I set up the ensembl database with several species. I also set up the REST API. Everything works fine.
> > >
> > > Now I have loaded custom species files into a new database (created the schema, loaded sequences, loaded GFF, ...)
> > >
> > > If I now go to the REST API at GET info/species, I can see my new species. Also, if I connect to the database and check the gene, transcript and exon tables, everything looks fine.
> > >
> > > Now my issue: I have an entry in the gene table of my custom species. The gene has the stable_id 123456 (just some ID). If I now go to the REST API and look up this ID:
> > >
> > > GET lookup/id/123456
> > >
> > > I do not retrieve the gene. It seems like I have missed something.
> > >
> > > Any ideas?
> > >
> > > regards,
> > > David
> > >
> > > --
> > > David Herzig
> > > Scientific Application Developer
> > > SIAD Solution Delivery & Architecture, pRED Informatics
> > > Roche Pharma Research and Early Development
> > > Roche Innovation Center Basel
> > >
> > > F. Hoffmann-La Roche Ltd
> > > Grenzacherstrasse 124
> > > 4070 Basel
> > > Switzerland
> > > Phone +41 61 687 31 70
> > > Learn more about pRED Informatics at http://go.roche.com/pREDi
> > >
> > > _______________________________________________
> > > Dev mailing list    Dev at ensembl.org
> > > Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> > > Ensembl Blog: http://www.ensembl.info/