[ensembl-dev] using the GRCh37 coordinate system inside current ensembl human core

Andy Yates ayates at ebi.ac.uk
Fri Apr 17 13:49:46 BST 2015


Hi Vivek

The lurking coordinate system is there so that the mappings between the 
various assemblies have something to hang off in the core schema. That 
allows the transform() and transfer() calls in the API to work when 
moving annotation between assemblies. There has also been some work on 
populating the GRCh38 database with sequence information from all 
previous assemblies. You can see this in action on our REST API, e.g.

http://rest.ensembl.org/sequence/region/human/X:153795306-153795365.fasta?coord_system_version=GRCh37

vs.

http://rest.ensembl.org/sequence/region/human/X:153795306-153795365.fasta?coord_system_version=GRCh38
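For anyone who wants to script the comparison, here is a minimal Python sketch that builds the same two URLs and can download the FASTA for each. The helper names (sequence_region_url, fetch_fasta) and the timeout value are made up for illustration; only the server, the endpoint path and the coord_system_version parameter come from the URLs above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Server and path layout are copied from the two URLs quoted in this message.
REST_SERVER = "http://rest.ensembl.org"


def sequence_region_url(region, assembly, species="human"):
    """Build a sequence/region FASTA URL for one assembly version.

    Hypothetical helper: `region` is a string such as
    "X:153795306-153795365", `assembly` is e.g. "GRCh37" or "GRCh38".
    """
    query = urlencode({"coord_system_version": assembly})
    return f"{REST_SERVER}/sequence/region/{species}/{region}.fasta?{query}"


def fetch_fasta(url, timeout=30):
    """Download the FASTA text at `url` (hypothetical helper, needs network)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Print the URL for the same region under each assembly version.
# Passing either URL to fetch_fasta() would download the sequence.
region = "X:153795306-153795365"
for assembly in ("GRCh37", "GRCh38"):
    print(sequence_region_url(region, assembly))
```

Diffing the two fetch_fasta() results for the same region is one way to see whether the sequence at those coordinates differs between the assemblies.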

HTH

Andy

vvi wrote:
> Thanks.
>
> I was hoping for an explanation of the GRCh37 coord system 'lurking'
> inside the mapping tables in port 5306 v79 database, but this also works :-)
>
> Cheers
>
> Vivek
>
> On 2015-04-17 12:48, Emily Perry wrote:
>
>> Hi Vivek
>>
>> You can use the e79 API with port 3337 to connect to the up-to-date
>> GRCh37 annotation.
>>
>> All the best
>>
>> Emily
>>
>> On 17/04/2015 11:56, Vivek Iyer wrote:
>>> Hi all,
>>>
>>> I have a set of genomic positions in GRCh37 (corresponding to virus
>>> insertion sites, which were found by someone else, mapping genomic
>>> sequence to the GRCh37 assembly). I am simply annotating these
>>> insertion sites with human ensembl genes - using some suitable
>>> flanking distance.
>>>
>>> I think I have two choices, which seem slightly different:
>>>
>>> (1) connect with my API to ensembl release 75 - the last GRCh37
>>> release - and do everything inside that universe. Load slices given
>>> by my reference positions, read off the genes, bob’s your uncle.
>>>
>>> (2) connect with the API to the current ensembl release (79,
>>> GRCh38). Load slices in coordinate_system = chromosome, version =
>>> ‘GRCh37’ around my reference positions. Then read off genes on those
>>> slices as I need.
>>>
>>> So I think the difference is that (2) will get me annotations
>>> actually done on GRCh38 ‘pulled back’ into GRCh37, whereas (1) will
>>> get me annotations actually done on GRCh37 from end-to-end. Is this
>>> correct? I’d prefer to use (2) unless someone tells me that’s a
>>> baaad idea.
>>>
>>> Thanks,
>>>
>>> Vivek
>>> _______________________________________________
>>> Dev mailing list
>>> Dev at ensembl.org <mailto:Dev at ensembl.org>
>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>> --
>> Dr Emily Perry (Pritchard)
>> Ensembl Outreach Project Leader
>>
>> European Bioinformatics Institute (EMBL-EBI)
>> European Molecular Biology Laboratory
>> Wellcome Trust Genome Campus
>> Hinxton
>> Cambridge
>> CB10 1SD
>> UK
>>

-- 
Andrew Yates - Genomics Technology Infrastructure Team Leader
European Molecular Biology Laboratory
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge
CB10 1SD, United Kingdom
Tel: +44-(0)1223-492538
Fax: +44-(0)1223-494468
Skype: andrewyatz
http://www.ensembl.org/



