[ensembl-dev] 1-based coordinate system Vs 0-based coordinate system

Kieron Taylor ktaylor at ebi.ac.uk
Fri Dec 13 12:04:06 GMT 2013


Hi Duarte,

Ensembl is uniformly 1-based. I refer you to UCSC's FAQ on their 
coordinate system, where their approach is explained well:

http://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1

They distinguish between the database level (0-based start) and the 
presentation level (1-based, as displayed in their browser). I can only 
guess that their BLAT server reports "presentation coordinates" rather 
than their true internal coordinates.

You need only be concerned with the UCSC internal coordinate system when 
you interact with specific file formats, or a UCSC database directly. 
Ensembl would be concerned if our website did NOT report the same 
sequence and coordinates as the UCSC web tools.
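
To make the two conventions concrete, here is a minimal Python sketch. The function names are made up for illustration and are not part of any Ensembl or UCSC API. It assumes the internal form is 0-based start with an exclusive end (half-open), as the UCSC FAQ describes, and uses the hit coordinates and query sequence from the original question.

```python
def one_based_to_zero_based(start, end):
    """Convert a 1-based, fully closed interval [start, end] to a
    0-based, half-open interval [start - 1, end)."""
    return start - 1, end


def zero_based_to_one_based(start, end):
    """Convert a 0-based, half-open interval [start, end) to a
    1-based, fully closed interval [start + 1, end]."""
    return start + 1, end


# Query sequence and hit coordinates as quoted in the original question.
query = "ctaagccacaccataactgacttctaggcattcatctttcttccacttaaattcattctc"
display_start, display_end = 79498951, 79499010

# In a 1-based closed interval the length is end - start + 1.
display_length = display_end - display_start + 1

# The same region in 0-based half-open form (what a BED line would hold).
bed_start, bed_end = one_based_to_zero_based(display_start, display_end)

# In a half-open interval the length is simply end - start.
bed_length = bed_end - bed_start

print(f"display: chr14:{display_start}-{display_end} ({display_length} bp)")
print(f"BED:     chr14\t{bed_start}\t{bed_end} ({bed_length} bp)")
```

The reported hit spans exactly as many bases as the query when read as a 1-based closed interval, which is consistent with both web tools displaying presentation coordinates. Only the start value changes between the two forms; the end value is numerically the same.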

Regards,

-- 
Kieron Taylor PhD.
Ensembl Core team
EBI


On 13/12/2013 11:46, Duarte Molha wrote:
> Dear Developers
>
> I was wondering if you could give me a quick explanation on this.
>
> I am aware that Ensembl uses a 1-based coordinate system and UCSC uses
> a 0-based one...
>
> However... consider this sequence:
>
> ctaagccacaccataactgacttctaggcattcatctttcttccacttaaattcattctc
>
> When I ran BLAT on this sequence, both the UCSC and Ensembl BLAT servers
> gave me the exact same coordinates for the hit:
>
> chr14:79498951-79499010
>
> However, I would expect that if I put these exact coordinates into the
> Ensembl browser and the UCSC genome browser, the sequence retrieved
> would be trimmed by one base in Ensembl.
>
> However, this is not the case. Can you explain why this is?
>
> I am sure you probably have already explained this before so I apologize
> beforehand if this is a dumb question.
>
> Best regards
>
> Duarte