[ensembl-dev] Mapping coordinates with TranscriptMapper

Reece Hart reece at harts.net
Fri Mar 25 00:59:01 GMT 2011


Greetings-

I'm running into an off-by-one problem when mapping between chromosome,
contig, CDS, and protein coordinates using TranscriptMapper. This might be
as simple as needing to add 1 for cdna2genomic(), but I'd appreciate
verification, since I don't have to do this for genomic2pep() and I would
have expected both to be one-based.

A full script is attached, but here's the crux:

my ($tx) = @{ $ta->fetch_all_by_external_name($tx_ac) };
my $cds_start = $tx->cdna_coding_start;
...
my $tm = Bio::EnsEMBL::TranscriptMapper->new($tx);
my ($tm_g) = $tm->cdna2genomic( $cds_pos + $cds_start, $cds_pos + $cds_start );
my ($tm_p) = $tm->genomic2pep( $tm_g->start, $tm_g->start, $tx->strand );

I'm testing against SNP coordinates pulled from NCBI varvu and the Ensembl
Variation summary. For all 5 SNPs tested, the computed genomic coordinate
($tm_g->start) is 1 less than expected, whereas the protein coordinates are
correct. The attached script demonstrates more.
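
For concreteness, here is a minimal sketch of the arithmetic one would expect
if every coordinate involved is one-based and inclusive. It is only an
illustration under that assumption: the helper names cds_to_cdna and
cds_to_residue are hypothetical and not part of the Ensembl API, the coding
start of 101 is a made-up value, and whether $cds_pos in the script is really
one-based is itself an assumption.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; not part of the Ensembl API.
# Assumption: all positions are one-based and inclusive.

# Map a CDS position to a cDNA position, given the cDNA position of the
# first coding base (what cdna_coding_start is expected to report).
sub cds_to_cdna {
    my ($cds_pos, $cdna_coding_start) = @_;
    # CDS position 1 is the first coding base itself, hence the -1.
    return $cdna_coding_start + $cds_pos - 1;
}

# Map a CDS position to the residue number of the codon containing it.
sub cds_to_residue {
    my ($cds_pos) = @_;
    return int( ($cds_pos - 1) / 3 ) + 1;
}

# Made-up example: coding region assumed to start at cDNA position 101.
for my $cds_pos (1, 3, 4) {
    printf "cds=%d cdna=%d residue=%d\n",
        $cds_pos, cds_to_cdna($cds_pos, 101), cds_to_residue($cds_pos);
}
```

Under these assumptions, a cDNA position that is off by one can still land in
the same codon, so a protein coordinate may look right even when the
nucleotide coordinate is shifted.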

All advice (or relevant code!) appreciated. Thanks for any help.

-Reece
-------------- next part --------------
A non-text attachment was scrubbed...
Name: simpleton
Type: application/octet-stream
Size: 2408 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20110324/742eb168/attachment.obj>