[ensembl-dev] Mapping transcript coordinate to toplevel (genome)
Sung Gong
sung at bio.cc
Thu Nov 10 10:31:15 GMT 2011
Hi,
I wanted to convert transcript coordinates onto the top level using
Bio::EnsEMBL::TranscriptMapper.
Below is the script:
sub cds2genome{
my ($enst_id,$cds_from,$cds_to)=@_;
my $transcript=$tr_adaptor->fetch_by_stable_id($enst_id);
my $tm = Bio::EnsEMBL::TranscriptMapper->new($transcript);
my $chr=$transcript->transform('toplevel')?
$transcript->transform('toplevel')->slice->seq_region_name():'\N';
my $t_strand=defined $transcript->seq_region_strand?
$transcript->seq_region_strand : '+1';
my ($g_start,$g_end,$seq);
my $cdna_start=$cds_from + $transcript->cdna_coding_start - 1;
my $cdna_end=$cds_to + $transcript->cdna_coding_start - 1;
my @coord=$tm->cdna2genomic($cdna_start,$cdna_end);
foreach my $coord(@coord){
if($coord->isa('Bio::EnsEMBL::Mapper::Gap')){
warn
"\t[GAP]ENST:$enst_id\tCDS_START:$cds_from\tCDS_END:$cds_to\tG_START-END:",$coord->start,"-",$coord->end,"\n";
}else{
$g_start=$coord->start if defined $coord->start;
$g_end=$coord->end if defined $coord->end;
my $slice = $slice_adaptor->fetch_by_region('chromosome', $chr);
my $feature=Bio::EnsEMBL::Feature->new(
-start=>$g_start,
-end=>$g_end,
-strand=>$slice->strand, # 1 as a default
-slice=>$slice,
);
$seq=defined $feature->seq? $feature->seq : '\N';
}
}
return($chr,$g_start,$g_end,$seq,$t_strand);
}
The code reads very well I guess ;)
It works very well, except only one position shown below:
ENST00000155840 921 923
which returns:
chr:11 2604665 2604666
I am wondering why the length is different (921-923 & 2604665-2604666)?
Only one such a case out of more than 100 entries, which are OK.
Any thoughts?
Cheers,
Sung
More information about the Dev
mailing list