[ensembl-dev] Performance issues getting spliced sequence from Bio::EnsEMBL::Transcript
Matt Wood
matt.wood at codifiedgenomics.com
Wed Nov 12 22:39:13 GMT 2014
I'm working on a VEP plugin where I need to look at a section of cDNA
around the variant.
In a previous plugin, where I needed to do something similar with genomic
DNA, I was able to get a slice from the VariationFeature and subslice it
like this:
my $subseq = $vf->slice->sub_Slice($start, $end)->seq;
That worked well and was fast.
I can't find anything similar for cDNA, so I'm getting the spliced
sequence from the transcript and then using substr() to do what sub_Slice
did above:
my $cdna_seq = $transcript->spliced_seq;
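# substr() takes (offset, length), not (start, end); this assumes $start and
# $end are 1-based, inclusive cDNA coordinates, as with sub_Slice.
my $subseq = substr($cdna_seq, $start - 1, $end - $start + 1);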
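For reference, Perl's substr() takes a 0-based offset and a length, whereas
sub_Slice takes 1-based, inclusive start and end coordinates, so the two are
not interchangeable without a conversion. A minimal sketch of that conversion
follows; the helper name cdna_window is hypothetical, and the clamping to the
sequence bounds is an assumption about what is wanted near the transcript ends.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 1-based, inclusive window [$start, $end]
# of $seq, clamped to the bounds of the sequence.
sub cdna_window {
    my ($seq, $start, $end) = @_;
    my $len = length $seq;
    $start = 1    if $start < 1;      # clamp the left edge
    $end   = $len if $end > $len;     # clamp the right edge
    return '' if $start > $end;       # window lies entirely outside $seq
    # Convert to substr()'s 0-based offset and length.
    return substr($seq, $start - 1, $end - $start + 1);
}

print cdna_window('ACGTACGTAC', 3, 6), "\n";   # prints GTAC
```

The same helper could be called as cdna_window($transcript->spliced_seq,
$start, $end) in the plugin.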
It works well enough, but it is too slow to be useful: getting $subseq takes
2 or 3 seconds per transcript. I'm wondering if I'm going about things the
wrong way and am skipping a cache or something with the methods I'm using.
Any ideas for how I can get better performance? Is there a better way to
get a chunk of a transcript's spliced sequence?
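One possible stopgap, in case the same transcript is visited by more than one
variant, would be to keep the spliced sequence in a hash keyed on the
transcript's stable ID so that spliced_seq is called only once per transcript.
This is only a sketch: the helper name cached_spliced_seq is hypothetical,
MockTranscript is a stand-in so the example runs without a database, and
caching would not reduce the cost of the first fetch for each transcript.

```perl
use strict;
use warnings;

my %spliced_cache;   # stable_id => spliced cDNA string

# Hypothetical helper: fetch the spliced sequence once per transcript and
# reuse it on later calls for the same stable ID.
sub cached_spliced_seq {
    my ($transcript) = @_;
    my $id = $transcript->stable_id;
    $spliced_cache{$id} //= $transcript->spliced_seq;
    return $spliced_cache{$id};
}

# Stand-in for Bio::EnsEMBL::Transcript (assumption: only stable_id and
# spliced_seq are needed); it counts how often spliced_seq is called.
package MockTranscript;
sub new         { my ($class, %args) = @_; return bless {%args, calls => 0}, $class }
sub stable_id   { $_[0]{stable_id} }
sub spliced_seq { $_[0]{calls}++; return $_[0]{seq} }

package main;

my $t = MockTranscript->new(stable_id => 'ENST_EXAMPLE', seq => 'ACGTACGT');
cached_spliced_seq($t) for 1 .. 3;
print "spliced_seq called $t->{calls} time(s)\n";   # called 1 time(s)
```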
Thanks,
Matt Wood
Codified Genomics