[ensembl-dev] Performance issues getting spliced sequence from Bio::EnsEMBL::Transcript

Will McLaren wm2 at ebi.ac.uk
Fri Nov 14 05:07:08 GMT 2014


Hi Matt,

I assume you're already using a FASTA file (
http://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#fasta).

There does seem to be an issue with the sequence-fetching code, however.
With the VEP's --offline flag the problem doesn't appear (at least for me).
With --cache (which tells the code to prefer offline resources, including
the FASTA file, but still allows connections to the DB), calling
spliced_seq in this particular case seems to bypass the code that fetches
sequence from the FASTA file and fetch it from the DB instead.

While we look into this, could you try using --offline if you aren't
already?

Regards

Will McLaren
Ensembl Variation

On 13 November 2014 07:39, Matt Wood <matt.wood at codifiedgenomics.com> wrote:

> I'm working on a VEP plugin where I need to look at a section of cDNA
> around the variant.
>
> In a previous plugin, where I needed to do something similar with genomic
> DNA, I was able to get a slice from the VariationFeature and subslice it
> like this:
>
> my $subseq = $vf->slice->sub_Slice($start, $end)->seq;
>
> That worked really well and performed really well.
>
> I can't find anything similar for the cDNA so I'm getting the spliced
> sequence from the transcript and then using substr() to do what sub_Slice
> did above.
>
> my $cdna_seq = $transcript->spliced_seq;
> my $subseq = substr($cdna_seq, $start, $end);
>
> It works well enough, but performance is too poor to be useful, taking 2
> or 3 seconds to get $subseq per transcript. I'm wondering if I'm going
> about things the wrong way and am skipping a cache or something with the
> methods I'm using.
>
> Any ideas for how I can get better performance? Is there a better way to
> get a chunk of a transcript's spliced sequence?
>
> Thanks,
> Matt Wood
> Codified Genomics
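
A side note on the substr() call quoted above: sub_Slice takes a start and an end coordinate, whereas Perl's substr() takes a 0-based offset and a length, so the two calls are not interchangeable if $start and $end are 1-based, inclusive positions. The sketch below shows one way to do the conversion. It assumes that coordinate convention, uses a literal string as a stand-in for $transcript->spliced_seq, and the helper name subseq_by_coords is hypothetical (it is not part of the Ensembl API).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): return the bases of $seq
# from $start to $end, where both positions are assumed to be 1-based and
# inclusive, mirroring the coordinates passed to sub_Slice.
sub subseq_by_coords {
    my ($seq, $start, $end) = @_;
    die "start must be >= 1\n"   if $start < 1;
    die "end must be >= start\n" if $end < $start;

    # substr() expects a 0-based offset and a length, not an end position.
    return substr($seq, $start - 1, $end - $start + 1);
}

# Stand-in for the string returned by $transcript->spliced_seq.
my $cdna_seq = 'ATGGCCAAGTTT';

# Bases 4 to 9 of the stand-in sequence.
print subseq_by_coords($cdna_seq, 4, 9), "\n";    # prints "GCCAAG"
```

This only addresses how the substring is cut; it does not change where spliced_seq fetches the underlying sequence from, which is the source of the slowdown discussed in the reply.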