[ensembl-dev] flushing slice data from cache when using the ensembl perl API

David Gacquer dgacquer at ulb.ac.be
Wed Oct 13 15:36:14 BST 2010


  Dear Patrick,

I have just replaced the while/shift loop with a foreach as you advised me, 
and it worked perfectly. I can still use a single registry with caching 
enabled and get all overlapping genes/transcripts. Thanks for your 
answer and for the technical explanation about caching issues when using 
while/shift loops.

Best regards

David

On 13/10/10 15:46, Patrick Meidl wrote:
> On Wed, Oct 13 2010, David Gacquer<dgacquer at ulb.ac.be>  wrote:
>
>> my $overlapping_genes = $slice->get_all_Genes();
>> print "number of overlapping genes: ".scalar @{$overlapping_genes}."\n";
>> while ( my $gene = shift @{$overlapping_genes} ) {
>>        print "gene stable id: ".$gene->stable_id()."\n";
>> ...
>> }
>>
>> I have a particular issue when the same genomic position appears
>> several times in a row. All overlapping genes and transcripts are
>> correctly found only the first time.
> I haven't tested it, but I think if you change line 3 of the above code
> snippet to
>
>      foreach my $gene ( @{$overlapping_genes} ) {
>
> it should work as expected.
>
> the rationale is this (but as I said, I didn't test it so I might be
> wrong):
>
> $slice->get_all_Genes() returns an array reference. the reference
> actually refers to the slice feature cache, which is (AFAIR) a global
> cache in the SliceAdaptor (so if you retrieve the same slice twice,
> there is only one copy of each feature in the feature cache). if you
> shift the feature array, you remove the first element from the list, and
> since the list references the cache, you remove the feature from the
> cache. so next time you use this slice, the feature will no longer be
> there.
>
> using a foreach loop rather than while/shift, you don't touch the cache,
> so your code should be fine.
>
> in general, while/shift sometimes improves performance (since you shrink
> your dataset as you process it), but there's always the risk of shared
> reference gotchas.
>
> HTH
>      patrick
>
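
A minimal sketch of the shared-reference effect described above, using a
plain hash as a stand-in for the slice feature cache. The names
%feature_cache and get_all_features are hypothetical and not part of the
Ensembl API; the real cache internals may well differ. The only point is
that shifting through a returned array reference also empties whatever
structure owns that array, whereas foreach only reads it.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the slice feature cache: slice name => array ref.
my %feature_cache = ( 'chr1:100-200' => [ 'geneA', 'geneB', 'geneC' ] );

# Hypothetical helper that hands back the cached array reference itself,
# not a copy (the behaviour assumed in the explanation above).
sub get_all_features {
    my ($slice_name) = @_;
    return $feature_cache{$slice_name};
}

# First pass with while/shift: each shift removes an element from the
# cached array, because $features and the cache refer to the same array.
my $features        = get_all_features('chr1:100-200');
my $seen_first_pass = 0;
while ( my $feature = shift @{$features} ) {
    $seen_first_pass++;
}

# Asking for the same "slice" again: the cached array is now empty.
my $left_after_shift = scalar @{ get_all_features('chr1:100-200') };

# Refill the stand-in cache, then iterate with foreach, which only reads.
$feature_cache{'chr1:100-200'} = [ 'geneA', 'geneB', 'geneC' ];
my $seen_foreach = 0;
foreach my $feature ( @{ get_all_features('chr1:100-200') } ) {
    $seen_foreach++;
}
my $left_after_foreach = scalar @{ get_all_features('chr1:100-200') };

print "while/shift saw $seen_first_pass, left in cache: $left_after_shift\n";
print "foreach saw $seen_foreach, left in cache: $left_after_foreach\n";
```

If shift-style processing is still wanted, copying the list first
(my @copy = @{$features};) and shifting from @copy leaves the shared
array untouched.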


-- 
David Gacquer, Ph. D.

IRIBHM - Universite Libre de Bruxelles
Bldg C, room C.4.117
ULB, Campus Erasme, CP602
808 route de Lennik
B-1070 Brussels
Belgium

Phone: +32-2-555 4187
Fax: +32-2-555 4655
E-mail: dgacquer at ulb.ac.be




