[ensembl-dev] variation set method get all features by slice returns unordered features?
Andrea Edwards
edwardsa at cs.man.ac.uk
Mon Dec 27 23:52:18 GMT 2010
On 27/12/2010 23:50, Andrea Edwards wrote:
> Hi
>
> I only wanted them sorted so I could easily resume the retrieval
> process where it failed if my connection to the Ensembl database was
> lost again. Alternatively, I could modify the code to catch the lost
> connection and 'sleep' before trying to reconnect to the database.
>
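A minimal sketch of the catch-and-sleep idea, assuming a lost connection surfaces as a Perl exception (die) from the API call. The helper name with_retry and its parameters are made up for illustration and are not part of the Ensembl API. Properties that are only fetched later would need to be accessed inside the wrapped sub to be covered.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the Ensembl API: call $work and, if it
# dies, wait $wait_seconds and try again, up to $max_attempts times.
sub with_retry {
    my ($work, $max_attempts, $wait_seconds) = @_;
    for my $attempt (1 .. $max_attempts) {
        my @result = eval { $work->() };
        return @result unless $@;          # success: hand back the result
        warn "attempt $attempt failed: $@";
        die "giving up after $max_attempts attempts\n"
            if $attempt == $max_attempts;
        sleep $wait_seconds;               # pause before the next attempt
    }
}

# Intended use (assumes $watson_set and $slice from the quoted script):
# my @vfs = with_retry(
#     sub { @{ $watson_set->get_all_VariationFeatures_by_Slice($slice) } },
#     5, 30,
# );
```
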
> The error seemed to occur while loading one of the vf feature
> properties (stupidly, I've lost the error message). Is this because the
> properties are lazy loaded (so you need a database connection
> to retrieve them even once you have the object)? Are properties lazy
> loaded one by one or all at the same time? For example, when I used to
> write lazy-loaded objects I wouldn't load any properties on
> instantiation, but as soon as the user requested one property I fetched
> the whole lot from the database while I was there. Some lazy-loaded
> objects fetch each property as it is requested. How does Ensembl do its
> lazy loading? I've read the section on this page but it didn't give
> the mechanism:
> http://www.ensembl.org/info/docs/api/core/core_tutorial.html
>
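For reference, the two lazy-loading strategies described above can be sketched as follows. This is only an illustration of the general pattern and says nothing about how Ensembl itself loads properties. The classes LazyAll and LazyEach, the %ROWS hash standing in for a database table, and the $QUERIES counter are all invented for the example.

```perl
use strict;
use warnings;

# Stand-in for a database table, plus a counter of simulated round trips.
my %ROWS    = ( 7 => { start => 1081, end => 1081, allele => 'A/G' } );
my $QUERIES = 0;

package LazyAll;     # first access fetches every property in one query
sub new { my ($class, $id) = @_; return bless { id => $id }, $class }
sub get {
    my ($self, $name) = @_;
    unless ($self->{props}) {
        $QUERIES++;                                   # one round trip in total
        $self->{props} = { %{ $ROWS{ $self->{id} } } };
    }
    return $self->{props}{$name};
}

package LazyEach;    # each property is fetched the first time it is requested
sub new { my ($class, $id) = @_; return bless { id => $id, props => {} }, $class }
sub get {
    my ($self, $name) = @_;
    unless (exists $self->{props}{$name}) {
        $QUERIES++;                                   # one round trip per property
        $self->{props}{$name} = $ROWS{ $self->{id} }{$name};
    }
    return $self->{props}{$name};
}

package main;

my $all = LazyAll->new(7);
$all->get('start');
$all->get('end');        # no second query: everything was loaded already

my $each = LazyEach->new(7);
$each->get('start');
$each->get('end');       # second query: 'end' had not been fetched yet
```

With either strategy a property read can need a live connection long after the object was created, which would be consistent with the failure appearing while reading a property.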
> Given that I'm retrieving such a lot of objects and properties, I might
> be able to speed my code up if I understood the lazy loading (I keep
> meaning to change from foreach loops to shift loops but it's habit!)
>
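A small sketch of the shift-loop idiom mentioned above, using plain hash references in place of VariationFeature objects (an assumption for the example). The main effect is on memory rather than speed: each element is removed from the array as it is processed, so it can be freed once the loop body is done with it.

```perl
use strict;
use warnings;

# Hash references stand in for feature objects in this example.
my @features = map { +{ start => $_ } } (4626, 1081, 2300);

my $processed = 0;
while ( my $feature = shift @features ) {
    # ... work with $feature here ...
    $processed++;
}   # $feature goes out of scope each iteration, releasing the element

# @features is now empty; a foreach loop would have kept every element alive.
```
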
> thanks
>
>
>
> On 27/12/2010 20:46, Pontus Larsson wrote:
>> Hi Andrea,
>>
>> The API doesn't return variation features in any guaranteed order. If
>> you need to have the features sorted, I would suggest that you sort
>> them via a customized sort command after fetching them. Something
>> like this should work:
>>
>> my @sorted_vfs = sort { $a->start() <=> $b->start() } @vfs;
>>
>> HTH
>> /Pontus
>>
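Building on the sort above, a sketch of how sorted features could be used to resume after a failure. Hash references stand in for VariationFeature objects, and $last_done (the last start position already stored) is a hypothetical value that would come from your own database. Note that several features can share a start position, so skipping everything at or before $last_done may drop unprocessed features at that exact position.

```perl
use strict;
use warnings;

# Hash references stand in for feature objects in this example.
my @vfs = map { +{ start => $_ } } (4626, 1081, 2300, 1081);

# Sort by start position, as suggested above.
my @sorted = sort { $a->{start} <=> $b->{start} } @vfs;

# Hypothetical: the last start position that was stored before the failure.
my $last_done = 1081;

# Keep only the features that still need to be processed.
my @todo   = grep { $_->{start} > $last_done } @sorted;
my @starts = map  { $_->{start} } @todo;
```
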
>>
>> On 27/12/2010 20:50, Andrea Edwards wrote:
>>> Hi
>>>
>>> I have this code to get watson snps:
>>>
>>> ================================================================
>>>
>>> my @unsorted_slices = @{ $slice_adaptor->fetch_all('chromosome',
>>> undef, 0, 1) };
>>>
>>> my @sorted_slices = sort by_num_then_letter @unsorted_slices;
>>>
>>> # Base pair overlap between returned slices
>>> my $overlap = 0;
>>>
>>> # Maximum size of returned slices
>>> my $max_size = 10000;
>>>
>>> # Break chromosomal slices into smaller component slices of at most $max_size bp
>>> my @slices = @{split_Slices( \@sorted_slices, $max_size, $overlap ) };
>>> my $snp_id = 0;
>>> foreach my $slice (@slices) {
>>>
>>>     unless ($slice->seq_region_name() =~ /Un/) {
>>>         my @vfs =
>>>           @{ $watson_set->get_all_VariationFeatures_by_Slice($slice) };
>>>         foreach my $vf (@vfs) {
>>>             next if ($vf->var_class ne 'snp');
>>>             $snp_id++;
>>>
>>>             # insert snp to my database
>>>         }
>>>     }
>>> }
>>>
>>> =============================================================================
>>>
>>>
>>> @sorted_slices contains chromosome slices sorted into numerical then
>>> alphabetical order, though this doesn't really matter
>>> as my code didn't get past chromosome 1 before it lost connection to
>>> the server.
>>>
>>> I then tried to start the code from the chromosome segment where it
>>> had left off and I found that the SNPs weren't in 'base pair' order
>>> in the database. For example, on chromosome 1 the first SNP was at
>>> locus 4626 and the next SNP was at base 1081. (This position was
>>> obtained from $locus = $vf->start();)
>>>
>>> So it appears to me that the method
>>> get_all_VariationFeatures_by_Slice($slice) on the variation set
>>> object does not return snps in base order. Is this correct? How do i
>>> retrieve them in order?
>>>
>>> Thanks
>>>
>>> _______________________________________________
>>> Dev mailing list
>>> Dev at ensembl.org
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>
>