[ensembl-dev] getting gene exons and transcripts that overlap only the original slice

Pablo Marin-Garcia pg4 at sanger.ac.uk
Wed Jan 12 11:11:10 GMT 2011


On Wed, 12 Jan 2011, Alison Meynert wrote:

> Just putting this onto the list from Damian @ Ensembl.
>
> On 12/01/2011 10:38, Damian Keefe wrote:
>> 
>> The short answer is: don't use MySQL, it's hopeless for looking at overlaps.
>>

That is right, but using MySQL as a set of indexed flat files for raw queries is 
convenient if you don't have large-memory systems to hash the data. I handle 
overlaps in my modules after retrieving the data, or with precalculated tables.
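
To illustrate the "retrieve first, handle overlaps afterwards" idea, here is a 
minimal sketch in Python (not my actual Perl module). It assumes the features 
have already been fetched as (chromosome, start, end) tuples with 1-based 
inclusive coordinates; the function name find_overlaps is hypothetical.

```python
from collections import defaultdict


def find_overlaps(features_a, features_b):
    """Return (a, b) pairs on the same chromosome whose closed
    intervals [start, end] share at least one base."""
    # Group the second set by chromosome and sort each group by start.
    by_chrom = defaultdict(list)
    for feat in features_b:
        by_chrom[feat[0]].append(feat)
    for feats in by_chrom.values():
        feats.sort(key=lambda f: f[1])

    pairs = []
    for a in features_a:
        chrom, a_start, a_end = a
        for b in by_chrom.get(chrom, []):
            if b[1] > a_end:
                # Sorted by start, so no later feature can overlap a.
                break
            if b[2] >= a_start:
                pairs.append((a, b))
    return pairs
```

This is a simple scan with an early exit, not an interval tree, so it is only 
meant for data sets that fit comfortably in memory.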

>> There is a script in the ensembl codebase
>> 
>> ensembl-functgenomics/scripts/miscellaneous/cooccur.pl

It seems to me that this is a shell wrapper that handles everything with system 
calls. Obviously it does the job; it would be interesting to profile it against my 
Perl module methods look_for_overlaping_features(), process_overlapping_features() 
and collapse_overlaping_features(). Shell wrapping is probably the way to go for 
very large datasets, and it would be fun to explore. It is always nice to see 
variety and learn new approaches.
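
For readers unfamiliar with what "collapsing" overlapping features means, a 
minimal Python sketch follows. It is not the code of my Perl methods; the name 
collapse_features is hypothetical, and it assumes features are 
(chromosome, start, end) tuples with 1-based inclusive coordinates. Features 
that overlap are merged into one; features that merely touch are kept separate.

```python
def collapse_features(features):
    """Merge overlapping features on the same chromosome into single
    intervals. Input order does not matter; output is sorted."""
    merged = []
    for chrom, start, end in sorted(features):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous interval: extend it if needed.
            last = merged[-1]
            merged[-1] = (chrom, last[1], max(last[2], end))
        else:
            merged.append((chrom, start, end))
    return merged
```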

   -Pablo

>> 
>> which I use all the time for looking at the overlap of two large sets of
>> genomic features. It's pretty well documented. It might suit your purposes.
>> 
>> cheers
>> 
>> Damian
>> 
>> 
>> 
>
> -- 
> Alison Meynert
> MRC Human Genetics Unit, Edinburgh
> alison.meynert at hgu.mrc.ac.uk
>
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev
>


-----

   Pablo Marin-Garcia




