[ensembl-dev] VariationAdaptor::fetch_all_by_Individual not completely deprecated

Ma, Man Chun John manchunjohn-ma at uiowa.edu
Mon Oct 25 16:56:39 BST 2010


Sorry for the confusion. I meant

John MC Ma
Graduate Assistant
Kwitek Lab
Department of Internal Medicine
3125E MERF
375 Newton Road
Iowa City IA 52242
-----Original Message-----
From: wmclaren at gmail.com [mailto:wmclaren at gmail.com] On Behalf Of Will
Sent: Monday, October 25, 2010 4:02 AM
To: Ma, Man Chun John
Cc: dev at ensembl.org
Subject: Re: [ensembl-dev] VariationAdaptor::fetch_all_by_Individual not
completely deprecated

Hi John,

I'm not entirely sure I know which methods you are referring to -
neither VariationAdaptor::fetch_all_by_Individual nor
CompressedGenotypeAdaptor::fetch_all_by_Individual exist in our current

You are correct, accessing individual genotypes can be slow; this is
because the genotypes are compressed into chunks containing all
genotypes in a 100kb block. Therefore if you are randomly accessing
genotypes from different blocks this can be slow as each time the API
has to retrieve a 100kb block of genotypes and filter out the
appropriate one(s). The schema is optimised for retrieving large numbers
of genotypes along a contiguous genomic region, so if you configure your
scripts to access data in a similar way you should see speed increases.
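
As a rough illustration of that access pattern (a sketch only, not Ensembl
API code): the Python below groups the positions you want by the 100kb block
they fall in, so that each block is retrieved once instead of once per
genotype. The names block_index, group_by_block, fetch_genotypes and the
fetch_block callable are all hypothetical, and the 1-based block boundaries
are an assumption rather than a description of the actual schema.

```python
from collections import defaultdict

BLOCK_SIZE = 100_000  # 100kb, the block size mentioned above


def block_index(position):
    """Index of the block containing a 1-based position (assumed layout)."""
    return (position - 1) // BLOCK_SIZE


def group_by_block(queries):
    """Group (chromosome, position) queries by the block that holds them."""
    groups = defaultdict(list)
    for chrom, pos in queries:
        groups[(chrom, block_index(pos))].append(pos)
    return groups


def fetch_genotypes(queries, fetch_block):
    """Retrieve each needed block once, then filter out the wanted positions.

    fetch_block(chrom, index) is a hypothetical callable standing in for the
    expensive step; it returns {position: genotype} for one block.
    """
    results = {}
    # Sorting visits blocks in order along each chromosome.
    for (chrom, index), positions in sorted(group_by_block(queries).items()):
        block = fetch_block(chrom, index)  # one retrieval per block
        for pos in positions:
            if pos in block:
                results[(chrom, pos)] = block[pos]
    return results
```

With five scattered positions that fall into three blocks, this makes three
block retrievals rather than five; the saving grows as more of the requested
positions share a block.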

This is an issue we are looking into, especially since the number of
genotypes we store is increasing rapidly - the latest release of Ensembl
human contains ~2.5 billion genotypes!


Will McLaren
Ensembl Variation

On 22 October 2010 19:26, Ma, Man Chun John <manchunjohn-ma at uiowa.edu> wrote:
> Hi,
> I have noticed that CompressedGenotypeAdaptor::fetch_all_by_Individual
> has been deprecated as of August 2007
> (http://listserver.ebi.ac.uk/mailing-lists-archives/ensembl-dev/msg03196.html),
> yet the method VariationAdaptor::fetch_all_by_Individual,
> which directly calls 
> CompressedGenotypeAdaptor::fetch_all_by_Individual, still exists to 
> this day. There is no reason to keep a non-functional method in 
> Ensembl-Variation, and it should be removed.
> On the other hand, I have noticed that, probably due to the said 
> database restructuring, accessing IndividualGenotypes through the 
> Ensembl API is very slow. Is there any way that I can improve the
> speed?
> Any help is appreciated!
> Cheers,
> John MC Ma
> Graduate Assistant
> Kwitek Lab
> Department of Internal Medicine
> 3125E MERF
> 375 Newton Road
> Iowa City IA 52242
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev
