[ensembl-dev] API LD calculation questions

andrew126 at mac.com andrew126 at mac.com
Mon Jul 18 17:18:48 BST 2016


Hi Anja,

Thanks for the quick response.

Yes, pulling a fresh ensembl-variation seems to have solved it .. smile.

Thanks also for confirming the handling of non-biallelics.

Best,

Andrew

> On Jul 18, 2016, at 11:55 AM, Anja Thormann <anja at ebi.ac.uk> wrote:
> 
> Hi Andrew,
> 
> I think we fixed the bug you are describing in 2). Could you please pull the changes from ensembl-variation/release-84 branch and check if that is solving the problem?
> 
> For 1) yes, we are calculating LD only for bi-allelic variants at the moment. It would definitely be great to extend the computation to multi-allelic variants. We will keep you updated about any progress we are going to do with this.
> 
> Regards,
> Anja
> 
> 
>> On 17 Jul 2016, at 11:19, andrew126 at mac.com <mailto:andrew126 at mac.com> wrote:
>> 
>> Hi,
>> 
>> I am using API 84 on Ubuntu (64-bit).
>> 
>> 1)
>> 
>> Is it correct that LD will not be calculated for any variants having more than 2 alleles?  (It looks like being biallelic is a very hard coded requirement of calc_genotypes.c)
>> 
>> 2)
>> 
>> I am confused by some discrepancies I see with the Ensembl LD calculations vrs. other LD calculators.
>> 
>> As an example, imagine I want all variants in high LD with rs13181561 that occur in a particular slice.
>> 
>> LD calculations using the below perl API script find an r^2 of >0.6 between rs13181561 and rs372693892.
>> 
>> This is confusing because no other LD calculators find this result, and Ensembl's web page on the variant indicates that no sample genotypes are available:
>> 
>> http://useast.ensembl.org/Homo_sapiens/Variation/Explore?r=5:139447545-139448545;v=rs372693892;vdb=variation;vf=54723062 <http://useast.ensembl.org/Homo_sapiens/Variation/Explore?r=5:139447545-139448545;v=rs372693892;vdb=variation;vf=54723062>
>> 
>> Further, the script does NOT find rs7446197 as being in high LD with rs13181561 (other engines find it to have r^2 > 0.6).
>> 
>> I do not know if there is any complication because both rs372693892 and rs7446197 have the same genomic location?
>> 
>> Can you provide some guidance on this?
>> 
>> Please let me know if any other information would be useful.
>> 
>> Thanks very much.
>> 
>> Best,
>> 
>> Andrew
>> 
>> 
>> 
>> use strict;
>> $|=1;
>> use IPC::Run;
>> use Bio::EnsEMBL::Registry;
>> 
>> use Bio::EnsEMBL::ApiVersion; 
>> printf( "The API version used is %s\n", software_version() ); 
>> 
>> my $registry = 'Bio::EnsEMBL::Registry';
>> $registry->load_all();
>> $registry->set_reconnect_when_lost(1);
>> $registry->load_registry_from_db(
>>     -host => 'useastdb.ensembl.org <http://useastdb.ensembl.org/>',
>>     -user => 'anonymous'
>>     );
>> 
>> my $slice_adaptor = $registry->get_adaptor('human', 'core', 'slice');
>> my $population_adaptor = $registry->get_adaptor('human', 'variation', 'population');
>> my $ldFeatureContainerAdaptor = $registry->get_adaptor('human', 'variation', 'ldfeaturecontainer');
>> $ldFeatureContainerAdaptor->db->use_vcf(1);
>> $registry->set_reconnect_when_lost(1);
>> 
>> my $pop = "1000GENOMES:phase_3:AMR";
>> my $chr = 5;
>> my $low = 139445000;
>> my $high = 139475000;
>> 
>> my $slice = $slice_adaptor->fetch_by_region('chromosome',$chr,$low,$high); 
>> my $population = $population_adaptor->fetch_by_name($pop);
>> my $ldFeatureContainer = $ldFeatureContainerAdaptor->fetch_by_Slice($slice,$population);
>> foreach my $r_square (@{$ldFeatureContainer->get_all_r_square_values}){
>>     if ($r_square->{r2} >= 0.05){ 
>> 	if ($r_square->{variation1}->variation_name eq "rs13181561" || $r_square->{variation2}->variation_name eq "rs13181561") {
>> 	    print "High r2 (".($r_square->{r2}).") between variations ", $r_square->{variation1}->variation_name, "-", $r_square->{variation2}->variation_name, " in $pop\n";
>> 	}
>>     }
>> }
>> 
>> Output from script:
>> 
>> High r2 (0.692506) between variations rs11954057-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.625299) between variations rs372693892-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.900663) between variations rs111805170-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.686462) between variations rs10463977-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.264367) between variations rs28419191-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.375759) between variations rs7380062-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.903044) between variations rs13181561-rs13153461 in 1000GENOMES:phase_3:AMR
>> High r2 (0.898459) between variations rs71574449-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.883522) between variations rs13181561-rs55792153 in 1000GENOMES:phase_3:AMR
>> High r2 (0.360986) between variations rs73257323-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.898459) between variations rs9716069-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.702607) between variations rs763007006-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.533529) between variations rs34149439-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.096040) between variations rs114079768-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.286249) between variations rs10875554-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.853556) between variations rs34580585-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.702607) between variations rs36137978-rs13181561 in 1000GENOMES:phase_3:AMR
>> High r2 (0.088433) between variations rs140049780-rs13181561 in 1000GENOMES:phase_3:AMR
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org <mailto:Dev at ensembl.org>
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev <http://lists.ensembl.org/mailman/listinfo/dev>
>> Ensembl Blog: http://www.ensembl.info/ <http://www.ensembl.info/>
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20160718/9124acb1/attachment.html>


More information about the Dev mailing list