[ensembl-dev] Issues in LDFeatureContainerAdaptor

Anja Thormann anja at ebi.ac.uk
Wed Feb 24 15:43:47 GMT 2016


I pushed a bug fix to the ensembl-variation release/83 branch. Could you please update your ensembl-variation git repo (pull the changes)?

You might also need to include failed variants. This is similar to setting the use_vcf flag:

$variation_adaptor->db->include_failed_variations(1);

Please let me know if you continue having problems running the code.
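
As a minimal sketch, the two settings could be combined with the script from the earlier message roughly as below. Several things here are assumptions and not part of the fix itself: the helper name ld_lines and the file name ld_sketch.pl are made up for illustration, 1000GENOMES:phase_3:CEU is only an example population (fetched with the population adaptor's fetch_by_name), and the checks for undefined values are just a defensive guess at avoiding the "undefined value" crash, not a confirmed cure.

```perl
use strict;
use warnings;

# Hypothetical helper: return one text line per LD pair for a variant
# in a single population. Returns an empty list (with a warning) when
# the variant cannot be fetched, so the caller never dereferences undef.
sub ld_lines {
    my ($variation_adaptor, $ldfc_adaptor, $population, $name) = @_;

    my $variation = $variation_adaptor->fetch_by_name($name);
    unless (defined $variation) {
        warn "No variation found for '$name', skipping\n";
        return;
    }

    my @lines;
    foreach my $vf (@{ $variation->get_all_VariationFeatures() }) {
        my $ldfc = $ldfc_adaptor->fetch_by_VariationFeature($vf, $population);
        next unless defined $ldfc;    # defensive: nothing computed for this feature
        foreach my $ld (@{ $ldfc->get_all_ld_values() }) {
            push @lines, sprintf("%s %s d_prime=%s r2=%s",
                $ld->{variation1}->variation_name,
                $ld->{variation2}->variation_name,
                $ld->{d_prime}, $ld->{r2});
        }
    }
    return @lines;
}

# Usage: perl ld_sketch.pl rs157580 [more rsIDs...]
if (@ARGV) {
    # Loaded at run time so the helper above can be reused without the API.
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );

    my $variation_adaptor  = $registry->get_adaptor('homo_sapiens', 'variation', 'variation');
    my $ldfc_adaptor       = $registry->get_adaptor('homo_sapiens', 'variation', 'ldfeaturecontainer');
    my $population_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'population');

    $variation_adaptor->db->use_vcf(1);                    # 1000G phase 3 genotypes
    $variation_adaptor->db->include_failed_variations(1);  # also return failed variants

    # Assumption: one population only; CEU is just an example choice.
    my $population = $population_adaptor->fetch_by_name('1000GENOMES:phase_3:CEU');
    die "Population not found\n" unless defined $population;

    print "$_\n" for map { ld_lines($variation_adaptor, $ldfc_adaptor, $population, $_) } @ARGV;
}
```

Restricting the run to a single population keeps the number of genotypes handled per call small, and passing the adaptors into the helper means an unknown rsID only produces a warning and does not stop the whole run.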

Regards,
Anja

> On 24 Feb 2016, at 14:57, Johanne Håøy Horn <johannhh at ifi.uio.no> wrote:
> 
> Thank you for your swift reply!
> 
> When I try to run the script you gave me, I get the following printout, ending with the same warnings/errors as I had before:
> 
> $ perl ensemblLD.pl
> 1000GENOMES:phase_3:ACB
> 1000GENOMES:phase_3:ASW
> 1000GENOMES:phase_3:BEB
> 1000GENOMES:phase_3:CDX
> 1000GENOMES:phase_3:CEU
> 1000GENOMES:phase_3:CHB
> 1000GENOMES:phase_3:CHS
> 1000GENOMES:phase_3:CLM
> 1000GENOMES:phase_3:ESN
> 1000GENOMES:phase_3:FIN
> 1000GENOMES:phase_3:GBR
> 1000GENOMES:phase_3:GIH
> 1000GENOMES:phase_3:IBS
> 1000GENOMES:phase_3:ITU
> 1000GENOMES:phase_3:JPT
> 1000GENOMES:phase_3:KHV
> 1000GENOMES:phase_3:LWK
> 1000GENOMES:phase_3:MAG
> 1000GENOMES:phase_3:MSL
> 1000GENOMES:phase_3:MXL
> 1000GENOMES:phase_3:PEL
> 1000GENOMES:phase_3:PJL
> 1000GENOMES:phase_3:PUR
> 1000GENOMES:phase_3:STU
> 1000GENOMES:phase_3:TSI
> 1000GENOMES:phase_3:YRI
> 1000GENOMES:phase_3:ACB
> Use of uninitialized value $gt[1] in hash element at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 716.
> Use of uninitialized value $gt[1] in hash element at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 716.
> Use of uninitialized value $gt[1] in hash element at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 716.
> Use of uninitialized value in string ne at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 617.
> Use of uninitialized value in string ne at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 617.
> Use of uninitialized value in string ne at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 617.
> Can't call method "get_all_VariationFeatures" on an undefined value at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 870, <OUT> line 65174.
> 
> I use version 83 of the API, very recently downloaded and set up. Perhaps the problem is my local installation, not the code… Do you have any idea what I might have done wrong? I have double-checked that I have compiled the calc_genotype file in src/ensembl-variation/C_code, and included the src/ensembl-variation/C_code path in my PERL5LIB variable. Are there any other dependencies specifically related to the LDFeatureContainer that I should check are correctly set up?
> 
> Best,
> Johanne Håøy Horn
> 
>> On 24 Feb 2016, at 14:56, Anja Thormann <anja at ebi.ac.uk> wrote:
>> 
>> Dear Johanne,
>> 
>> I would recommend using the LDFeatureContainerAdaptor. I have written a small script to show you how to use the adaptor. To avoid exceeding the maximum number of genotypes reported in the error message, you should define the population and the variation feature for which you want to compute LD data. At the beginning of the script I print all the populations for which we can compute LD data. We have been working on speeding up our LD computation. The improvements are going into the next release, 84, which will be out in March.
>> 
>> ..
>> my $variation_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'variation' );
>> my $ldfc_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'ldfeaturecontainer');
>> my $population_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'population');
>> $variation_adaptor->db->use_vcf(1); # To get 1000G phase 3 data also
>> 
>> my $ld_populations = $population_adaptor->fetch_all_LD_Populations();
>> foreach my $ld_population (@$ld_populations) {
>>   print $ld_population->name, "\n";
>> }
>> 
>> my $variation_name = 'rs157580';
>> my $variation = $variation_adaptor->fetch_by_name($variation_name);
>> my @vfs = @{ $variation->get_all_VariationFeatures() };
>> 
>> foreach my $vf (@vfs) {
>>   foreach my $ld_population (@$ld_populations) {
>>     print $ld_population->name, "\n";
>>     my $ldfc = $ldfc_adaptor->fetch_by_VariationFeature($vf, $ld_population);
>>     foreach my $ld_hash (@{$ldfc->get_all_ld_values}) {
>>       my $d_prime = $ld_hash->{d_prime};
>>       my $r2 = $ld_hash->{r2};
>>       my $variation_name1 = $ld_hash->{variation1}->variation_name;
>>       my $variation_name2 = $ld_hash->{variation2}->variation_name;
>>       print "$variation_name1 $variation_name2 d_prime=$d_prime r2=$r2\n";
>>     }
>>   }
>> }
>> 
>> Regards,
>> Anja
>> 
>> 
>>> On 24 Feb 2016, at 13:05, Johanne Håøy Horn <johannhh at ifi.uio.no> wrote:
>>> 
>>> Dear Ensembl team,
>>> 
>>> Thank you for previous help with setting up the API!
>>> 
>>> I am now able to use the API properly, using the scripts from the LD tutorial pages on the Ensembl blog and website.
>>> 
>>> What I am now trying to do is expand a list of tag/index SNPs from GWAS to include the SNPs in LD with the input SNPs. The code I have is the following:
>>> 
>>> use strict;
>>> use warnings;
>>> use Bio::EnsEMBL::Registry;
>>> 
>>> my $registry = 'Bio::EnsEMBL::Registry';
>>> 
>>> $registry->load_registry_from_db(
>>>   -host => 'ensembldb.ensembl.org',
>>>   -user => 'anonymous'
>>> );
>>> 
>>> my $variation_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'variation' );
>>> $variation_adaptor->db->use_vcf(1); # To get 1000G phase 3 data also
>>> 
>>> while (<>) {
>>>     chomp; # Remove \n from input file line names
>>>     my $variation = $variation_adaptor->fetch_by_name($_);
>>>     print $variation->stable_id(), "\n";
>>> 
>>>     my @vfs = @{ $variation->get_all_VariationFeatures() };
>>> 
>>>     foreach my $vf (@vfs){
>>>         print "get ld values\n";
>>>         my $ld = $vf->get_all_LD_values();
>>>         print "get ld variations\n";
>>>         my @ldvs = @{ $ld->get_variations() };
>>> 
>>>         print "for each ld variation\n";
>>>         foreach my $ldv (@ldvs) {
>>>             print $ldv->stable_id();
>>>         }
>>>     }   
>>> }
>>> 
>>> When calling $vf->get_all_LD_values I get the following error/warning:
>>> Use of uninitialized value $gt[1] in hash element at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 716.
>>> ...
>>> Use of uninitialized value in string ne at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 617.
>>> ...
>>> Number of genotypes supported by the program (500) exceeded
>>> ...
>>> Can't call method "get_all_VariationFeatures" on an undefined value at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 870, <OUT> line 29120.
>>> 
>>> The "..." means the preceding warning/error is printed multiple times in a row.
>>> 
>>> I have described the issue in further detail here: https://www.biostars.org/p/178467/
>>> 
>>> Could you help me work out what I am doing wrong? Also, is there a better way of finding all SNPs in LD with an input SNP than the one I am trying? Speed seems to be an issue when the number of SNPs gets large.
>>> 
>>> Best
>>> Johanne Håøy Horn
>>> 
>>> 
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
