[ensembl-dev] Issues in LDFeatureContainerAdaptor

Johanne Håøy Horn johannhh at ifi.uio.no
Wed Feb 24 13:05:04 GMT 2016

Dear Ensembl team,

Thank you for your previous help with setting up the API!

I am now able to use the API properly with the scripts from the LD tutorial pages on the Ensembl blog and website.

What I am now trying to do is expand a list of tag/index SNPs from GWAS to include the SNPs in LD with the input SNPs. The code I have is the following:

use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
  -host => 'ensembldb.ensembl.org',
  -user => 'anonymous'
);

my $variation_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'variation' );
$variation_adaptor->db->use_vcf(1); # To get 1000G phase 3 data also

while (<>) {
    chomp; # Remove the trailing newline from each input line
    my $variation = $variation_adaptor->fetch_by_name($_);
    print $variation->stable_id(), "\n";

    my @vfs = @{ $variation->get_all_VariationFeatures() };

    foreach my $vf (@vfs){
        print "get ld values\n";
        my $ld = $vf->get_all_LD_values();
        print "get ld variations\n";
        my @ldvs = @{ $ld->get_variations() };

        print "for each ld variation\n";
        foreach my $ldv (@ldvs) {
            print $ldv->stable_id();
        }
    }
}
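
The loop above hands every input line straight to fetch_by_name, so a blank line, a stray carriage return or a repeated ID becomes a lookup too. A minimal sketch of cleaning the input first is given here. It assumes one variant name per line; clean_variant_names is a hypothetical helper, not part of the Ensembl API, and rs123/rs456 are placeholder names. It is not a fix for the errors reported from LDFeatureContainerAdaptor.pm.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): turn raw input lines
# into a clean, de-duplicated list of variant names.
sub clean_variant_names {
    my @lines = @_;    # work on a copy so the caller's data is untouched
    my (%seen, @names);
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;                  # trim whitespace, including \r and \n
        next if $line eq '' or $line =~ /^#/;     # skip blank lines and comment lines
        next if $seen{$line}++;                   # skip names that were already queued
        push @names, $line;
    }
    return @names;
}

# Placeholder input: a Windows line ending, a blank line, a comment, a duplicate.
my @names = clean_variant_names("rs123\r\n", "", "# comment", "rs456\n", "rs123\n");
print join(',', @names), "\n";    # prints "rs123,rs456"
```

Each cleaned name could then be passed to fetch_by_name, with a defined() check on the result before calling methods on it, since that call should return undef when no variation matches the name.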

When calling $vf->get_all_LD_values I get the following error/warning:
Use of uninitialized value $gt[1] in hash element at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 716.
Use of uninitialized value in string ne at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 617.
Number of genotypes supported by the program (500) exceeded
Can't call method "get_all_VariationFeatures" on an undefined value at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/DBSQL/LDFeatureContainerAdaptor.pm line 870, <OUT> line 29120.

The «…» means the warning/error is printed multiple times in a row.

I have described the issue in further detail here: https://www.biostars.org/p/178467/

Could you help me find out what I am doing wrong? Also, is there a better way of finding all SNPs in LD with an input SNP than the one I am using? Speed seems to be an issue when the number of SNPs gets large.

Johanne Håøy Horn
