[ensembl-dev] Issues with VariationFeature

Johanne Håøy Horn johannhh at ifi.uio.no
Thu Feb 25 09:57:00 GMT 2016


Dear Ensembl team,

When I do the following call: my @vf_pops = @{ $vf->get_all_LD_Populations() }; I get this error:
DBD::mysql::st execute failed: Unknown column 'ip.population_id' in 'field list' at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/VariationFeature.pm line 1429, <> line 1.
DBD::mysql::st execute failed: Unknown column 'ip.population_id' in 'field list' at /Users/Johanne/src/ensembl-variation/modules/Bio/EnsEMBL/Variation/VariationFeature.pm line 1429, <> line 1.

Here’s the full script:
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $start_run = time();

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
  -host => 'ensembldb.ensembl.org',
  -user => 'anonymous'
);
my $variation_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'variation' );
my $ldfc_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'ldfeaturecontainer');
my $population_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'population');
$variation_adaptor->db->use_vcf(1); # To get 1000G phase 3 data also

my $ld_populations = $population_adaptor->fetch_all_LD_Populations();
foreach my $ld_population (@$ld_populations) {
    print $ld_population->name, "\n";
}

my $variation_name = 'rs157580';
my $variation = $variation_adaptor->fetch_by_name($variation_name);
my @vfs = @{ $variation->get_all_VariationFeatures() };

foreach my $vf (@vfs) {

  print $vf->name, "\n";
  my @vf_pops = @{ $vf->get_all_LD_Populations() };
  foreach my $ld_population (@$ld_populations) {
    print $ld_population->name, "\n";
    my $ldfc = $ldfc_adaptor->fetch_by_VariationFeature($vf, $ld_population);
    foreach my $ld_hash (@{$ldfc->get_all_ld_values}) {
      my $d_prime = $ld_hash->{d_prime};
      my $r2 = $ld_hash->{r2};
      my $variation_name1 = $ld_hash->{variation1}->variation_name;
      my $variation_name2 = $ld_hash->{variation2}->variation_name;
      print "$variation_name1 $variation_name2 d_prime=$d_prime r2=$r2\n";
    }
  }
}

my $end_run = time();
my $run_time = $end_run - $start_run;
print "Job took $run_time seconds\n";

If I remove the call to get_all_LD_Populations, the script runs fine again. Do you have any idea what I am doing wrong? Could it be a bug in the code, like the error I reported yesterday?

Also, I have visited a lot of forums where LD calculation is discussed. Many users ask for a database they can query to find LD between SNPs, and for genomic LD tracks, but all such services seem to be available only for HapMap data. Do you know why there is none covering all SNPs and LD produced up to 1000G phase 3? What kind of restrictions are there that make it easier to compute LD on the fly, for instance? Storage space, maybe?
(If databases or tracks of LD actually do exist, I would be happy to know!)
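
For reference, the d_prime and r2 values that the script above prints correspond to the conventional pairwise LD statistics. The block below is a sketch of the standard textbook definitions, not something taken from the Ensembl code. The symbols p_A, p_B and p_AB are notation introduced here for illustration: the frequency of allele A at the first site, of allele B at the second site, and of the AB haplotype.

```latex
\begin{align}
D &= p_{AB} - p_A\,p_B \\
D' &= \frac{|D|}{D_{\max}}, \qquad
D_{\max} =
\begin{cases}
\min\bigl(p_A(1-p_B),\; (1-p_A)\,p_B\bigr) & \text{if } D > 0 \\
\min\bigl(p_A\,p_B,\; (1-p_A)(1-p_B)\bigr) & \text{if } D < 0
\end{cases} \\
r^2 &= \frac{D^2}{p_A(1-p_A)\,p_B(1-p_B)}
\end{align}
```

Both statistics are defined per pair of sites, so the number of values grows with the number of variant pairs considered.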

Best,
Johanne
