[ensembl-dev] Issues with VariationFeature

Johanne Håøy Horn johannhh at ifi.uio.no
Fri Feb 26 13:53:37 GMT 2016


Thank you for the fix and explanation!

I now get a different error message when fetching LDFeatureContainer, however. I have attached the input file of rsIDs I pass to the script. With the script pasted at the end of this email, I get this error:
$ perl expandSNPs.pl ldSNPs.txt
1000GENOMES:phase_3:CEU
after ldfc
after ld values
1000GENOMES:phase_3:CEU
TabixIterator::tabix_iter_free: iter is not of type ti_iter_t at /Users/Johanne/src/ensembl-io/modules/Bio/EnsEMBL/IO/TabixParser.pm line 91, <> line 2.

The error seems to occur if $ldfc_adaptor->fetch_by_VariationFeature($vf, $ld_population); is called more than once, as I get the exact same error message if I use a larger number of SNPs.

__SCRIPT__:
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $start_run = time();
open (OUT, ">expandedSNPS.txt");

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
-host => 'ensembldb.ensembl.org<http://ensembldb.ensembl.org>',
-user => 'anonymous'
);

# Connect to the databases:
my $variation_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'variation');
my $ldfc_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'ldfeaturecontainer');
my $pop_adaptor = $registry->get_adaptor('homo_sapiens', 'variation', 'population');
$variation_adaptor->db->use_vcf(1);

# Print out all available populations:
#my $ld_populations = $pop_adaptor->fetch_all_LD_Populations();
#foreach my $ld_pop (@$ld_populations) {
# print $ld_pop->name, "\n";
#}

my $ld_population = $pop_adaptor->fetch_by_name('1000GENOMES:phase_3:CEU');

# Loop through all SNPs available and find SNPs in LD
while(<>) {
chomp;
my $variation_name = $_;
my $variation = $variation_adaptor->fetch_by_name($variation_name);
my @var_features = @{ $variation->get_all_VariationFeatures() };

foreach my $vf (@var_features) {
my $rsid = $vf->name;
my $start = $vf->start;
my $region = $vf->seq_region_name;
print OUT "$rsid\t$start\t$region\n";
print $ld_population->name, "\n";

my $ldfc = $ldfc_adaptor->fetch_by_VariationFeature($vf, $ld_population);
print "after ldfc\n";
my @ld_values = @{ $ldfc->get_all_ld_values() };
print "after ld values\n";
      foreach my $ld_hash (@ld_values) {
        my $r2 = $ld_hash->{r2};
        my $variation_name1 = $ld_hash->{variation1}->variation_name;
        my $variation_name2 = $ld_hash->{variation2}->variation_name;
        my $pos1 = $ld_hash->{variation1}->seq_region_name;
        my $pos2 = $ld_hash->{variation2}->seq_region_name;
        print OUT "\t$variation_name1 $pos1\t$variation_name2 $pos2\tr2=$r2\n";
      }
}
}

print "done\n";
close OUT;
my $end_run = time();
my $run_time = $end_run - $start_run;
print "Job took $run_time seconds\n»;

Best,
Johanne



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20160226/e7018cf5/attachment.html>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ldSNPs.txt
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20160226/e7018cf5/attachment.txt>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20160226/e7018cf5/attachment.htm>


More information about the Dev mailing list