[ensembl-dev] LDFeatureContainerAdaptor->fetch_by_VariationFeature is returning unrelated results co-localized with query variation feature

andrew126 at mac.com andrew126 at mac.com
Mon Jan 2 22:57:06 GMT 2017


Hi,

I’m using API v86.

When trying to recover variant features in high-LD with a particular variant via LDFeatureContainerAdaptor->fetch_by_VariationFeature(), results are being returned that do not contain the query variation feature .. instead the extra results appear to belong to variant features that live at the same location as the query variation feature.

For example, querying high-LD pairs to rs111781203 is returning the below results ..

High r2 (0.758709) between variations rs58145932-rs7556897
High r2 (0.807347) between variations rs733693-rs7556897
High r2 (0.810721) between variations rs2063021-rs7556897
High r2 (0.806414) between variations rs2396495-rs7556897
High r2 (0.807347) between variations rs12694756-rs7556897
High r2 (1.000000) between variations rs7556897-rs4973341
High r2 (0.814778) between variations rs548837969-rs7556897
High r2 (0.766439) between variations rs11692411-rs7556897
High r2 (0.810721) between variations rs1395338-rs7556897
High r2 (0.807347) between variations rs8179522-rs7556897
Etc.

rs7556897 happens to live at the same location as (but is considered distinct from) rs111781203.

Is this the expected behavior?

Thanks for any guidance.

Best,

Andrew


The script to recreate this is below:

use strict;
$|=1;
use DBI;
use IPC::Run;
use Bio::EnsEMBL::Registry;
use Bio::EnsEMBL::ApiVersion; 
printf( "The API version used is %s\n", software_version() ); 

# API version is same as Ensembl version, which provides version of gene annotations

my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_all();
$registry->set_reconnect_when_lost(1);
$registry->load_registry_from_db(
    -host => 'useastdb.ensembl.org', # alternatively ensembldb 'useastdb.ensembl.org'
    -user => 'anonymous'
    );

$registry->set_reconnect_when_lost(1);
my ($index_var, $chrom, $index_region_start, $index_region_end) = split /\s+/, "rs111781203 2 227795396 227795396";
my $pop_adaptor = $registry->get_adaptor("human","variation","population");
$pop_adaptor->db->use_vcf(1);
my $population = $pop_adaptor->fetch_by_name("1000GENOMES:phase_3:EUR");

my $ldFeatureContainerAdaptor = $registry->get_adaptor('human', 'variation', 'ldfeaturecontainer'); #get adaptor for LDFeatureContainer object
$ldFeatureContainerAdaptor->db->use_vcf(1);
$ldFeatureContainerAdaptor->max_snp_distance(50000);

print "index variant: $index_var $chrom $index_region_start\-$index_region_end\n";

my $variation_adaptor = $registry->get_adaptor( 'human', 'variation', 'variation' );
$variation_adaptor->db->use_vcf(1);
my $query_variation_feature;
my $variation = $variation_adaptor->fetch_by_name("$index_var");
my $vfref = $variation->get_all_VariationFeatures();
foreach my $vf (@{$vfref}) {
    if ($vf->variation_name eq $index_var) {
	if ($chrom eq $vf->seq_region_name) {
	    if ($index_region_start eq $vf->seq_region_start && $index_region_end eq $vf->seq_region_end) {
		$query_variation_feature = $vf;
	    }
	}
    }
}

my $ldFeatureContainer = $ldFeatureContainerAdaptor->fetch_by_VariationFeature($query_variation_feature,$population);
	
foreach my $r_square (@{$ldFeatureContainer->get_all_r_square_values}){
    if ($r_square->{r2} >= 0.6){ 
	print "High r2 (".($r_square->{r2}).") between variations ", $r_square->{variation1}->variation_name, "-", $r_square->{variation2}->variation_name, "\n";
    }
}








More information about the Dev mailing list