[ensembl-dev] Problems with fetch_by_VariationFeatures
andrew126 at mac.com
andrew126 at mac.com
Sat May 6 21:43:05 BST 2017
Hi,
I am using v88 of the API.
I have two variants recovered from REST eQTL data:
rs56391266
http://www.ensembl.org/Homo_sapiens/Variation/Explore?r=11:60376122-60377123;v=rs56391266;vdb=variation;vf=11255007
It has no reported allele frequencies in 1000GENOMES:phase_3 datasets
rs11291652
http://www.ensembl.org/Homo_sapiens/Variation/Explore?db=core;r=11:60360031-60361031;v=rs11291652;vdb=variation;vf=6622904
It is typed in 1000GENOMES:phase_3 datasets
Using script I paste below, I try to calculate r2 between these two variants in 1000GENOMES:phase_3:EUR.
Here is the output of script:
variant 1: rs72643554 60360531 60360531
variant 2: rs72643554 60360531 60360531
r^2 0.180796
Here are my concerns/questions:
*If rs56391266 is not typed in 1000GENOMES:phase_3:EUR (I did not know this before I calculate r2), how can any result be reported?
*Function reports neither of the query variants.
*A single variant is reported twice (rs72643554 .. which lives at same genomic location as rs11291652, but is considered a different variant)
*If function is calculating r2 between variant and itself, why does it not report 1.000?
Please let me know if I am doing anything incorrectly.
Thanks very much.
Best regards,
Andrew
use strict;
$|=1;
use IPC::Run;
use Bio::EnsEMBL::Registry;
use Bio::EnsEMBL::ApiVersion;
printf( "The API version used is %s\n", software_version() );
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_all();
$registry->set_reconnect_when_lost(1);
$registry->load_registry_from_db(
-host => 'useastdb.ensembl.org', # alternatively ensembldb 'useastdb.ensembl.org'
-user => 'anonymous'
);
my $LD_BASE_RADIUS = 100000;
my $chrom = "11";
my $var1 = "rs56391266";
my $region_start1 = "60376622";
my $region_end1 = "60376623";
my $var2 = "rs11291652";
my $region_start2 = "60360531";
my $region_end2 = "60360531";
my $pop_adaptor = $registry->get_adaptor("human","variation","population");
$pop_adaptor->db->use_vcf(1);
my $ldFeatureContainerAdaptor = $registry->get_adaptor('human', 'variation', 'ldfeaturecontainer'); #get adaptor for LDFeatureContainer object
$ldFeatureContainerAdaptor->db->use_vcf(1);
$ldFeatureContainerAdaptor->max_snp_distance($LD_BASE_RADIUS);
my $variation_adaptor = $registry->get_adaptor( 'human', 'variation', 'variation' );
$variation_adaptor->db->use_vcf(1);
my $population_name = "1000GENOMES:phase_3:EUR";
my $population = $pop_adaptor->fetch_by_name($population_name); #get population object from database
my $variation1 = $variation_adaptor->fetch_by_name("$var1");
my $vfref1 = $variation1->get_all_VariationFeatures();
my $query_variation_feature1;
my $found=0;
foreach my $vf1 (@{$vfref1}) {
if ($chrom eq $vf1->seq_region_name) {
if ($region_start1 eq $vf1->seq_region_start && $region_end1 eq $vf1->seq_region_end) {
$found=1;
$query_variation_feature1 = $vf1;
}
}
}
if ($found==0) {die;}
my $variation2 = $variation_adaptor->fetch_by_name("$var2");
my $vfref2 = $variation2->get_all_VariationFeatures();
my $query_variation_feature2;
$found=0;
foreach my $vf2 (@{$vfref2}) {
if ($chrom eq $vf2->seq_region_name) {
if ($region_start2 eq $vf2->seq_region_start && $region_end2 eq $vf2->seq_region_end) {
$found=1;
$query_variation_feature2 = $vf2;
}
}
}
if ($found==0) {die;}
my $ldFeatureContainer = $ldFeatureContainerAdaptor->fetch_by_VariationFeatures([$query_variation_feature1, $query_variation_feature2],$population); #retrieve all LD values in the region
foreach my $r_square (@{$ldFeatureContainer->get_all_r_square_values}){
if ($r_square->{r2} >= 0.1){
print "variant 1: ".$r_square->{variation1}->variation_name." ".$r_square->{variation1}->seq_region_start." ".$r_square->{variation1}->seq_region_end."\n";
print "variant 2: ".$r_square->{variation2}->variation_name." ".$r_square->{variation2}->seq_region_start." ".$r_square->{variation2}->seq_region_end."\n";
print "r^2 ".$r_square->{r2}."\n";
}
}
More information about the Dev
mailing list