[ensembl-dev] Frequencies of SNPS in populations
Duarte Molha
duartemolha at gmail.com
Thu Jan 17 09:54:09 GMT 2019
Dear Developers
I wrote a simple script to report allele frequencies for the different
populations in the database. However, after running it on my set, some
variations show no frequency results.
Take, for example, the indel rs141080692.
When I run it through my script, this is the output I get:
rs141080692 GT 1000GENOMES:pilot_1_CEU_low_coverage_panel - deletion 9 123543905 123543907
rs141080692 - 1000GENOMES:pilot_1_CEU_low_coverage_panel - deletion 9 123543905 123543907
rs141080692 GT 1000GENOMES:pilot_1_CHB+JPT_low_coverage_panel - deletion 9 123543905 123543907
rs141080692 - 1000GENOMES:pilot_1_CHB+JPT_low_coverage_panel - deletion 9 123543905 123543907
rs141080692 GT 1000GENOMES:pilot_1_YRI_low_coverage_panel - deletion 9 123543905 123543907
rs141080692 - 1000GENOMES:pilot_1_YRI_low_coverage_panel - deletion 9 123543905 123543907
rs141080692 GT GMI:AK_Koreans - deletion 9 123543905 123543907
rs141080692 - GMI:AK_Koreans - deletion 9 123543905 123543907
rs141080692 GT GMI:NA10851 - deletion 9 123543905 123543907
rs141080692 - GMI:NA10851 - deletion 9 123543905 123543907
rs141080692 GT SSMP:SSM - deletion 9 123543905 123543907
rs141080692 - SSMP:SSM - deletion 9 123543905 123543907
However, looking at the same database on your website:
http://dec2015.archive.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=9:123543406-123544407;v=rs141080692;vdb=variation;vf=127601209
you can see that frequency information is available for many populations.
How do I go about fetching these?
My script is pretty basic.
First I fetch all populations, or only the ones I am interested in, with:
    foreach my $pop (@{ $population_adaptor->fetch_all() }) {
        my $name = $pop->name();
        if (defined $name) {
            if (defined $population) {
                if ($name =~ /\Q$population/) {
                    print STDERR "Selected Populations: $name \n";
                    push @selected_populations, $name;
                }
            } else {
                print STDERR "Selected Populations: $name \n";
                push @selected_populations, $name;
            }
        }
    }
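The selection step above can be tried on its own, without a database connection. This is only a minimal sketch: the helper name select_population_names and the hard-coded list are assumptions for illustration (the names are copied from the output above), whereas in the real script the names come from $pop->name(). It also shows why the \Q quoting matters for a name such as CHB+JPT, where an unquoted "+" would be read as a regex quantifier.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): return the defined
# names that contain the optional filter string, or all defined names
# when no filter is given.
sub select_population_names {
    my ($names, $filter) = @_;
    return grep {
        defined($_) && (!defined($filter) || /\Q$filter\E/)
    } @$names;
}

# Illustrative input, taken from the script output shown earlier.
my @names = (
    '1000GENOMES:pilot_1_CEU_low_coverage_panel',
    '1000GENOMES:pilot_1_CHB+JPT_low_coverage_panel',
    'GMI:AK_Koreans',
);

# \Q...\E makes the "+" in the filter match literally.
my @selected = select_population_names(\@names, 'CHB+JPT');
print STDERR "Selected Populations: $_\n" for @selected;
```

Calling the helper with an undefined filter returns every defined name, which mirrors the "else" branch of the loop above.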
I then use the variation adaptor to get the variation object:
    my $var = $variation_adaptor->fetch_by_name($id);
Then I cycle through each variation feature with:
    foreach my $vf (@{ $vf_adaptor->fetch_all_by_Variation($var) }) {
        my @alleles = @{ $vf->get_all_Alleles };
        # Loop variable is named $allele because $a is reserved for sort()
        ALLELE_CYCLE: foreach my $allele (@alleles) {
            my $astr     = $allele->allele();
            my $pop      = $allele->population();
            my $pop_name = defined $pop ? $pop->name() : "-";
            # "//" instead of "||" so that a frequency of 0 is not printed as "-"
            my $freq = $allele->frequency() // "-";
            foreach my $p (@selected_populations) {
                #print STDERR $pop_name."\t".$p."\n";
                if ($pop_name eq $p) {
                    # $varClass, $chr, $start and $end are set elsewhere (not shown here)
                    print $out_fh join "\t", ( $var->name(),
                                               $astr,
                                               $pop_name,
                                               $freq,
                                               $varClass,
                                               $chr,
                                               $start,
                                               $end."\n" );
                    next ALLELE_CYCLE;
                }
            }
        }
    }
Am I doing something wrong?
The phase 3 population data, for example, are clearly shown on your site
but do not appear in my output.
Many thanks
Duarte