[ensembl-dev] Problem with populations
Eduardo Andrés León
eduardo.andres at csic.es
Mon Apr 24 18:29:46 BST 2017
I answer myself because I found the solution. I guess that, apart from the "use_vcf(1)” to access to phase 3 variations, I also need the use_vcf(1) in the population adaptor to obtain subpopulation from the phase 3
you can check the code that is now working:
use strict;
$|=1;
use IPC::Run;
use Bio::EnsEMBL::Registry;
use Bio::EnsEMBL::ApiVersion;
printf STDERR ( "The API version used is %s\n", software_version() );
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_all();
$registry->set_reconnect_when_lost(1);
$registry->load_registry_from_db(
-host => 'ensembldb.ensembl.org', # alternatively ensembldb 'useastdb.ensembl.org'
-user => 'anonymous'
);
my $snp = "rs79796976";
my $pop_adaptor = $registry->get_adaptor("human","variation","population");
$pop_adaptor->db->use_vcf(1);
my $variation_adaptor = $registry->get_adaptor( 'human', 'variation', 'variation' );
$variation_adaptor->db->use_vcf(1);
my $variation1 = $variation_adaptor->fetch_by_name($snp);
my $vfref1 = $variation1->get_all_VariationFeatures();
my $query_variation_feature1;
my $found=0;
foreach my $vf (@{$vfref1}) {
if($vf->seq_region_name !~ /^HG/){
print "$snp\t" . $vf->display_consequence;
foreach my $allele (@{$vf->get_all_Alleles}){
next unless (defined $allele->population);
my $allele_string = $allele->allele;
my $frequency = $allele->frequency || 'NA';
my $population_name = $allele->population->name;
my $pop = $allele->population;
if($population_name eq "1000GENOMES:phase_3:IBS"){
print "\t$allele_string\t$frequency ($population_name)";
}elsif($population_name eq "1000GENOMES:phase_3:EUR"){
print "\t$allele_string\t$frequency ($population_name)";
}elsif($population_name eq "1000GENOMES:phase_3:ALL"){
print "\t$allele_string\t$frequency ($population_name)";
}else{
next;
}
}
print "\n";
}
}
--------------------------------------------------------------------------------------------------------------------------------------
> On 23 Apr 2017, at 13:25, Eduardo Andrés León <eduardo.andres at csic.es> wrote:
>
>
>
> Dear all,
> I’m using an old script to obtain snp frequencies in different populations/subpolulations. I think it was made one year ago and obviously it is not working anymore. The think is that i made a few changes but I’m not able to update it correctly
>
> I hope any of you could help me.
>
> As an example, what I want is to obtain the frequency for the All population, EUR population and IBS sub-population (all from 1000 genomes):
> http://www.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=6:43769593-43770593;v=rs1570360;vdb=variation;vf=1121901 <http://www.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=6:43769593-43770593;v=rs1570360;vdb=variation;vf=1121901>
>
>
> My code was:
>
> my $variation = $variation_adaptor->fetch_by_name($snp);
> if($variation){
> foreach my $vf (@{$variation_adaptor_feature->fetch_all_by_Variation($variation)}) {
> if($vf->seq_region_name !~ /^HG/){
> print MAP $vf->seq_region_name() ." $snp 0 " $vf->seq_region_start(),"\n";
> print POB "$snp\t" . $vf->display_consequence;
> foreach my $allele (@{$vf->get_all_Alleles}){
> next unless (defined $allele->population);
> my $allele_string = $allele->allele;
> my $frequency = $allele->frequency || 'NA';
> my $population_name = $allele->population->name;
>
> if($population_name eq "1000GENOMES:phase_1_IBS"){
>
> print POB "\t$allele_string\t$frequency";
> }elsif($population_name eq "1000GENOMES:EUR"){
>
> print POB "\t$allele_string\t$frequency";
> }elsif($population_name eq "1000GENOMES:phase_1_ALL"){
>
> print POB "\t$allele_string\t$frequency”;
>
>
> .....
>
> So from the snp, I obtained the alleles and from the allele object I obtain most of the information using the population_name
>
> Now I have tried a couple of things without success
>
> Example1:
>
> my $pa = $registry->get_adaptor("human","variation","population");
> my $pop = $allele->population;
>
> foreach my $super_pop (@{$pa->fetch_all_by_sub_Population($pop)}) {
> print $pop->name(), " is a super population of ", $super_pop->name(), "\n";
> }
>
> # fetch all sub populations of a population
> foreach my $sub_pop (@{$pa->fetch_all_by_super_Population($pop)}) {
> print $sub_pop->name(), " is a sub population of ", $pop->name(), "\n";
> }
>
> Example 2:
>
> my $pop = $allele->population;
> my $sub_pops = $pop->get_all_sub_Populations();
>
> foreach my $sp (@$sub_pops) {
> print
> 'name: ', $sp->name(),
> 'desc: ', $sp->description(),
> 'size: ', $sp->size(),"\n";
> }
>
> Any help ?
>
>
>
> --------------------------------------------------------------------------------------------------------------------------------------
>
> <Signature.jpg>
>
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20170424/98fd603b/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Signature.jpg
Type: image/jpeg
Size: 37429 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20170424/98fd603b/attachment.jpg>
More information about the Dev
mailing list