[ensembl-dev] Problem with populations

Eduardo Andrés León eduardo.andres at csic.es
Mon Apr 24 18:29:46 BST 2017


I answer myself because I found the solution. I guess that, apart from the "use_vcf(1)” to access to  phase 3 variations, I also need the use_vcf(1) in the population adaptor to obtain subpopulation from the phase 3

you can check the code that is now working:


use strict;
$|=1;
use IPC::Run;
use Bio::EnsEMBL::Registry;
use Bio::EnsEMBL::ApiVersion; 
printf STDERR ( "The API version used is %s\n", software_version() ); 

my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_all();
$registry->set_reconnect_when_lost(1);
$registry->load_registry_from_db(
	-host => 'ensembldb.ensembl.org', # alternatively ensembldb 'useastdb.ensembl.org'
	-user => 'anonymous'
);

my $snp = "rs79796976";

my $pop_adaptor = $registry->get_adaptor("human","variation","population");
$pop_adaptor->db->use_vcf(1);
my $variation_adaptor = $registry->get_adaptor( 'human', 'variation', 'variation' );
$variation_adaptor->db->use_vcf(1);
	
my $variation1 = $variation_adaptor->fetch_by_name($snp);
my $vfref1 = $variation1->get_all_VariationFeatures();
my $query_variation_feature1;
my $found=0;
foreach my $vf (@{$vfref1}) {
	if($vf->seq_region_name !~ /^HG/){

		print "$snp\t" . $vf->display_consequence;
		foreach my $allele (@{$vf->get_all_Alleles}){
			next unless (defined $allele->population);
			my $allele_string   = $allele->allele;
			my $frequency       = $allele->frequency || 'NA';
			my $population_name = $allele->population->name;
			my $pop				= $allele->population;
			
			 if($population_name eq "1000GENOMES:phase_3:IBS"){
				print "\t$allele_string\t$frequency ($population_name)";
			 }elsif($population_name eq "1000GENOMES:phase_3:EUR"){
				print "\t$allele_string\t$frequency ($population_name)";			
			 }elsif($population_name eq "1000GENOMES:phase_3:ALL"){
				print "\t$allele_string\t$frequency ($population_name)";			
			 }else{
				 next;
			 }
		}
		print "\n";
	}
}

--------------------------------------------------------------------------------------------------------------------------------------



> On 23 Apr 2017, at 13:25, Eduardo Andrés León <eduardo.andres at csic.es> wrote:
> 
> 
> 
> Dear all,
> I’m using an old script to obtain snp frequencies in different populations/subpolulations. I think it was made one year ago and obviously it is not working anymore. The think is that i made a few changes but I’m not able to update it correctly
> 
> I hope any of you could help me.
> 
> As an example, what I want is to obtain the frequency for the All population, EUR population and IBS sub-population (all from 1000 genomes):
> http://www.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=6:43769593-43770593;v=rs1570360;vdb=variation;vf=1121901 <http://www.ensembl.org/Homo_sapiens/Variation/Population?db=core;r=6:43769593-43770593;v=rs1570360;vdb=variation;vf=1121901>
> 
> 
> My code was:
> 
> my $variation = $variation_adaptor->fetch_by_name($snp);
> if($variation){
> foreach my $vf (@{$variation_adaptor_feature->fetch_all_by_Variation($variation)}) {
> if($vf->seq_region_name !~ /^HG/){
> print MAP $vf->seq_region_name() ." $snp 0 " $vf->seq_region_start(),"\n";
> print POB "$snp\t" . $vf->display_consequence;
> foreach my $allele (@{$vf->get_all_Alleles}){
> next unless (defined $allele->population);
> my $allele_string   = $allele->allele;
> my $frequency       = $allele->frequency || 'NA';
> my $population_name = $allele->population->name;
> 
>  if($population_name eq "1000GENOMES:phase_1_IBS"){
>  
> print POB "\t$allele_string\t$frequency";
>  }elsif($population_name eq "1000GENOMES:EUR"){
>  
> print POB "\t$allele_string\t$frequency";
>  }elsif($population_name eq "1000GENOMES:phase_1_ALL"){
>  
> print POB "\t$allele_string\t$frequency”;
> 
> 
> .....
> 
> So from the snp, I obtained the alleles and from the allele object I obtain most of the information using the population_name
> 
> Now I have tried a couple of things without success
> 
> Example1:
> 
> my $pa = $registry->get_adaptor("human","variation","population");
> my $pop  = $allele->population;
> 
> foreach my $super_pop (@{$pa->fetch_all_by_sub_Population($pop)}) {
>     print $pop->name(), " is a super population of ", $super_pop->name(), "\n";
>   }
> 
> # fetch all sub populations of a population
>   foreach my $sub_pop (@{$pa->fetch_all_by_super_Population($pop)}) {
>     print $sub_pop->name(), " is a sub population of ", $pop->name(), "\n";
> }
> 
> Example 2:
> 
> my $pop  = $allele->population;
> my $sub_pops = $pop->get_all_sub_Populations();
> 
>         foreach my $sp (@$sub_pops) {
>             print 
>                   'name: ', $sp->name(),
>                   'desc: ', $sp->description(),
>                   'size: ', $sp->size(),"\n";
>   }
> 
> Any help ?
> 
> 
> 
> --------------------------------------------------------------------------------------------------------------------------------------
> 
> <Signature.jpg>
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20170424/98fd603b/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Signature.jpg
Type: image/jpeg
Size: 37429 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20170424/98fd603b/attachment.jpg>


More information about the Dev mailing list