[ensembl-dev] GERP Conservation Score via Compara API

Stephen Fitzgerald stephenf at ebi.ac.uk
Wed Aug 20 10:43:22 BST 2014


Hi Mamun, yes, you will get the conservation scores derived from the 38way 
mammals alignment using your code ( using the string "mammals" as a 
parameter to fetch_by_method_link_type_species_set_name. You can also get 
conservation scores from "primates", "sauropsids", "fish" and "amniotes". 
For details, search for the string "species_set_name" here :
http://www.ensembl.org/info/genome/compara/analyses.html )

See below a script, similar to yours (and the output) for getting the 
conservation scores for 10 consecutive positions in human chr12 starting at position 52696077.

Not sure about Biomart, but you can always check the scores on the web 
site (use "configure this page", click on "Conservation regions" and 
enable the appropriate track eg "Conservation score for 38 eutherian 
mammals EPO_LOW_COVERAGE").

HTH,
Cheers,
Stephen.




########## begin script
use strict;
use warnings;
use Data::Dumper;
use Bio::AlignIO;

use Bio::EnsEMBL::Registry;

#Auto-configure the registry
Bio::EnsEMBL::Registry->load_registry_from_db(
         -host=>"ensembldb.ensembl.org", -user=>"anonymous",
         -port=>'5306', -db_version => 76);


# Get the Compara Adaptor for MethodLinkSpeciesSet
my $method_link_species_set_adaptor =
     Bio::EnsEMBL::Registry->get_adaptor(
       "Multi", "compara", "MethodLinkSpeciesSet");

# Get the method_link_species_set for the conservation scores
my $cs_methodLinkSpeciseSet = $method_link_species_set_adaptor->
    fetch_by_method_link_type_species_set_name("GERP_CONSERVATION_SCORE", 
"mammals");

# Define the start and end positions for the alignment
my $single_base_pos = 52696077;

# Get the human *core* Adaptor for Slices
my $human_slice_adaptor =
     Bio::EnsEMBL::Registry->get_adaptor(
       "human", "core", "Slice");

# Get the Compara Adaptor for ConstrainedElements
my $cs_adaptor =
     Bio::EnsEMBL::Registry->get_adaptor(
       "Multi", "compara", "ConservationScore");

for (my$i=0;$i<10;$i++){
  # Get the slice corresponding to the region of interest
  $single_base_pos += $i;
  my $human_slice = $human_slice_adaptor->fetch_by_region(
      "chromosome", "12", $single_base_pos, $single_base_pos);

  # The fetch_all_by_MethodLinkSpeciesSet_Slice() returns a ref.
  # to an array of Conservation score objects (human is the reference 
species)
  my $conservationScores = $cs_adaptor->
      fetch_all_by_MethodLinkSpeciesSet_Slice(
          $cs_methodLinkSpeciseSet, $human_slice, 1);

  # print the conservation score
  foreach my $cs( @{ $conservationScores }) {
   print join(" ", $single_base_pos,$cs->diff_score), "\n";
  }
}

########## end script

output:
52696077 2.07999992370605
52696078 -2.67000007629395
52696080 0.972000002861023
52696083 -0.272000014781952
52696087 0.825999975204468
52696092 -2.1800000667572
52696098 0.794000029563904
52696105 1.1599999666214
52696113 -4.96000003814697
52696122 2.3199999332428


On Wed, 20 Aug 2014, Rashid Mamunur wrote:

> Hello everyone,I have been trying to extract conservation score for single base pair using compare API and following code block. 
> I was wondering if anyone confirm if MethodLinkSpeciesSet is the right way to extract conservation score. 
> 
> Is there any way to double check these scores via Biomart. ? 
> 
> Thanks,
> Mamun
> ================================================================================================================================
> ==
> 
> my $registry = "Bio::EnsEMBL::Registry";
> $registry->load_registry_from_db(
> -host => "ensembldb.ensembl.org",
> -user => "anonymous",
> -port => 5306 );
> 
> my $mlss_adaptor = $registry->get_adaptor("Multi", "compara", "MethodLinkSpeciesSet");
> my $mlss = $mlss_adaptor->fetch_by_method_link_type_species_set_name("GERP_CONSERVATION_SCORE", "mammals");
> 
> #create slice
> my $slice_adaptor = $registry->get_adaptor($species, 'core', 'Slice');
> my $slice = $slice_adaptor->fetch_by_region('toplevel', $seq_region, $seq_region_start, $seq_region_end);
> 
> #get conservation score adaptor
> my $cs_adaptor = $registry->get_adaptor("Multi", 'compara', 'ConservationScore');
> my $display_size = $slice->end - $slice->start + 1; 
> 
> ## Get score array 
> my $scores = $cs_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($method_linked_species_Set, $slice, $display_size);
> 
> ================================================================================================================================
> ==
> 
>


More information about the Dev mailing list