[ensembl-dev] Fetch GERP scores of certain genes
Stephen Fitzgerald
stephenf at ebi.ac.uk
Wed Aug 6 12:00:08 BST 2014
Hi Eva, we produce GERP scores for mammals, fish, and sauropsids.
I guess you are interested in scores for human (GRCh37).
See the perl script (and output) below as an example for fetching the
scores from the 38 eutherian mammals alignment for the human region
chr12:52696077-52696082. This is from ensembl release 75.
Please note that release 76 is due out Today and will have a new human
reference assembly (GRCh38). The example script should work, but will give
you different results.
If you wish to use GRCh37, you will need to specify the release number in
the registry configuration, ie.
#Auto-configure the registry
Bio::EnsEMBL::Registry->load_registry_from_db(
-host=>"ensembldb.ensembl.org", -user=>"anonymous",
-port=>'5306', -db_version => 75);
Cheers,
Stephen.
##################### script begin
use strict;
use warnings;
use Bio::EnsEMBL::Registry;
#Auto-configure the registry
Bio::EnsEMBL::Registry->load_registry_from_db(
-host=>"ensembldb.ensembl.org", -user=>"anonymous",
-port=>'5306');
# Get the Compara Adaptor for MethodLinkSpeciesSet
my $method_link_species_set_adaptor =
Bio::EnsEMBL::Registry->get_adaptor(
"Multi", "compara", "MethodLinkSpeciesSet");
# Get the method_link_species_set for the conservation scores
my $cs_methodLinkSpeciseSet = $method_link_species_set_adaptor->
fetch_by_method_link_type_species_set_name("GERP_CONSERVATION_SCORE", "mammals");
# Define the start and end positions for the alignment
my ($human_start, $human_end) = (52696077,52696082);
# Get the human *core* Adaptor for Slices
my $human_slice_adaptor =
Bio::EnsEMBL::Registry->get_adaptor(
"human", "core", "Slice");
# Get the slice corresponding to the region of interest
my $human_slice = $human_slice_adaptor->fetch_by_region(
"chromosome", "12", $human_start, $human_end);
# Get the Compara Adaptor for Conservation Score
my $cs_adaptor =
Bio::EnsEMBL::Registry->get_adaptor(
"Multi", "compara", "ConservationScore");
# The fetch_all_by_MethodLinkSpeciesSet_Slice() returns a ref.
# to an array of Conservation score objects (human is the reference species)
my $conservationScores = $cs_adaptor->
fetch_all_by_MethodLinkSpeciesSet_Slice(
$cs_methodLinkSpeciseSet, $human_slice);
# print the conservation scores
foreach my $cs( @{ $conservationScores }) {
print join(" ", $human_start++,$cs->diff_score), "\n";
}
##################### script end
Output:
52696077 -2.5
52696078 3.02999997138977
52696079 4.05999994277954
52696080 0.703000009059906
52696081 -0.223000004887581
52696082 2.83999991416931
On Wed, 6 Aug 2014, Eva Goncalves Serra wrote:
> Hi,
>
> I am interested in getting GERP scores for bases within the coding sequence of certain genes.
> Is there a straightforward way get this directly via the API? I.e., instead of having to map the GERP raw file against coordinates of
> genes myself.
>
> Cheers,
> Eva
>
>
>
More information about the Dev
mailing list