[ensembl-dev] Fetch GERP scores of certain genes

Eva Goncalves Serra egs at sanger.ac.uk
Wed Aug 6 12:07:14 BST 2014


Thanks very much Stephen, very helpful!

Cheers,
Eva

On 06/08/2014 12:00, "Stephen Fitzgerald" <stephenf at ebi.ac.uk> wrote:

>Hi Eva, we produce GERP scores for mammals, fish, and sauropsids.
>I guess you are interested in scores for human (GRCh37).
>
>See the perl script (and output) below as an example for fetching the
>scores from the 38 eutherian mammals alignment for the human region
>chr12:52696077-52696082. This is from ensembl release 75.
>
>Please note that release 76 is due out Today and will have a new human
>reference assembly (GRCh38). The example script should work, but will
>give 
>you different results.
>
>If you wish to use GRCh37, you will need to specify the release number in
>the registry configuration, ie.
>
>#Auto-configure the registry
>Bio::EnsEMBL::Registry->load_registry_from_db(
>         -host=>"ensembldb.ensembl.org", -user=>"anonymous",
>         -port=>'5306', -db_version => 75);
>
>Cheers,
>Stephen.
>
>
>
>##################### script begin
>
>use strict;
>use warnings;
>
>use Bio::EnsEMBL::Registry;
>
>#Auto-configure the registry
>Bio::EnsEMBL::Registry->load_registry_from_db(
>         -host=>"ensembldb.ensembl.org", -user=>"anonymous",
>         -port=>'5306');
>
>
># Get the Compara Adaptor for MethodLinkSpeciesSet
>my $method_link_species_set_adaptor =
>     Bio::EnsEMBL::Registry->get_adaptor(
>       "Multi", "compara", "MethodLinkSpeciesSet");
>
># Get the method_link_species_set for the conservation scores
>my $cs_methodLinkSpeciseSet = $method_link_species_set_adaptor->
>    fetch_by_method_link_type_species_set_name("GERP_CONSERVATION_SCORE",
>"mammals");
>
># Define the start and end positions for the alignment
>my ($human_start, $human_end) = (52696077,52696082);
>
># Get the human *core* Adaptor for Slices
>my $human_slice_adaptor =
>     Bio::EnsEMBL::Registry->get_adaptor(
>       "human", "core", "Slice");
>
># Get the slice corresponding to the region of interest
>my $human_slice = $human_slice_adaptor->fetch_by_region(
>     "chromosome", "12", $human_start, $human_end);
>
># Get the Compara Adaptor for Conservation Score
>my $cs_adaptor =
>     Bio::EnsEMBL::Registry->get_adaptor(
>       "Multi", "compara", "ConservationScore");
>
># The fetch_all_by_MethodLinkSpeciesSet_Slice() returns a ref.
># to an array of Conservation score objects (human is the reference
>species)
>my $conservationScores = $cs_adaptor->
>     fetch_all_by_MethodLinkSpeciesSet_Slice(
>         $cs_methodLinkSpeciseSet, $human_slice);
>
># print the conservation scores
>foreach my $cs( @{ $conservationScores }) {
>  print join(" ", $human_start++,$cs->diff_score), "\n";
>}
>
>##################### script end
>
>Output:
>
>52696077 -2.5
>52696078 3.02999997138977
>52696079 4.05999994277954
>52696080 0.703000009059906
>52696081 -0.223000004887581
>52696082 2.83999991416931
>
>
>
>On Wed, 6 Aug 2014, Eva Goncalves Serra wrote:
>
>> Hi,
>> 
>> I am interested in getting GERP scores for bases within the coding
>>sequence of certain genes.
>> Is there a straightforward way get this directly via the API? I.e.,
>>instead of having to map the GERP raw file against coordinates of
>> genes myself.
>> 
>> Cheers,
>> Eva 
>> 
>> 





More information about the Dev mailing list