[ensembl-dev] How to retrieve "Age of Base" using a Perl API?

Tang, Haiming ningzhithm at gmail.com
Mon Sep 29 17:45:00 BST 2014


Hi, Stephen

Thank you very much for your help.

This solves my problem. So a column 4 value such as Ggor-Hsap-Hsap-Pabe[4]
identifies the common ancestor of the listed species, back to which the
human base has been preserved.
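
For anyone reading this thread later, here is a minimal sketch of splitting one line of the BED file into its parts. It assumes the six-column layout seen in the examples in this thread (chromosome, start, end, name, score, colour). The subroutine name parse_base_age_line and the hash keys are made up for illustration; this is plain Perl and does not use the Ensembl API.

```perl
use strict;
use warnings;

# Parse one line of the age-of-base BED file into a hash reference.
# Assumed layout (taken from the examples in this thread):
#   chrom  start  end  name  score  colour
# where name looks like "Mmul-Panu-Hsap-Ptro[4]".
sub parse_base_age_line {
    my ($line) = @_;
    chomp $line;

    # split ' ' splits on any run of whitespace (tabs or spaces)
    # and ignores leading whitespace.
    my ($chrom, $start, $end, $name, $score, $rgb) = split ' ', $line;

    # Separate the species codes from the bracketed count.
    my ($codes, $count) = $name =~ /^(.+)\[(\d+)\]$/
        or die "Unexpected name field: $name\n";

    return {
        chrom   => $chrom,
        start   => $start,
        end     => $end,
        species => [ split /-/, $codes ],
        count   => $count,
        score   => $score,
        rgb     => $rgb,
    };
}

my $rec = parse_base_age_line(
    "chr1\t1031796\t1031797\tMmul-Panu-Hsap-Ptro[4]\t196\t50,50,255"
);
printf "%s:%s-%s %s (%d species)\n",
    $rec->{chrom}, $rec->{start}, $rec->{end},
    join(',', @{ $rec->{species} }), $rec->{count};
```

The species list and the bracketed count are kept separately so that the two can be compared, for example to confirm that the count matches the number of codes listed.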

May I also know the script you used to get the tree and alignment info as
seen in your email?

I tried:

"my $mlss =
    $mlss_adaptor->fetch_by_method_link_type_species_set_name("EPO", "mammals");

my $slice = $slice_adaptor->fetch_by_region('toplevel', $seq_region,
    $seq_region_start, $seq_region_end);

my $genomic_align_blocks =
    $genomic_align_block_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($mlss, $slice);
"

to fetch the ancestral sequences.

But it doesn't seem to work.
Thanks
Haiming
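
The walk described in the documentation quoted further down (compare the human base with each successively deeper inferred ancestor until one differs) can be sketched in plain Perl as follows. This is only an illustration of the idea, not the code used to build the track: the subroutine name deepest_shared_ancestor and the node structure are hypothetical, and the bases and species sets are copied by hand from Stephen's chr1 example quoted below.

```perl
use strict;
use warnings;

# Walk from the closest ancestor towards the root and return the last
# (deepest) node that still has the same base as the extant species.
# Returns undef if even the closest ancestor differs.
# $path is an array ref ordered from the most recent ancestor to the root;
# each node is { base => 'G', leaves => [...] }. This structure is made up
# for the illustration and is not an Ensembl API object.
sub deepest_shared_ancestor {
    my ($extant_base, $path) = @_;
    my $shared;
    for my $node (@$path) {
        last if uc($node->{base}) ne uc($extant_base);
        $shared = $node;
    }
    return $shared;
}

# Human lineage from the chr1 example, closest ancestor first.
my @human_path = (
    { base => 'G', leaves => [qw(homo_sapiens pan_troglodytes)] },
    { base => 'G', leaves => [qw(homo_sapiens pan_troglodytes
                                 papio_anubis macaca_mulatta)] },
    { base => 'T', leaves => [qw(homo_sapiens pan_troglodytes
                                 papio_anubis macaca_mulatta
                                 mus_musculus rattus_norvegicus)] },
    { base => 'T', leaves => [qw(homo_sapiens pan_troglodytes
                                 papio_anubis macaca_mulatta
                                 mus_musculus rattus_norvegicus
                                 sus_scrofa bos_taurus canis_familiaris)] },
);

my $node = deepest_shared_ancestor('G', \@human_path);
if ($node) {
    printf "Base shared back to the ancestor of %d species: %s\n",
        scalar @{ $node->{leaves} }, join(', ', @{ $node->{leaves} });
}
else {
    print "Base differs from the closest ancestor\n";
}
```

With the example data the walk stops at the ancestor of the four primates, which is consistent with the Mmul-Panu-Hsap-Ptro[4] entry for that position.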

On Mon, Sep 29, 2014 at 8:42 AM, Stephen Fitzgerald <stephenf at ebi.ac.uk>
wrote:

> Hi Haiming, column 4 lists the set of species whose ancestor had the same
> base as human (we use a program called Ortheus to infer the sequence of the
> ancestral nodes in the tree connecting all the extant species).
>
> For example:
>
> chr1    1031796 1031797 Mmul-Panu-Hsap-Ptro[4]  196     50,50,255
>
> The ancestral sequence of the primates present in the alignment at this
> position in human (marked with a "*") is the deepest ancestor to share a
> G base with human (this is the most recent common ancestor of the 4
> primates in the alignment). The next deepest ancestor (between rodents and
> primates, marked with a "**") is predicted to have a T at this position.
> So, somewhere between these two ancestors the base changed T->G. Hence,
> this position would be marked as primate specific.
>
>
> Human                chromosome:GRCh38:1:1031796:1031797:1
> Ancestral sequences  (homo_sapiens,pan_troglodytes);
> Chimpanzee           chromosome:CHIMP2.1.4:7:159477370:159477371:1
> Ancestral sequences  ((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)); *
> Macaque              chromosome:MMUL_1:1:4106934:4106935:1
> Ancestral sequences  (papio_anubis,macaca_mulatta);
> Olive baboon         scaffold:PapAnu2.0:JH684932.1:192067:192068:1
> Ancestral sequences  (((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)),(mus_musculus,rattus_norvegicus)); **
> Mouse                chromosome:GRCm38:4:156188534:156188535:-1
> Ancestral sequences  (mus_musculus,rattus_norvegicus);
> Rat                  chromosome:Rnor_5.0:5:177087882:177087883:-1
> Ancestral sequences  ((((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)),(mus_musculus,rattus_norvegicus)),((sus_scrofa,bos_taurus),canis_familiaris));
> Cow                  chromosome:UMD3.1:16:52694475:52694476:-1
> Ancestral sequences  (sus_scrofa,bos_taurus);
> Pig                  chromosome:Sscrofa10.2:6:57872690:57872691:-1
> Ancestral sequences  ((sus_scrofa,bos_taurus),canis_familiaris);
> Dog                  chromosome:CanFam3.1:5:56250642:56250643:1
>
>
> Human                G
> Ancestral sequences  G
> Chimpanzee           G
> Ancestral sequences  G *
> Macaque              G
> Ancestral sequences  G
> Olive baboon         G
> Ancestral sequences  T **
> Mouse                C
> Ancestral sequences  C
> Rat                  C
> Ancestral sequences  T
> Cow                  T
> Ancestral sequences  T
> Pig                  G
> Ancestral sequences  T
> Dog                  T
>
>
> We don't store speciation times for the age of base track. Information
> regarding speciation times can be obtained from sites such as Time Tree (
> http://www.timetree.org/).
>
> HTH,
> Stephen.
>
> On Fri, 26 Sep 2014, Tang, Haiming wrote:
>
>> Hi Stephen,
>> I followed your instructions and got the bed file.
>>
>> Column 4 appears to list the species for which that base is the same as
>> in human, since it looks like Hsap is in every line.
>> The number in square brackets [] is just the number of species listed.
>>
>> But the file doesn’t seem to give the age of the base.
>>
>> For example, how should I interpret Ggor-Hsap-Hsap-Pabe[4] in
>>
>> "chrY 57107125 57107126 Ggor-Hsap-Hsap-Pabe[4] 120 30,30,255"?
>>
>> Are Ggor and Hsap ancestral species?
>>
>> Or is the age of base stored somewhere else?
>>
>> Thanks
>>
>> Haiming
>>
>> On Fri, Sep 26, 2014 at 2:47 AM, Stephen Fitzgerald <stephenf at ebi.ac.uk>
>> wrote:
>>       Hi Haiming,
>>       the compara API is used to retrieve information from the compara
>>       database. However, the "Age of Base" track is generated from a
>>       Bigbed binary file, so it is not part of the compara database. The
>>       Bigbed file is generated from a Bed file. I have transferred this
>>       Bed file (from release 76) to our ftp site. You can retrieve this
>>       file using anonymous ftp from here:
>>
>>       ftp ftp.ebi.ac.uk
>>
>>       cd pub/software/ensembl/stephen/BaseAge/
>>
>>       get base_age_76.bed.gz
>>
>>       Hope this helps,
>>       Stephen.
>>
>>
>>       On Thu, 25 Sep 2014, Tang, Haiming wrote:
>>
>>
>>             Dear group, my name is Haiming Tang. I'm in Dr Paul Thomas's
>>             group at the University of Southern California.
>>
>>             I'm trying to retrieve "Age of Base" using the Perl API.
>>
>>             As described in
>>             "http://www.ensembl.org/info/genome/compara/analyses.html#age_of_base":
>>
>>             "Age of Base
>>
>>             From these ancestral sequences, we infer the age of a base,
>>             i.e. the timing of the most recent mutation for each base of
>>             the genome. Each position of the human genome is compared to
>>             its immediate inferred ancestor, then its ancestor, etc. until
>>             a difference is found. The inferred substitution event
>>             therefore occurred on a specific branch of the tree, which is
>>             identified by all the extant species which eventually
>>             descended from that branch, as illustrated below."
>>
>>             "Age of base" is closely related to the EPO ancestral
>>             alignment, but I could not find any related method in the
>>             Compara Perl API Documentation or the Compara API Tutorial.
>>
>>             Can anyone show me how to retrieve "age of base"?
>>
>>             Thank you in advance.
>>
>>             Haiming
>>
>>
>>
>>
>>       _______________________________________________
>>       Dev mailing list    Dev at ensembl.org
>>       Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>>       Ensembl Blog: http://www.ensembl.info/
>>
>>
>>
>>