[ensembl-dev] How to retrieve "Age of Base" using a Perl API?
Tang, Haiming
ningzhithm at gmail.com
Mon Sep 29 18:12:55 BST 2014
Thank you very much Matthieu.
Very helpful.
Thanks
Haiming
On Mon, Sep 29, 2014 at 10:02 AM, Matthieu Muffato <muffato at ebi.ac.uk>
wrote:
> Hi Haiming,
>
> You need to access the multiple alignments through the GenomicAlignTree
> objects. They put together the history of the extant regions and their
> ancestral sequences.
>
> You can have a look at the code we're using to generate the AgeOfBase track:
> https://github.com/Ensembl/ensembl-compara/blob/release/77/modules/Bio/EnsEMBL/Compara/RunnableDB/BaseAge/BaseAge.pm#L121
> especially those lines:
> - L140: get all the GenomicAlignTree objects
> - L160: iterate over list of trees
> - L190: iterate over the internal nodes of a given tree
> - L195+203: get the ancestral sequence of this node
>
> Hope this helps,
> Matthieu
>
> On 29/09/14 17:45, Tang, Haiming wrote:
>
>> Hi, Stephen
>>
>> Thank you very much for your help.
>>
>> This solves my problem. So column 4, like Ggor-Hsap-Hsap-Pabe[4], stands
>> for the ancestor of the listed species back to which the base has been
>> preserved.
>>
>> May I also know the script you used to get the tree and alignment info
>> as seen in your email?
>>
>> I tried:
>>
>>   my $mlss = $mlss_adaptor->fetch_by_method_link_type_species_set_name(
>>       "EPO", "mammals");
>>
>>   my $slice = $slice_adaptor->fetch_by_region('toplevel', $seq_region,
>>       $seq_region_start, $seq_region_end);
>>
>>   my $genomic_align_blocks =
>>       $genomic_align_block_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice(
>>           $mlss, $slice);
>>
>> to fetch the ancestral sequences, but it doesn't seem to work.
>>
>> Thanks
>> Haiming
>>
>> On Mon, Sep 29, 2014 at 8:42 AM, Stephen Fitzgerald
>> <stephenf at ebi.ac.uk> wrote:
>>
>> Hi Haiming, column 4 lists the set of species whose ancestor had the
>> same base as human (we use a program called Ortheus to infer the
>> sequence of the ancestral nodes in the tree connecting all the
>> extant species).
>>
>> For example:
>>
>> chr1 1031796 1031797 Mmul-Panu-Hsap-Ptro[4] 196 50,50,255
>>
>> The ancestral sequence at the root of the 4 primates in the alignment
>> (marked with a "*") is the deepest ancestor that still shares a G base
>> with human at this position. The next deepest ancestor (between
>> rodents and primates, marked with a "**") is predicted to have a T
>> at this position. So, somewhere between these two ancestors the base
>> changed T->G. Hence, this position would be marked as primate
>> specific.
>>
>>
>> Human                chromosome:GRCh38:1:1031796:1031797:1
>> Ancestral sequences  (homo_sapiens,pan_troglodytes);
>> Chimpanzee           chromosome:CHIMP2.1.4:7:159477370:159477371:1
>> Ancestral sequences  ((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)); *
>> Macaque              chromosome:MMUL_1:1:4106934:4106935:1
>> Ancestral sequences  (papio_anubis,macaca_mulatta);
>> Olive baboon         scaffold:PapAnu2.0:JH684932.1:192067:192068:1
>> Ancestral sequences  (((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)),(mus_musculus,rattus_norvegicus)); **
>> Mouse                chromosome:GRCm38:4:156188534:156188535:-1
>> Ancestral sequences  (mus_musculus,rattus_norvegicus);
>> Rat                  chromosome:Rnor_5.0:5:177087882:177087883:-1
>> Ancestral sequences  ((((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)),(mus_musculus,rattus_norvegicus)),((sus_scrofa,bos_taurus),canis_familiaris));
>> Cow                  chromosome:UMD3.1:16:52694475:52694476:-1
>> Ancestral sequences  (sus_scrofa,bos_taurus);
>> Pig                  chromosome:Sscrofa10.2:6:57872690:57872691:-1
>> Ancestral sequences  ((sus_scrofa,bos_taurus),canis_familiaris);
>> Dog                  chromosome:CanFam3.1:5:56250642:56250643:1
>>
>>
>> Human                G
>> Ancestral sequences  G
>> Chimpanzee           G
>> Ancestral sequences  G *
>> Macaque              G
>> Ancestral sequences  G
>> Olive baboon         G
>> Ancestral sequences  T **
>> Mouse                C
>> Ancestral sequences  C
>> Rat                  C
>> Ancestral sequences  T
>> Cow                  T
>> Ancestral sequences  T
>> Pig                  G
>> Ancestral sequences  T
>> Dog                  T
>>
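To tie the listing above to the BED label, here is a minimal sketch in plain Python (it does not use the Ensembl Perl API). It pulls the species names out of the Newick-style string of the ancestor marked "*" and abbreviates them. The abbreviation rule (genus initial plus the first three letters of the species name) is an assumption inferred from codes such as Hsap and Ptro seen in this thread, and the function names are hypothetical. The order of the codes in the label differs from the Newick order, so the comparison ignores order.

```python
import re


def clade_species(newick):
    """Return the binomial species names found in a Newick-style clade string."""
    return re.findall(r"[a-z]+_[a-z]+", newick)


def abbreviate(species):
    """homo_sapiens -> Hsap (assumed rule, inferred from the labels in the thread)."""
    genus, epithet = species.split("_", 1)
    return genus[0].upper() + epithet[:3]


# Clade string of the ancestor marked "*" in the listing above.
primate_root = "((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta));"
codes = [abbreviate(name) for name in clade_species(primate_root)]

# Column 4 of the example BED line, with the "[4]" count stripped off.
bed_name = "Mmul-Panu-Hsap-Ptro[4]"
bed_codes = bed_name.split("[")[0].split("-")

print(codes)                                # ['Hsap', 'Ptro', 'Panu', 'Mmul']
print(sorted(codes) == sorted(bed_codes))   # True
```

The regular expression only handles two-part species names, which is enough for the species quoted here.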
>>
>> We don't store speciation times for the age of base track.
>> Information regarding speciation times can be obtained from sites
>> such as Time Tree (http://www.timetree.org/).
>>
>> HTH,
>> Stephen.
>>
>> On Fri, 26 Sep 2014, Tang, Haiming wrote:
>>
>> Hi Stephen,
>> I followed your instructions and got the bed file.
>>
>> Column 4 appears to list the species for which that base is the
>> same as in human, since it looks like Hsap is in every line.
>> The number in square brackets [] is just the number of species
>> listed.
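As a rough illustration of the column layout described here, the following Python sketch splits one line of the BED file into fields and separates the species codes from the bracketed count. The record and function names are hypothetical, and reading the last two columns as a score and an RGB colour is an assumption based on how they look, not something stated in this thread.

```python
from typing import NamedTuple


class BaseAgeRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    species: list   # species codes from column 4, in file order
    count: int      # the number in square brackets
    score: int      # assumed meaning of column 5
    rgb: tuple      # assumed meaning of column 6


def parse_base_age_line(line):
    """Parse one whitespace-separated line of the base-age BED file."""
    chrom, start, end, name, score, rgb = line.split()
    codes, _, rest = name.partition("[")
    return BaseAgeRecord(
        chrom, int(start), int(end),
        codes.split("-"), int(rest.rstrip("]")),
        int(score), tuple(int(c) for c in rgb.split(",")),
    )


rec = parse_base_age_line(
    "chrY 57107125 57107126 Ggor-Hsap-Hsap-Pabe[4] 120 30,30,255")
print(rec.species, rec.count)   # ['Ggor', 'Hsap', 'Hsap', 'Pabe'] 4
```

Note that the count matches the number of codes listed, including the repeated Hsap.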
>>
>> But the file doesn’t seem to give the age of the base.
>>
>> For example, how should I interpret Ggor-Hsap-Hsap-Pabe[4] in
>>
>> "chrY 57107125 57107126 Ggor-Hsap-Hsap-Pabe[4] 120 30,30,255"?
>>
>> Are Ggor and Hsap ancestral species?
>>
>> Or is the age of base stored somewhere else?
>>
>> Thanks
>>
>> Haiming
>>
>> On Fri, Sep 26, 2014 at 2:47 AM, Stephen Fitzgerald
>> <stephenf at ebi.ac.uk> wrote:
>> Hi Haiming,
>> the Compara API is used to retrieve information from the
>> Compara database. However, the "Age of Base" track is
>> generated from a BigBed binary file, so it is not part of
>> the Compara database. The BigBed file is generated from a
>> BED file. I have transferred this BED file (from release
>> 76) to our ftp site. You can retrieve this file using
>> anonymous ftp from here:
>>
>> ftp ftp.ebi.ac.uk
>>
>> cd pub/software/ensembl/stephen/BaseAge/
>>
>> get base_age_76.bed.gz
>>
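Once the gzipped BED file has been downloaded, it can be read without unpacking it first. The Python sketch below counts how often each column-4 label occurs. The function name is hypothetical, and the demo writes a small temporary file containing the two example lines from this thread rather than touching the real base_age_76.bed.gz.

```python
import gzip
import os
import tempfile
from collections import Counter


def count_labels(path):
    """Count how often each column-4 label occurs in a gzipped BED file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 4:        # skip blank or truncated lines
                counts[fields[3]] += 1
    return counts


# Demo on a tiny stand-in file built from lines quoted in this thread.
demo_lines = [
    "chr1 1031796 1031797 Mmul-Panu-Hsap-Ptro[4] 196 50,50,255",
    "chrY 57107125 57107126 Ggor-Hsap-Hsap-Pabe[4] 120 30,30,255",
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bed.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("\n".join(demo_lines) + "\n")
    counts = count_labels(demo_path)

print(counts.most_common())
```

For the real file, call count_labels with the path to the downloaded base_age_76.bed.gz.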
>> Hope this helps,
>> Stephen.
>>
>>
>> On Thu, 25 Sep 2014, Tang, Haiming wrote:
>>
>>
>> Dear group, my name is Haiming Tang. I'm in Dr Paul
>> Thomas's group at the University of Southern
>> California.
>>
>> I'm trying to retrieve "Age of Base" using the Perl API.
>>
>> As described in
>> http://www.ensembl.org/info/genome/compara/analyses.html#age_of_base
>>
>> "Age of Base
>>
>> From these ancestral sequences, we infer the age of a base, i.e. the
>> timing of the most recent mutation for each base of the genome. Each
>> position of the human genome is compared to its immediate inferred
>> ancestor, then its ancestor, etc. until a difference is found. The
>> inferred substitution event therefore occurred on a specific branch
>> of the tree, which is identified by all the extant species which
>> eventually descended from that branch, as illustrated below."
>>
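The procedure quoted above can be sketched in a few lines of plain Python. This is only an illustration of the walk up the tree, not the code Ensembl runs: the function name is hypothetical, the lineage is written out by hand from the chr1:1031796 example quoted earlier in this thread, and gaps or missing ancestral sequences are ignored. Returning the human-only clade when the immediate ancestor already differs is my reading of the description.

```python
def age_of_base(human_base, lineage, start_clade=("homo_sapiens",)):
    """Walk from the immediate ancestor towards the root.

    lineage is a list of (clade, base) pairs, nearest ancestor first.
    Returns the clade of the deepest ancestor reached before the first
    base that differs from human_base.
    """
    clade = start_clade
    for ancestor_clade, ancestor_base in lineage:
        if ancestor_base != human_base:
            break                      # substitution happened on this branch
        clade = ancestor_clade
    return clade


primates = ("homo_sapiens", "pan_troglodytes", "papio_anubis", "macaca_mulatta")
rodents = ("mus_musculus", "rattus_norvegicus")
others = ("sus_scrofa", "bos_taurus", "canis_familiaris")

# Ancestral bases on the human lineage, as quoted in the thread.
lineage = [
    (("homo_sapiens", "pan_troglodytes"), "G"),
    (primates, "G"),                       # marked "*"
    (primates + rodents, "T"),             # marked "**"
    (primates + rodents + others, "T"),
]

print(age_of_base("G", lineage))
# ('homo_sapiens', 'pan_troglodytes', 'papio_anubis', 'macaca_mulatta')
```

The four species returned are the ones named in the label Mmul-Panu-Hsap-Ptro[4] for this position.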
>> "Age of base" is closely related to the EPO ancestral
>> alignment, but I could not find any related method in
>> the Compara Perl API Documentation or the Compara API Tutorial.
>>
>> Can anyone show me how to retrieve "age of base"?
>>
>> Thank you in advance.
>>
>> Haiming
>>
>>
>>
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>
> --
> Matthieu Muffato, Ph.D.
> Ensembl Compara Project Leader
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus, Hinxton
> Cambridge, CB10 1SD, United Kingdom
> Room A3-145
> Phone + 44 (0) 1223 49 4631
> Fax + 44 (0) 1223 49 4468