[ensembl-dev] How to retrieve "Age of Base" using a Perl API?
Tang, Haiming
ningzhithm at gmail.com
Mon Sep 29 17:45:00 BST 2014
Hi, Stephen
Thank you very much for your help.
This solves my problem. So a column 4 value such as Ggor-Hsap-Hsap-Pabe[4]
identifies the ancestor of the listed species back to which the human base
has been preserved.
May I also know the script you used to get the tree and alignment info as
seen in your email?
I tried:

  my $mlss = $mlss_adaptor->fetch_by_method_link_type_species_set_name(
      "EPO", "mammals");
  my $slice = $slice_adaptor->fetch_by_region(
      'toplevel', $seq_region, $seq_region_start, $seq_region_end);
  my $genomic_align_blocks =
      $genomic_align_block_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice(
          $mlss, $slice);

to fetch the ancestral sequences.
But it doesn't seem to work.
Thanks
Haiming
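
For anyone scripting against the BED file, the column 4 format discussed in this thread can be split apart with a few lines of code. What follows is a minimal Python sketch, not part of the Ensembl API: `BaseAgeRecord` and `parse_base_age_line` are hypothetical names, and it assumes every record has the six whitespace-separated fields seen in the two example lines quoted in this thread.

```python
import re
from typing import NamedTuple, Tuple


class BaseAgeRecord(NamedTuple):
    """One line of the age-of-base BED file (hypothetical container)."""
    chrom: str
    start: int                 # coordinates are kept exactly as written in the file
    end: int
    species: Tuple[str, ...]   # species codes from column 4, e.g. ("Mmul", "Panu", ...)
    count: int                 # the number given in square brackets
    score: int
    rgb: Tuple[int, ...]


def parse_base_age_line(line: str) -> BaseAgeRecord:
    """Parse one record, assuming the six-column layout shown in the thread."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    chrom, start, end, name, score, rgb = fields

    # Column 4 looks like "Mmul-Panu-Hsap-Ptro[4]": codes joined by "-",
    # followed by a count in square brackets.
    match = re.fullmatch(r"(.+)\[(\d+)\]", name)
    if match is None:
        raise ValueError(f"unexpected name field: {name!r}")

    return BaseAgeRecord(
        chrom=chrom,
        start=int(start),
        end=int(end),
        species=tuple(match.group(1).split("-")),
        count=int(match.group(2)),
        score=int(score),
        rgb=tuple(int(value) for value in rgb.split(",")),
    )


record = parse_base_age_line(
    "chr1 1031796 1031797 Mmul-Panu-Hsap-Ptro[4] 196 50,50,255"
)
```

The count is stored separately rather than checked against the number of codes, since the thread only shows two records and the rule may not hold for every line.
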
On Mon, Sep 29, 2014 at 8:42 AM, Stephen Fitzgerald <stephenf at ebi.ac.uk>
wrote:
> Hi Haiming, column 4 lists the set of species whose ancestor had the same
> base as human (we use a program called Ortheus to infer the sequence of the
> ancestral nodes in the tree connecting all the extant species).
>
> For example:
>
> chr1 1031796 1031797 Mmul-Panu-Hsap-Ptro[4] 196 50,50,255
>
> The ancestral sequence of the primates present in the alignment at this
> position in human (marked with a "*") is the most recent common ancestor to
> share a G base with human (this is at the root of the 4 primates in the
> alignment). The next deepest ancestor (between rodents and primates, marked
> with a "**") is predicted to have a T at this position. So, somewhere
> between these two ancestors the base changed T->G. Hence, this position
> would be marked as primate specific.
>
>
> Human                chromosome:GRCh38:1:1031796:1031797:1
> Ancestral sequences  (homo_sapiens,pan_troglodytes);
> Chimpanzee           chromosome:CHIMP2.1.4:7:159477370:159477371:1
> Ancestral sequences  ((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)); *
> Macaque              chromosome:MMUL_1:1:4106934:4106935:1
> Ancestral sequences  (papio_anubis,macaca_mulatta);
> Olive baboon         scaffold:PapAnu2.0:JH684932.1:192067:192068:1
> Ancestral sequences  (((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)),(mus_musculus,rattus_norvegicus)); **
> Mouse                chromosome:GRCm38:4:156188534:156188535:-1
> Ancestral sequences  (mus_musculus,rattus_norvegicus);
> Rat                  chromosome:Rnor_5.0:5:177087882:177087883:-1
> Ancestral sequences  ((((homo_sapiens,pan_troglodytes),(papio_anubis,macaca_mulatta)),(mus_musculus,rattus_norvegicus)),((sus_scrofa,bos_taurus),canis_familiaris));
> Cow                  chromosome:UMD3.1:16:52694475:52694476:-1
> Ancestral sequences  (sus_scrofa,bos_taurus);
> Pig                  chromosome:Sscrofa10.2:6:57872690:57872691:-1
> Ancestral sequences  ((sus_scrofa,bos_taurus),canis_familiaris);
> Dog                  chromosome:CanFam3.1:5:56250642:56250643:1
>
>
> Human G
> Ancestral sequences G
> Chimpanzee G
> Ancestral sequences G *
> Macaque G
> Ancestral sequences G
> Olive baboon G
> Ancestral sequences T **
> Mouse C
> Ancestral sequences C
> Rat C
> Ancestral sequences T
> Cow T
> Ancestral sequences T
> Pig G
> Ancestral sequences T
> Dog T
>
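
The comparison Stephen describes above, walking from human towards the root until an inferred ancestor disagrees, can be sketched in a few lines of Python. This is an illustration only, not the code used to build the track: `HUMAN_LINEAGE` and `oldest_matching_ancestor` are hypothetical names, and the bases are transcribed by hand from the alignment columns quoted above.

```python
PRIMATES = ("homo_sapiens", "pan_troglodytes", "papio_anubis", "macaca_mulatta")
RODENTS = ("mus_musculus", "rattus_norvegicus")
LAURASIATHERIA = ("sus_scrofa", "bos_taurus", "canis_familiaris")

# Inferred ancestors on the path from human to the root, nearest first.
# Each entry is (extant species descending from that ancestor, inferred base).
HUMAN_LINEAGE = [
    (("homo_sapiens", "pan_troglodytes"), "G"),
    (PRIMATES, "G"),                             # marked "*" in the email
    (PRIMATES + RODENTS, "T"),                   # marked "**" in the email
    (PRIMATES + RODENTS + LAURASIATHERIA, "T"),  # root of the example tree
]


def oldest_matching_ancestor(base, lineage):
    """Return the species set of the deepest ancestor that still has `base`.

    Walks the lineage from the nearest ancestor outwards and stops at the
    first ancestor whose inferred base differs. Returns None when even the
    nearest ancestor differs.
    """
    last_match = None
    for species, ancestral_base in lineage:
        if ancestral_base != base:
            break
        last_match = species
    return last_match


shared_with = oldest_matching_ancestor("G", HUMAN_LINEAGE)
```

For the human G at this position, `shared_with` is the set of four primates, which lines up with the `Mmul-Panu-Hsap-Ptro[4]` label in the BED line.
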
>
> We don't store speciation times for the age of base track. Information
> regarding speciation times can be obtained from sites such as Time Tree (
> http://www.timetree.org/).
>
> HTH,
> Stephen.
>
> On Fri, 26 Sep 2014, Tang, Haiming wrote:
>
>> Hi, Stephen
>> I followed your instructions and got the bed file.
>>
>> Column 4 appears to list the species for which that base is the same as
>> in human, since it looks like Hsap is in every line.
>> The number in square brackets [] is just the number of species listed.
>>
>> But the file doesn’t seem to give the age of the base.
>>
>> For example: How to interpret Ggor-Hsap-Hsap-Pabe[4] in
>>
>> "chrY 57107125 57107126 Ggor-Hsap-Hsap-Pabe[4] 120 30,30,255"?
>>
>> Are Ggor and Hsap ancestral species?
>>
>> Or is the age of base stored somewhere else?
>>
>> Thanks
>>
>> Haiming
>>
>> On Fri, Sep 26, 2014 at 2:47 AM, Stephen Fitzgerald <stephenf at ebi.ac.uk>
>> wrote:
>> Hi Haiming,
>> the compara API is used to retrieve information from the compara
>> database. However the "Age of Base" track is
>> generated from a Bigbed binary file, so it is not part of the
>> compara database. The Bigbed file is generated from a
>> Bed file. I have transferred this Bed file (from release 76) to our
>> ftp site. You can retrieve this file using
>> anonymous ftp from here:
>>
>> ftp ftp.ebi.ac.uk
>>
>> cd pub/software/ensembl/stephen/BaseAge/
>>
>> get base_age_76.bed.gz
>>
>> Hope this helps,
>> Stephen.
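
Once `base_age_76.bed.gz` has been downloaded, the records for a region of interest can be read straight from the compressed file. The Python sketch below is one way to do that under stated assumptions: `iter_base_age` is a hypothetical helper, the first four columns are taken to be chromosome, start, end and name as in the example lines in this thread, and both the query and the records are treated as half-open intervals (the usual BED convention, not confirmed for this particular file).

```python
import gzip


def iter_base_age(path, chrom, start, end):
    """Yield (chrom, start, end, name) for records overlapping [start, end).

    Scans the gzipped BED file line by line, so memory use stays small even
    for a genome-wide file. Lines that do not look like records (for example
    track or browser header lines) are skipped.
    """
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 4 or fields[0] != chrom:
                continue
            try:
                rec_start, rec_end = int(fields[1]), int(fields[2])
            except ValueError:
                continue  # not a coordinate line
            # Half-open interval overlap test.
            if rec_start < end and rec_end > start:
                yield fields[0], rec_start, rec_end, fields[3]
```

A call such as `list(iter_base_age("base_age_76.bed.gz", "chr1", 1031796, 1031797))` returns whatever records the file holds for that interval. A linear scan is slow for repeated queries; it is used here only because it needs nothing beyond the standard library.
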
>>
>>
>> On Thu, 25 Sep 2014, Tang, Haiming wrote:
>>
>>
>> Dear group, my name is Haiming Tang. I'm in Dr Paul Thomas's group at the
>> University of Southern California.
>>
>> I'm trying to retrieve "Age of Base" using Perl API.
>>
>> As described in
>> "http://www.ensembl.org/info/genome/compara/analyses.html#age_of_base"
>>
>> "Age of Base
>>
>> From these ancestral sequences, we infer the age of a base, i.e. the
>> timing of the most recent mutation for each base of the genome. Each
>> position of the human genome is compared to its immediate inferred
>> ancestor, then its ancestor, etc. until a difference is found. The
>> inferred substitution event therefore occurred on a specific branch of
>> the tree, which is identified by all the extant species which eventually
>> descended from that branch, as illustrated below."
>>
>> "Age of base" is closely related to the EPO ancestral alignments, but I
>> could not find any related method in the Compara Perl API documentation
>> or the Compara API tutorial.
>>
>> Can anyone show me how to retrieve "age of base"?
>>
>> Thank you in advance.
>>
>> Haiming
>>
>>
>>
>>
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info:
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>
>>
>>
>>