[ensembl-dev] a question about dN and dS

Matthieu Muffato muffato at ebi.ac.uk
Thu Aug 30 17:16:31 BST 2012


Hi Yuan and Michael

In Compara, homologies always contain exactly 2 genes. For each 
homology, the pairwise protein alignment is sent to PAML (BioPerl 
module: Bio::Tools::Run::Phylo::PAML::Codeml), which returns the values 
of n, s, dn, and ds.
However, I am not familiar with PAML and I cannot tell why n and s are 
decimal values.

Regards,
Matthieu
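
One possible reason for the decimal values, offered as a sketch rather than a description of what PAML does internally: n and s are usually read as counts of nonsynonymous and synonymous *sites*, not of observed changes, and a single codon position can be partly synonymous. The figures quoted below (n=543.5, s=245.5) add up to 789, which is 3 x 263 and so is consistent with a site count over 263 codons. The Perl below uses a simplified Nei-Gojobori-style count; the helper names syn_sites and count_sites are hypothetical, changes to a stop codon are assumed to count as nonsynonymous, and codeml's maximum-likelihood estimate will generally give different numbers.

```perl
use strict;
use warnings;

# Standard genetic code, built from the usual T, C, A, G ordering.
my @bases = qw(T C A G);
my $aas   = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';
my %code;
my $i = 0;
for my $b1 (@bases) {
    for my $b2 (@bases) {
        for my $b3 (@bases) {
            $code{"$b1$b2$b3"} = substr($aas, $i++, 1);
        }
    }
}

# Hypothetical helper: number of synonymous sites in one codon.
# Each position contributes the fraction of its 3 possible
# single-base changes that leave the amino acid unchanged.
sub syn_sites {
    my ($codon) = @_;
    my $s = 0;
    for my $pos (0 .. 2) {
        my $syn = 0;
        for my $base (@bases) {
            next if $base eq substr($codon, $pos, 1);
            my $mutant = $codon;
            substr($mutant, $pos, 1, $base);
            $syn++ if $code{$mutant} eq $code{$codon};
        }
        $s += $syn / 3;
    }
    return $s;
}

# Hypothetical helper: sum synonymous (S) and nonsynonymous (N) sites
# over a coding sequence; stop codons and incomplete or ambiguous
# codons are skipped.
sub count_sites {
    my ($cds) = @_;
    my ($S, $N) = (0, 0);
    for my $codon (unpack '(A3)*', uc $cds) {
        next unless exists $code{$codon} && $code{$codon} ne '*';
        my $s = syn_sites($codon);
        $S += $s;
        $N += 3 - $s;
    }
    return ($S, $N);
}

my ($S, $N) = count_sites('ATGTTAGGC');
printf "S=%.3f N=%.3f\n", $S, $N;    # prints S=1.667 N=7.333
```

For the three codons ATG, TTA and GGC the synonymous site counts are 0, 2/3 and 1, so S and N are fractional even for a single sequence, and S + N equals 3 times the number of codons.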

On 30/08/12 16:59, Yuan Chen wrote:
> So does Compara calculate the number of synonymous changes for the gene in
> the paired species? E.g. if human is ACG and chimp is AGG, would you count
> that as one non-synonymous change and store it in the database as n?
>
> If a homology consists of more than 2 species, do you calculate over multiple
> species pairs and then average? I ask because I got n and s as decimal
> numbers, such as for gene ENSG00000251258: n=543.5 s=245.5.
>
> Thanks
>
> yuan
> On 30 Aug 2012, at 15:46, Michael Paulini wrote:
>
>> I have to take that back: it is per pair, and therefore the maximum N
>> should be 2*alignment size and is exact, whereas dN and dS are
>> estimated through PAML.
>>
>> M
>>
>>
>> On 30/08/12 15:07, Michael Paulini wrote:
>>> As far as I interpret it, it is the total number of nonsynonymous bases
>>> in the alignment, so you can get a maximum of
>>> size-of-alignment-block*number-of-sequences.
>>> It can't be the average, as it returns integers ... that is, unless it
>>> rounds them.
>>>
>>> But there is a $homology->dn method, that returns the nonsynonymous
>>> substitution rate ... as in: the average rate per bp.
>>>
>>>
>>> M
>>>
>>> On 30/08/12 14:51, Yuan Chen wrote:
>>>> Dear Michael,
>>>> The documentation says "number of nonsynonymous positions for the homology".
>>>>
>>>> Suppose the homology consists of 10 species: is it the average number (i.e. the total number of nonsynonymous changes divided by the number of species) or just the total number of nonsynonymous positions?
>>>>
>>>> As n or s is not an integer but a decimal number, I thought it would be some kind of average?
>>>>
>>>> Thanks
>>>>
>>>> yuan
>>>> On 30 Aug 2012, at 14:17, Michael Paulini wrote:
>>>>
>>>>> Have a look here: http://www.ensembl.org/info/docs/Doxygen/compara-api/classBio_1_1EnsEMBL_1_1Compara_1_1Homology.html
>>>>>
>>>>> there you can find the documentation to the methods.
>>>>>
>>>>> M
>>>>>
>>>>>
>>>>> On 30/08/12 14:05, Yuan Chen wrote:
>>>>>> Along the same lines, can anyone explain what the n and s values obtained by the following are:
>>>>>>
>>>>>> $homology->n; $homology->s;
>>>>>>
>>>>>> Are these the numbers of non-synonymous and synonymous changes for the gene?
>>>>>>
>>>>>> yuan
>>>>>> On 30 Aug 2012, at 09:16, Matthieu Muffato wrote:
>>>>>>
>>>>>>> Dear Mei
>>>>>>>
>>>>>>> It seems that you are querying a fruit-fly gene. Unfortunately, the dN/dS values are only computed for close enough species: mammals, reptiles, and tetraodontiformes.
>>>>>>>
>>>>>>> Nevertheless, your script is correct and would print some values if you used a human gene as the query.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Matthieu
>>>>>>>
>>>>>>> On 30/08/12 01:49, JiangMei wrote:
>>>>>>>> Hi All.
>>>>>>>>
>>>>>>>> Sorry to bother you. I'm trying to use ensembl-compara (database version
>>>>>>>> 67) to extract the homologues. I also want to get the dN, dS and dN/dS.
>>>>>>>> However, ENSEMBL can't output these values. Can anyone help me?
>>>>>>>>
>>>>>>>> The following is the script I used:
>>>>>>>>
>>>>>>>> use strict;
>>>>>>>> use warnings;
>>>>>>>> use Bio::EnsEMBL::Registry;
>>>>>>>>
>>>>>>>> # Connect to the public Ensembl server (release 67)
>>>>>>>> my $registry = 'Bio::EnsEMBL::Registry';
>>>>>>>> $registry->load_registry_from_db(
>>>>>>>>        -host       => 'ensembldb.ensembl.org',
>>>>>>>>        -user       => 'anonymous',
>>>>>>>>        -db_version => '67');
>>>>>>>> my $member_adaptor = $registry->get_adaptor('Multi', 'compara', 'Member');
>>>>>>>> my $member = $member_adaptor->fetch_by_source_stable_id('ENSEMBLGENE', 'FBgn0002780');
>>>>>>>> my $homology_adaptor = $registry->get_adaptor('Multi', 'compara', 'Homology');
>>>>>>>> my $homologies = $homology_adaptor->fetch_all_by_Member($member);
>>>>>>>>
>>>>>>>> for my $homology (@{$homologies}) {
>>>>>>>>      # Print the ID and taxon of each gene in the homology
>>>>>>>>      for my $mem (@{$homology->get_all_Members}) {
>>>>>>>>          my $taxon = $mem->taxon; # check Bio::EnsEMBL::Compara::NCBITaxon for methods
>>>>>>>>          my $id = $mem->stable_id;
>>>>>>>>          print "$id\t", $taxon->taxon_id, "\t", $taxon->genus, " ", $taxon->species, "\t";
>>>>>>>>      }
>>>>>>>>      print $homology->description, "\t", $homology->subtype, "\t";
>>>>>>>>      my $dn   = $homology->dn;
>>>>>>>>      my $ds   = $homology->ds;
>>>>>>>>      my $dnds = $homology->dnds_ratio;
>>>>>>>>      my $lnl  = $homology->lnl;
>>>>>>>>      # Print NA when dN is unavailable (undefined or zero)
>>>>>>>>      if ($dn) { print "$dn\t$ds\t$dnds\t$lnl\n" }
>>>>>>>>      else     { print "NA\tNA\tNA\tNA\n" }
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> I hope you can help! Thanks very much in advance!
>>>>>>>>
>>>>>>>>
>>>>>>>> Best, Mei
>>>>>>>>



