[ensembl-dev] compara api: paralogous DNAs

Matthieu Muffato muffato at ebi.ac.uk
Wed Mar 13 12:37:05 GMT 2019


Hi Jinrui,

In all our pairwise alignments, we refine the LastZ alignment blocks 
with two steps called "chaining" and "netting" (see 
http://europepmc.org/articles/PMC4852398 and 
http://genomewiki.ucsc.edu/index.php/Chains_Nets for more information). 
What you get in our database is the product of these two steps.
The netting phase is done on the reference species only; we don't do 
bidirectional netting. This means that there is very little overlap / 
nesting on the reference species (human in the case of the human vs * 
alignments). Overlap / nesting is allowed on the non-reference species, 
though. For instance, in the human-mouse alignments, there are 20,000 
pairs of blocks that overlap on human, and 1,900,000 pairs of blocks 
that overlap on mouse.
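
As a rough illustration of what "pairs of blocks that overlap" means, here 
is a minimal Python sketch that counts such pairs on one species. It is 
not Compara code: the function name, the (chromosome, start, end) tuple 
layout and the 1-based inclusive coordinates are all assumptions made 
for the example.

```python
from bisect import bisect_left, insort
from typing import Dict, List, Tuple

# Hypothetical layout: (chromosome, start, end), 1-based inclusive.
Interval = Tuple[str, int, int]


def count_overlapping_pairs(intervals: List[Interval]) -> int:
    """Count unordered pairs of intervals that overlap on the same chromosome."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))

    total = 0
    for spans in by_chrom.values():
        spans.sort()  # process intervals by increasing start
        ends: List[int] = []  # sorted end positions of intervals already seen
        for start, end in spans:
            # Every earlier interval starts at or before `start`, so it
            # overlaps the current one exactly when its end is >= `start`.
            total += len(ends) - bisect_left(ends, start)
            insort(ends, end)
    return total
```

Running it once on the reference coordinates of the blocks and once on 
the non-reference coordinates would give the two kinds of counts 
mentioned above.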

So in this case, yes, you can identify human paralogous regions 1) 
through the self-alignment and 2) through the human-mouse alignment (or 
any pairwise alignment that involves human) by finding human regions 
that align to the same region in the other species.
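
Option 2) could be sketched along these lines in Python, assuming the 
pairwise blocks have already been loaded (from the MAF files or the API) 
into simple records. The Region and Block types, the function names and 
any coordinates you feed in are hypothetical; this is not the Compara 
API, just the interval logic.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Region(NamedTuple):
    chrom: str
    start: int  # 1-based inclusive (assumption)
    end: int


class Block(NamedTuple):
    human: Region  # reference side of the pairwise block
    mouse: Region  # non-reference side


def overlaps(a: Region, b: Region) -> bool:
    """True if two regions share at least one position."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def paralog_candidates(blocks: List[Block]) -> List[Tuple[Region, Region]]:
    """Pairs of distinct human regions whose mouse sides overlap."""
    by_chrom: Dict[str, List[Block]] = defaultdict(list)
    for block in blocks:
        by_chrom[block.mouse.chrom].append(block)

    pairs: List[Tuple[Region, Region]] = []
    for chrom_blocks in by_chrom.values():
        chrom_blocks.sort(key=lambda b: b.mouse.start)
        active: List[Block] = []  # blocks whose mouse side may still overlap
        for block in chrom_blocks:
            # Drop blocks that end before the current mouse start.
            active = [a for a in active if a.mouse.end >= block.mouse.start]
            for other in active:
                # Skip pairs that are really the same human locus.
                if not overlaps(other.human, block.human):
                    pairs.append((other.human, block.human))
            active.append(block)
    return pairs
```

The returned pairs are only candidates: they say that two different 
human regions align to overlapping mouse sequence, nothing more.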

Hope this helps,

Matthieu

On 11/03/2019 19:45, Jin-Rui Xu wrote:
> Hi Matthieu,
>
> Thank you very much for your email.
>
> In the human self-alignment, one genomic region may be mapped to 
> multiple other regions. Such multiple hits also exist in, e.g., the 
> human vs mouse genome alignment.
> Does Ensembl provide all of these regions or just the best one? 
> Can these multiple hits be retrieved through the Compara Perl API?
>
> Thanks!
> Jinrui
>
> On Mon, Mar 11, 2019 at 3:05 PM Matthieu Muffato <muffato at ebi.ac.uk> wrote:
>
>     Dear Jinrui,
>
>     We have a human self-alignment that has been computed with LastZ and
>     identifies paralogous regions within the genome. You can find the
>     whole alignment on the FTP site:
>     ftp://ftp.ensembl.org/pub/current_maf/ensembl-compara/pairwise_alignments/
>
>     You can also query specific regions through the REST API:
>     http://rest.ensembl.org/alignment/region/homo_sapiens/17:63997797-64000390:1?species_set=homo_sapiens;content-type=application/json;method=LASTZ_NET
>
>     Human is the only species for which we have a self-alignment.
>
>     Kind regards,
>     Matthieu
>
>     On 09/03/2019 03:10, Jin-Rui Xu wrote:
>     > Hello,
>     >
>     > I just started learning the Compara API. However, I am still not sure
>     > whether it can address my questions. I am wondering if someone could
>     > give me some guidance and example scripts. Here are my questions:
>     > (1) I want to identify all paralogous DNA fragments (not necessarily
>     > genes) in a genome. One genomic region may have more than one
>     > duplicate. (2) Then, I want to find in which of the other species the
>     > two paralogous DNAs have a common ancestor.
>     > Alternatively, I can focus on two genomic regions in a genome to test
>     > if they are paralogous, and then find which species has their common
>     > ancestral DNA.
>     > How could I get this done using the Compara API (version 95)?
>     >
>     > Many thanks!
>     >
>     > Jinrui
>
>     -- 
>     Matthieu Muffato, Ph.D.
>     Ensembl Compara and TreeFam Project Leader
>     European Bioinformatics Institute (EMBL-EBI)
>     European Molecular Biology Laboratory
>     Wellcome Trust Genome Campus, Hinxton
>     Cambridge, CB10 1SD, United Kingdom
>     Room  A3-145
>     Phone + 44 (0) 1223 49 4631
>     Fax   + 44 (0) 1223 49 4468
>
