[ensembl-dev] chain files to build cross-species coordinates conversion (liftover)

Matthieu Muffato muffato at ebi.ac.uk
Mon Apr 13 17:53:47 BST 2015


Dear Juan Pascual

Although our pairwise-alignment pipelines do produce chains, the files 
are not kept. Instead, we store the alignments in a more condensed 
format in the database: 
http://www.ensembl.org/info/docs/api/compara/compara_schema.html#genomic_align
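
To illustrate what "condensed" means here, below is a minimal Python sketch that expands a Compara-style cigar line back into a gapped sequence. It assumes only the two basic operations (M for a match/mismatch, D for a gap) and that a missing count means 1, as in the schema documentation's example "2MD3M2D2M". The function name expand_cigar is made up for this illustration and is not part of any Ensembl API.

```python
import re


def expand_cigar(cigar_line, sequence):
    """Insert gaps into an ungapped sequence following a Compara-style cigar line.

    Assumption: only 'M' (match/mismatch) and 'D' (gap) operations are handled.
    """
    aligned = []
    pos = 0  # how many bases of `sequence` have been consumed so far
    for count, op in re.findall(r"(\d*)([A-Z])", cigar_line):
        n = int(count) if count else 1  # a missing count means 1
        if op == "M":
            aligned.append(sequence[pos:pos + n])
            pos += n
        elif op == "D":
            aligned.append("-" * n)
        else:
            raise ValueError(f"unsupported cigar operation: {op}")
    if pos != len(sequence):
        raise ValueError("cigar line does not consume the whole sequence")
    return "".join(aligned)


print(expand_cigar("2MD3M2D2M", "ACGTACG"))
```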

There are two options at this stage, both of which use the Perl or REST 
API: either dump the alignment blocks into a chain file and run 
liftOver, or use the API alone. Either way, you will need to write some 
code.
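
For the first option, here is a rough Python sketch of how a series of ungapped aligned segments could be written out as one record in the UCSC chain format. It assumes the segments have already been fetched (for example through the API), are 0-based, sorted and non-overlapping, and that the reference is on the forward strand. The function name format_chain and the sequence names in the example are hypothetical.

```python
def format_chain(score, t_name, t_size, q_name, q_size, q_strand, blocks, chain_id):
    """Format one UCSC chain record.

    blocks: list of (t_start, q_start, length) ungapped segments, 0-based,
    sorted and non-overlapping. For a '-' query strand, the query coordinates
    are expected on the reverse-complemented sequence, as the format requires.
    Assumption: the reference (target) strand is always '+'.
    """
    t_start, q_start, _ = blocks[0]
    last_t, last_q, last_len = blocks[-1]
    t_end = last_t + last_len
    q_end = last_q + last_len
    header = (f"chain {score} {t_name} {t_size} + {t_start} {t_end} "
              f"{q_name} {q_size} {q_strand} {q_start} {q_end} {chain_id}")
    lines = [header]
    # Every segment but the last is followed by the gap sizes on each side.
    for (t0, q0, size), (t1, q1, _) in zip(blocks, blocks[1:]):
        dt = t1 - (t0 + size)  # unaligned bases in the reference
        dq = q1 - (q0 + size)  # unaligned bases in the query
        lines.append(f"{size} {dt} {dq}")
    lines.append(str(last_len))  # the last line carries only the segment size
    return "\n".join(lines) + "\n\n"  # records are separated by a blank line


print(format_chain(1000, "chr1", 1000, "chr2", 2000, "+",
                   [(100, 500, 20), (130, 520, 30)], 1), end="")
```

The resulting text can be appended to a file, one record per chain, and passed to liftOver.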

If you can use the Perl API, there is an example script in the 
ensembl-compara repository that you'll find useful: 
https://github.com/Ensembl/ensembl-compara/blob/release/79/scripts/examples/dna_convertBasePositionsUsingBlastzAlignments.pl
It takes a list of human coordinates and projects them to the chimpanzee 
genome using our alignment (just change the names of the species).
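
The idea behind such a projection can be sketched in a few lines of Python. This is not the logic of the script above, only an illustration under simplifying assumptions: a single alignment block, both sequences on the forward strand, and gapped sequences given as plain strings. The function name project_position is made up for this example.

```python
def project_position(ref_aln, tgt_aln, ref_start, tgt_start, pos):
    """Project a reference coordinate onto the target through one aligned block.

    ref_aln / tgt_aln: gapped aligned sequences of equal length ('-' is a gap).
    ref_start / tgt_start: coordinate of the first non-gap base of each sequence.
    Returns the target coordinate, or None if `pos` lies outside the block or
    is aligned to a gap in the target.
    Assumption: both sequences are on the forward strand.
    """
    if len(ref_aln) != len(tgt_aln):
        raise ValueError("aligned sequences must have the same length")
    ref_pos = ref_start
    tgt_pos = tgt_start
    for r, t in zip(ref_aln, tgt_aln):
        if r != "-" and ref_pos == pos:
            return tgt_pos if t != "-" else None
        # Advance each coordinate only when its sequence has a base here.
        if r != "-":
            ref_pos += 1
        if t != "-":
            tgt_pos += 1
    return None


print(project_position("AC-GTA", "ACTG-A", 100, 200, 102))
```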

If you are interested in the coordinates of each alignment block, this 
other script can help:
https://github.com/Ensembl/ensembl-compara/blob/release/79/scripts/examples/dna_accessGenomicAlignBlocks.pl

Matthieu

On 13/04/15 17:18, Juan Pascual Anaya wrote:
> Hi Anne,
>
> Thank you very much for your reply!
>
> According to those links that I sent about the pairwise alignments I'm
> interested in, you can read the following:
>
> "After running LastZ, the raw LastZ alignment blocks are chained
> according to their location in both genomes. During the final netting
> process, the best sub-chain is chosen in each region on the reference
> species."
>
> So, I thought that those chain files were already done and available
> somewhere.
>
> I'm using all my genome coordinates from Ensembl (e74), so I guess I
> could also use your Assembly Converter tool, but I would still need the
> chain files, right? Here:
>
> http://www.ensembl.org/info/docs/webcode/mirror/tools/assembly_converter.html
>
> when I try to go to the FTP site where the "current chain files" are,
> the site doesn't exist, so I thought that this method is out of date.
>
> If the Compara team has more info about this, that would be great!
>
> Thanks!
> Juan
>
>
>
> On Mon, Apr 13, 2015 at 8:00 PM, <dev-request at ensembl.org> wrote:
>
>
>     Message: 2
>     Date: Mon, 13 Apr 2015 09:24:21 +0100
>     From: Anne Lyle <annelyle at ebi.ac.uk>
>     Subject: Re: [ensembl-dev] chain files to build cross-species
>              coordinates conversion (liftover)
>     To: Ensembl dev list <dev at ensembl.org>
>     Message-ID: <326CB180-BB1F-4218-90E6-722656215245 at ebi.ac.uk>
>     Content-Type: text/plain; charset="utf-8"
>
>     Hi Juan
>
>     We currently only produce chain files (for our own Assembly
>     Converter) for those species where we've done an assembly mapping
>     through our core pipeline. Perhaps our compara team can chime in
>     with whether it's feasible to create them based on pairwise alignments.
>
>     Cheers
>
>     Anne
>
>
>     > On 12 Apr 2015, at 09:42, Juan Pascual Anaya <jpascualanaya at gmail.com> wrote:
>     >
>     > Hi Ensembl team,
>     >
>     > I am currently doing a comparative analysis of some ChIP-seq datasets between mouse, chicken and turtle (P. sinensis). Therefore, I was interested in getting a cross-species coordinate conversion, like assembly converter or liftover. I've seen that you have these two pairwise alignments done:
>     >
>     > Mouse-Chicken pairwise alignment
>     > http://www.ensembl.org/info/genome/compara/mlss.html?mlss=633
>     >
>     > Chicken-Turtle pairwise alignment
>     > http://www.ensembl.org/info/genome/compara/mlss.html?mlss=657
>     >
>     > So, I was wondering if I could get the best subset of chains from your LASTZ-net results (i.e., the liftover chain files) so that I can run liftOver myself. I have not seen these files available for download on the Ensembl website, so could you let me know how to get them?
>     >
>     > Thank you very much,
>     > Juan
>     >
>     > _______________________________________________
>     > Dev mailing list    Dev at ensembl.org
>     > Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>     > Ensembl Blog: http://www.ensembl.info/
>
> --
>
> Juan Pascual-Anaya, PhD
> Research Scientist
> Evolutionary Morphology Laboratory, RIKEN
> 2-2-3 Minatojima-minamimachi
> Chuo-ku, Kobe, Hyogo 650-0047
> Japan
>
>

-- 
Matthieu Muffato, Ph.D.
Ensembl Compara and TreeFam Project Leader
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, United Kingdom
Room  A3-145
Phone + 44 (0) 1223 49 4631
Fax   + 44 (0) 1223 49 4468



