[ensembl-dev] chain files to build cross-species coordinates conversion (liftover)

Juan Pascual Anaya jpascualanaya at gmail.com
Tue Apr 14 15:22:33 BST 2015


Dear Matthieu,

I have some experience using the core API, but didn't know about the
compara schema for this. Both scripts are quite useful for what I need.
Thank you very much!

Best,
Juan

On Tue, Apr 14, 2015 at 8:00 PM, <dev-request at ensembl.org> wrote:

> Date: Mon, 13 Apr 2015 17:53:47 +0100
> From: Matthieu Muffato <muffato at ebi.ac.uk>
> Subject: Re: [ensembl-dev] chain files to build cross-species
>         coordinates conversion (liftover)
> To: Ensembl developers list <dev at ensembl.org>
> Message-ID: <552BF49B.5000602 at ebi.ac.uk>
> Content-Type: text/plain; charset=windows-1252; format=flowed
>
> Dear Juan-Pascal
>
> Although we do compute chains as part of our pairwise-alignment
> pipelines, the chain files are not kept. Instead, we store the
> alignments in a more condensed format in the database:
>
> http://www.ensembl.org/info/docs/api/compara/compara_schema.html#genomic_align
>
> There are two options at this stage, both of which use the Perl or
> REST API: either dump the alignment blocks into a chain file and run
> liftOver on it, or use the API alone. Either way, you will need to
> write some code.
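
As a rough illustration of the first option (dumping alignment blocks into a chain file), the sketch below formats a list of ungapped blocks as one UCSC-style chain record. It assumes the blocks have already been fetched from the Compara database by some other means, and that they are collinear, 0-based, and on the forward strand of both genomes. The function name `format_chain` and the tuple layout are hypothetical and not part of any Ensembl API.

```python
from typing import List, Tuple

# Hypothetical block layout: (target_start, query_start, length), 0-based.
Block = Tuple[int, int, int]


def format_chain(blocks: List[Block], t_name: str, t_size: int,
                 q_name: str, q_size: int,
                 score: int = 0, chain_id: int = 1) -> str:
    """Format collinear, forward-strand ungapped blocks as one chain record."""
    if not blocks:
        raise ValueError("at least one block is required")
    blocks = sorted(blocks)
    t_start, q_start, _ = blocks[0]
    last_t, last_q, last_len = blocks[-1]
    # Header fields: score, target name/size/strand/start/end,
    # query name/size/strand/start/end, chain id.
    header = (f"chain {score} {t_name} {t_size} + {t_start} {last_t + last_len} "
              f"{q_name} {q_size} + {q_start} {last_q + last_len} {chain_id}")
    lines = [header]
    for (t, q, n), (next_t, next_q, _) in zip(blocks, blocks[1:]):
        dt = next_t - (t + n)  # gap in the target before the next block
        dq = next_q - (q + n)  # gap in the query before the next block
        if dt < 0 or dq < 0:
            raise ValueError("blocks overlap or are not collinear")
        lines.append(f"{n} {dt} {dq}")
    lines.append(str(last_len))  # the final block has no trailing gaps
    return "\n".join(lines) + "\n\n"  # a blank line ends each chain


if __name__ == "__main__":
    demo = [(100, 200, 50), (160, 250, 30)]
    print(format_chain(demo, "chr1", 1000, "chrA", 2000), end="")
```

Reverse-strand alignments would need the query coordinates expressed on the reverse complement, which this sketch deliberately leaves out.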
>
> If you can use the Perl API, there is an example script in the
> ensembl-compara repository that you'll find useful:
>
> https://github.com/Ensembl/ensembl-compara/blob/release/79/scripts/examples/dna_convertBasePositionsUsingBlastzAlignments.pl
> It takes a list of human coordinates and projects them to the chimpanzee
> genome using our alignment (just change the names of the species).
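
Conceptually, projecting a coordinate through a pairwise alignment means finding the ungapped block that contains it and applying that block's offset. The sketch below illustrates only that idea; it is not the Perl script linked above. It assumes forward-strand, 0-based blocks that are already in memory, and the name `project_position` is hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical block layout: (source_start, target_start, length), 0-based.
Block = Tuple[int, int, int]


def project_position(pos: int, blocks: List[Block]) -> Optional[int]:
    """Return the target coordinate aligned to `pos`, or None if unaligned."""
    for src_start, tgt_start, length in blocks:
        if src_start <= pos < src_start + length:
            # Inside an ungapped block the offset from the block start
            # is the same in both genomes.
            return tgt_start + (pos - src_start)
    return None  # the position falls in a gap or outside the alignment


if __name__ == "__main__":
    demo = [(100, 200, 50), (160, 250, 30)]
    for p in (120, 155, 160):
        print(p, project_position(p, demo))
```

Positions that fall between blocks return `None` here; a real conversion would have to decide how to report such unaligned sites.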
>
> If you are interested in the coordinates of each alignment block, this
> other script can help:
>
> https://github.com/Ensembl/ensembl-compara/blob/release/79/scripts/examples/dna_accessGenomicAlignBlocks.pl
>
> Matthieu
>
> On 13/04/15 17:18, Juan Pascual Anaya wrote:
> > Hi Anne,
> >
> > Thank you very much for your reply!
> >
> > According to those links that I sent about the pairwise alignments I'm
> > interested in, you can read the following:
> >
> > "After running LastZ, the raw LastZ alignment blocks are chained
> > according to their location in both genomes. During the final netting
> > process, the best sub-chain is chosen in each region on the reference
> > species."
> >
> > So, I thought that those chain files were already done and available
> > somewhere.
> >
> > I'm using all my genome coordinates from Ensembl (e74), so I guess I
> > could also use your Assembly Converter tool, but I would still need the
> > chain files, right? Here:
> >
> > http://www.ensembl.org/info/docs/webcode/mirror/tools/assembly_converter.html
> >
> > when I try to go to the FTP site where the "current chain files" are,
> > the site doesn't exist, so I thought that this method was out of date.
> >
> > If the Compara team has more info about this, that would be great!
> >
> > Thanks!
> > Juan
> >
> >
> >
> > On Mon, Apr 13, 2015 at 8:00 PM, <dev-request at ensembl.org> wrote:
> >
> >
> >     Date: Mon, 13 Apr 2015 09:24:21 +0100
> >     From: Anne Lyle <annelyle at ebi.ac.uk>
> >     Subject: Re: [ensembl-dev] chain files to build cross-species
> >              coordinates conversion (liftover)
> >     To: Ensembl dev list <dev at ensembl.org>
> >     Message-ID: <326CB180-BB1F-4218-90E6-722656215245 at ebi.ac.uk>
> >     Content-Type: text/plain; charset="utf-8"
> >
> >     Hi Juan
> >
> >     We currently only produce chain files (for our own Assembly
> >     Converter) for those species where we've done an assembly mapping
> >     through our core pipeline. Perhaps our compara team can chime in
> >     with whether it's feasible to create them based on pairwise
> >     alignments.
> >
> >     Cheers
> >
> >     Anne
> >
> >
> >     > On 12 Apr 2015, at 09:42, Juan Pascual Anaya <jpascualanaya at gmail.com> wrote:
> >     >
> >     > Hi Ensembl team,
> >     >
> >     > I am currently doing a comparative analysis of some ChIP-seq
> >     > datasets between mouse, chicken and turtle (P. sinensis). Therefore,
> >     > I was interested in getting a cross-species coordinate conversion,
> >     > like assembly converter or liftover. I've seen that you have these
> >     > two pairwise alignments done:
> >     >
> >     > Mouse-Chicken pairwise alignment
> >     > http://www.ensembl.org/info/genome/compara/mlss.html?mlss=633
> >     >
> >     > Chicken-Turtle pairwise alignment
> >     > http://www.ensembl.org/info/genome/compara/mlss.html?mlss=657
> >     >
> >     > So, I was wondering if I could get the best subset of chains from
> >     > your LASTZ-net results (i.e., the liftover chain files) so that I can
> >     > run liftOver myself. I have not seen these files available for
> >     > download on the Ensembl website, so could you let me know how to get
> >     > them?
> >     >
> >     > Thank you very much,
> >     > Juan
> >     >
> >     > _______________________________________________
> >     > Dev mailing list    Dev at ensembl.org
> >     > Posting guidelines and subscribe/unsubscribe info:
> >     > http://lists.ensembl.org/mailman/listinfo/dev
> >     > Ensembl Blog: http://www.ensembl.info/
> >
> >
> > --
> >
> > Juan Pascual-Anaya, PhD
> > Research Scientist
> > Evolutionary Morphology Laboratory, RIKEN
> > 2-2-3 Minatojima-minamimachi
> > Chuo-ku, Kobe, Hyogo 650-0047
> > Japan
> >
> >
> >
>
> --
> Matthieu Muffato, Ph.D.
> Ensembl Compara and TreeFam Project Leader
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus, Hinxton
> Cambridge, CB10 1SD, United Kingdom
> Room  A3-145
> Phone + 44 (0) 1223 49 4631
> Fax   + 44 (0) 1223 49 4468