[ensembl-dev] mapping a gene and its exons to the underlying clone

Bert Overduin bert at ebi.ac.uk
Mon Nov 8 19:17:45 GMT 2010


Hello Andrea,

You should be able to map the individual exons to the clone using the
project method. Here is an example script:

#!/usr/bin/perl

use strict;
use warnings;

use Bio::EnsEMBL::Registry;

my $registry = "Bio::EnsEMBL::Registry";

$registry->load_registry_from_db( -host => 'ensembldb.ensembl.org',
-user => 'anonymous' );

my $exon_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Exon' );

my $exon = $exon_adaptor->fetch_by_stable_id( 'ENSE00001184784' );

my $projection = $exon->project( 'clone' );

foreach my $segment ( @{$projection} ) {
        my $to_slice = $segment->to_Slice;
        print
                $exon->stable_id, ":",
                $segment->from_start, "-",
                $segment->from_end, " projects to ",
                $to_slice->coord_system_name, " ",
                $to_slice->seq_region_name, ":",
                $to_slice->start, "-",
                $to_slice->end, "[",
                $to_slice->strand, "]\n";
}

farm2-head3 /~/scripts/ perl test.pl
ENSE00001184784:1-194 projects to clone AL445212.9:83662-83855[1]
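
To cover the rest of your question, the same project call can be applied to
every exon of a gene, and each projected clone slice can be printed in both
orientations. This is only a minimal sketch and I have not run it against your
gene: it assumes the gene's stable ID is passed on the command line (none is
given in this thread), and it relies on the Slice methods seq and invert to get
the sequence of the projected region and of the opposite strand.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Bio::EnsEMBL::Registry;

# Assumption: the gene stable ID comes from the command line,
# because no gene ID appears in this thread.
my $gene_id = shift or die "Usage: $0 <gene_stable_id>\n";

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db( -host => 'ensembldb.ensembl.org',
                                  -user => 'anonymous' );

my $gene_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Gene' );

my $gene = $gene_adaptor->fetch_by_stable_id($gene_id)
    or die "No gene found for $gene_id\n";

foreach my $exon ( @{ $gene->get_all_Exons } ) {

    # An exon can span more than one clone, so loop over all segments.
    foreach my $segment ( @{ $exon->project('clone') } ) {
        my $to_slice = $segment->to_Slice;

        # invert returns the same region on the opposite strand.
        my $flipped = $to_slice->invert;

        print
            $exon->stable_id, " projects to ",
            $to_slice->seq_region_name, ":",
            $to_slice->start, "-",
            $to_slice->end, "[",
            $to_slice->strand, "]\n";
        print "  strand ", $to_slice->strand, ": ", $to_slice->seq, "\n";
        print "  strand ", $flipped->strand,  ": ", $flipped->seq,  "\n";
    }
}
```

Since a gene is also a feature, calling project('clone') on the gene object
itself should give you the clone segments for the whole region, introns
included.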

Hope this helps.

Cheers,
Bert

On Mon, Nov 8, 2010 at 6:49 PM, Andrea Edwards <edwardsa at cs.man.ac.uk> wrote:

> Hi
>
> I was trying to find the sequence of a gene and its exons in the underlying
> clone using the ensembl api and not getting very far. I was doing this to
> get used to the api more than anything else and trying to map between a gene
> which is on the reverse strand, and a clone containing the forward strand.
>
> I was able to retrieve the coordinates of the exons in the genomic
> sequence.
>
> exon 1
> start: 128,684,354
> end: 128,684,488
>
> exon 2
> start: 128,664,422
> end: 128,664,818
>
> I have to flip the start and end round in my head
>
> forward  5' ===========================================3'
> reverse  3' ===========================================5'
>
>                 128,664,422-128,664,818    128,684,354 -128,684,488
> <=========E2 <=============E1
>
> But I was then lost as to how to map these to a clone. I looked for something
> equivalent to the TranscriptMapper but couldn't find anything.
>
> The total length of the sequence (including the intron in the middle) is:
> 128,684,488 - 128,664,422 + 1 = 20,067 bases
>
> I then want to find out how this entire region maps to the underlying
> clone (AC112484.8). I would then like to print out, from the clone, the
> appropriate subsequence in both the forward and the reverse direction.
>
> I hope that makes sense. I don't think it should be difficult but I can't
> find the appropriate methods
>
> Many thanks
>
>
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev
>

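On the coordinate arithmetic in the message above: Ensembl coordinates are
1-based and inclusive, so a length is end - start + 1. With the exon
coordinates you list (exon 2 starts at 128,664,422 and exon 1 ends at
128,684,488), the region from the start of exon 2 to the end of exon 1 comes
to 20,067 bases. The sketch below is plain Perl with no Ensembl modules; the
helper names span_length and reverse_complement are made up for illustration,
and the reverse complement only handles the four unambiguous bases.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Length of a 1-based, inclusive interval.
sub span_length {
    my ( $start, $end ) = @_;
    return $end - $start + 1;
}

# Reverse complement of a DNA string (A, C, G, T only; other
# characters are reversed but left untranslated).
sub reverse_complement {
    my ($seq) = @_;
    my $rc = reverse $seq;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Exon coordinates as quoted in the original message.
my ( $e1_start, $e1_end ) = ( 128_684_354, 128_684_488 );
my ( $e2_start, $e2_end ) = ( 128_664_422, 128_664_818 );

printf "exon 1 length: %d\n", span_length( $e1_start, $e1_end );
printf "exon 2 length: %d\n", span_length( $e2_start, $e2_end );
printf "intron length: %d\n", span_length( $e2_end + 1, $e1_start - 1 );
printf "whole region:  %d\n", span_length( $e2_start, $e1_end );

# Reading a forward-strand string from the other strand.
print reverse_complement('AACG'), "\n";
```

The three parts (135 + 397 + 19,535) add up to the 20,067 bases of the whole
region, which is a quick sanity check on the inclusive arithmetic.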


-- 
Bert Overduin, Ph.D.
PANDA Coordination & Outreach

EMBL - European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD
United Kingdom

http://www.ebi.ac.uk/~bert