[ensembl-dev] Retrieve gene flanking region question

Gregory Brown gregb at htblis.com
Mon Jan 6 20:17:50 GMT 2014


I am using API v74 and the v74 Human Core db. I obtain genes, transcripts and exons, and then export FASTA sequences for gene flanking regions, UTRs, exons and introns. I have found appropriate methods for everything except a simple way to obtain the 5' and 3' gene flanking regions.

Is there a method in the Core API that is similar to the Variation API method "$var->five_prime_flank_seq"? At present I calculate the flanking coordinates from the gene start and end coordinates and pass them to $slice_adaptor->fetch_by_region.

Here is the code snippet that I am currently using.  The 5' flanking regions are correctly retrieved, but I wonder if there is a better way. Any suggestions are appreciated. Thank you.
Greg Brown

my $flank_size = 200;
my $biotype = 'protein_coding';
my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );
my @slices = @{ $slice_adaptor->fetch_all('toplevel') };  # toplevel or chromosome ?
foreach my $slice (@slices) {
   my $genes = $slice->get_all_Genes(undef,undef,1,undef,$biotype);  #  Explicitly load transcripts and restrict to $biotype (set above to protein_coding)
   while ( my $gene = shift @{$genes} ) {
       my $gene_seq_region = $gene->slice->seq_region_name();
       my $gene_start      = $gene->start() ;
       my $gene_end        = $gene->end() ;
       my $gene_strand     = $gene->strand();

       my $up_flank_start   = $gene_start - $flank_size;
       my $up_flank_end     = $gene_start - 1;
       my $down_flank_start = $gene_end + 1 ;
       my $down_flank_end   = $gene_end + $flank_size;

       my ($fiv_flank_start, $fiv_flank_end, $thr_flank_start, $thr_flank_end);
       # Check strand and fix 5' and 3' directions according to strand direction
       if ($gene_strand == -1) {
           $fiv_flank_start = $down_flank_start;
           $fiv_flank_end   = $down_flank_end;
           $thr_flank_start = $up_flank_start;
           $thr_flank_end   = $up_flank_end;   
       }
       else {
           $fiv_flank_start = $up_flank_start;
           $fiv_flank_end   = $up_flank_end;
           $thr_flank_start = $down_flank_start;
           $thr_flank_end   = $down_flank_end;  
       }

       # There MUST be a better way to get flanking regions?  But this does work
       my $fiv_flank_slice = $slice_adaptor->fetch_by_region( 'toplevel', $gene_seq_region, $fiv_flank_start, $fiv_flank_end, $gene_strand );
       my $fiv_flank_seq   = $fiv_flank_slice->seq();
   }
}
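
One possible alternative to the manual strand check, sketched here rather than tested against a live v74 database: the Core API's feature_Slice() returns a slice covering the gene on the gene's own strand, and Slice::expand() grows a slice at its 5' and 3' ends, so the strand handling may come for free. The helper name gene_flank_seqs is hypothetical (it is not part of the Ensembl API), and $gene is assumed to be a Bio::EnsEMBL::Gene such as the ones returned in the loop above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the Ensembl API.
# Returns the 5' and 3' flanking sequences of a gene, both read on the
# gene's own strand. $gene is assumed to be a Bio::EnsEMBL::Gene.
sub gene_flank_seqs {
    my ($gene, $flank_size) = @_;

    # feature_Slice() covers exactly the gene, on the gene's strand.
    # expand() extends that slice by $flank_size at the 5' and 3' ends,
    # so no explicit check of the strand is needed here.
    my $expanded = $gene->feature_Slice()->expand($flank_size, $flank_size);
    my $length   = $expanded->length();

    # subseq() takes 1-based coordinates relative to the expanded slice.
    my $five_prime  = $expanded->subseq(1, $flank_size);
    my $three_prime = $expanded->subseq($length - $flank_size + 1, $length);

    return ($five_prime, $three_prime);
}
```

Inside the gene loop this would be called as my ($fiv_flank_seq, $thr_flank_seq) = gene_flank_seqs($gene, $flank_size); in place of the coordinate arithmetic. How expand() behaves for genes within $flank_size bases of the end of a seq region is something I would check before relying on it.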





