[ensembl-dev] Retrieve gene flanking region question
Gregory Brown
gregb at htblis.com
Mon Jan 6 20:17:50 GMT 2014
I am using API v74 and the v74 Human Core db. I obtain genes, transcripts and exons, and then export FASTA sequences for gene flanking regions, UTRs, exons and introns. I have found appropriate methods for everything except a simple way to obtain the 5' and 3' gene flanking regions.
Is there a method in the Core API that is similar to the Variation API method "$var->five_prime_flank_seq"? At the moment I call $slice_adaptor->fetch_by_region after calculating the desired flanking sequence coordinates from the gene start and end coordinates.
Here is the code snippet that I am currently using. The 5' flanking regions are correctly retrieved, but I wonder if there is a better way. Any suggestions are appreciated. Thank you.
Greg Brown
my $flank_size    = 200;
my $biotype       = 'protein_coding';
my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );
my @slices = @{ $slice_adaptor->fetch_all('toplevel') };    # toplevel or chromosome ?
foreach my $slice (@slices) {
    # Explicitly load transcripts (3rd argument) and restrict to one biotype (5th argument)
    my $genes = $slice->get_all_Genes( undef, undef, 1, undef, $biotype );
    while ( my $gene = shift @{$genes} ) {
        my $gene_seq_region = $gene->slice->seq_region_name();
        my $gene_start      = $gene->start();
        my $gene_end        = $gene->end();
        my $gene_strand     = $gene->strand();

        # Flanks in forward-strand coordinates
        my $up_flank_start   = $gene_start - $flank_size;
        my $up_flank_end     = $gene_start - 1;
        my $down_flank_start = $gene_end + 1;
        my $down_flank_end   = $gene_end + $flank_size;

        my ( $fiv_flank_start, $fiv_flank_end, $thr_flank_start, $thr_flank_end );

        # Check strand and fix 5' and 3' directions according to strand direction
        if ( $gene_strand == -1 ) {
            $fiv_flank_start = $down_flank_start;
            $fiv_flank_end   = $down_flank_end;
            $thr_flank_start = $up_flank_start;
            $thr_flank_end   = $up_flank_end;
        }
        else {
            $fiv_flank_start = $up_flank_start;
            $fiv_flank_end   = $up_flank_end;
            $thr_flank_start = $down_flank_start;
            $thr_flank_end   = $down_flank_end;
        }

        # There MUST be a better way to get flanking regions? But this does work
        my $fiv_flank_slice = $slice_adaptor->fetch_by_region( 'toplevel', $gene_seq_region, $fiv_flank_start, $fiv_flank_end, $gene_strand );
        my $fiv_flank_seq = $fiv_flank_slice->seq();
    }
}
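
The strand handling in the snippet above could be factored into a small helper so that the coordinate arithmetic can be checked without a database connection. What follows is only a sketch: flank_coords is a hypothetical pure-Perl subroutine, not an Ensembl API method, and the optional $region_length argument and the clamping of windows to the sequence region bounds are assumptions that the snippet above does not make.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of the Ensembl API).
# Takes 1-based inclusive feature coordinates and returns the 5' and 3'
# flanking windows as [start, end] pairs in forward-strand coordinates.
# A window is undef when nothing is left of it after clamping.
sub flank_coords {
    my ( $start, $end, $strand, $flank_size, $region_length ) = @_;

    # Assumption: clamp each window to 1 .. $region_length (if a length is given)
    my $clamp = sub {
        my ( $s, $e ) = @_;
        $s = 1 if $s < 1;
        $e = $region_length if defined $region_length && $e > $region_length;
        return $s <= $e ? [ $s, $e ] : undef;
    };

    my $lower  = $clamp->( $start - $flank_size, $start - 1 );    # before the feature
    my $higher = $clamp->( $end + 1, $end + $flank_size );        # after the feature

    # On the reverse strand the 5' flank lies at the higher coordinates
    return $strand == -1
        ? { five_prime => $higher, three_prime => $lower }
        : { five_prime => $lower,  three_prime => $higher };
}

# Example: a reverse-strand gene at 1000-2000 with 200 bp flanks
my $example = flank_coords( 1000, 2000, -1, 200 );
printf "5' flank: %d-%d\n", @{ $example->{five_prime} };     # prints "5' flank: 2001-2200"
printf "3' flank: %d-%d\n", @{ $example->{three_prime} };    # prints "3' flank: 800-999"
```

The returned start and end values could then be passed to $slice_adaptor->fetch_by_region together with the gene strand, as in the snippet above, after checking that the window is defined.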