[ensembl-dev] fetch_nearest_Gene_by_Feature

Andy Yates ayates at ebi.ac.uk
Wed Oct 3 15:08:55 BST 2012


Hi Fiona,

As Nathan has already mentioned fetch_nearest_Gene_by_Feature() is a method with a number of known issues however I believe it is working for you but you have mis-constructed your feature object. Ensembl features should be mapped to the whole Slice and not the single BP location as you have used. I've pasted a modified version of your script into the bottom of this email along with the output; this should be closer to what you are expecting to find from the website

Best regards,

Andy

###################

use strict;
use warnings;

use Bio::EnsEMBL::Registry;
Bio::EnsEMBL::Registry->load_registry_from_db(
-HOST => 'ensembldb.ensembl.org',
-PORT => 5306,
-USER => 'anonymous',
);

my $dba = Bio::EnsEMBL::Registry->get_DBAdaptor('human', 'core');
my $slice_adaptor = $dba->get_SliceAdaptor();
my $gene_adaptor = $dba->get_GeneAdaptor();

my @locations = (
[14,1085552,1085552],
[14,20085552,20085552],
[14,20182101,20182101],
[14,52832186,52832186],
[14,82832186,82832186],
[14,101536162,101536162],
[14,112907975,112907975],
[14,252907975,252907975]
);

foreach my $loc (@locations) {
  my ($chr, $start, $end) = @{$loc};
  my $slice = $slice_adaptor->fetch_by_region('toplevel', $chr);
  
  printf("Processing %s:%d..%d\n", $chr, $start, $end);
  my $feature = Bio::EnsEMBL::Feature->new(-start => $start, -end => $end, -strand => 1, -slice => $slice);
  my $genes = $gene_adaptor->fetch_nearest_Gene_by_Feature($feature);
  while( my $gene = shift @{$genes}) {
    print $slice->name() . ":\t".
          $gene->stable_id(). ":". $gene->external_name().":".
          $gene->seq_region_name().":". $gene->seq_region_start().":".
          $gene->seq_region_end().":". $gene->strand()."\n";
  }
}

###################

Processing 14:1085552..1085552
chromosome:GRCh37:14:1:107349540:1:	ENSG00000215398:RP11-754I20.1:14:19109939:19118336:1

Processing 14:20085552..20085552
chromosome:GRCh37:14:1:107349540:1:	ENSG00000258027:RP11-597A11.1:14:20078033:20146082:1

Processing 14:20182101..20182101
chromosome:GRCh37:14:1:107349540:1:	ENSG00000258453:OR11H2:14:20181104:20182079:-1

Processing 14:52832186..52832186
chromosome:GRCh37:14:1:107349540:1:	ENSG00000125384:PTGER2:14:52781023:52795324:1

Processing 14:82832186..82832186
chromosome:GRCh37:14:1:107349540:1:	ENSG00000238978:snoU13:14:82928432:82928532:1

Processing 14:101536162..101536162
chromosome:GRCh37:14:1:107349540:1:	ENSG00000223403:MEG9:14:101536248:101539274:1

Processing 14:112907975..112907975
chromosome:GRCh37:14:1:107349540:1:	ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1

Processing 14:252907975..252907975
chromosome:GRCh37:14:1:107349540:1:	ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1

###################

Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/

On 2 Oct 2012, at 16:58, Fiona Nielsen wrote:

> Hi Ensembl team,
> 
> I am implementing a function using
> gene_adaptor->fetch_nearest_Gene_by_Feature, but I am getting some
> strange results.
> 
> After generating a Feature object for the position I am interested in,
> I call fetch_nearest_Gene_by_Feature. Looking at the output I do
> indeed retrieve a Gene object per Feature, but I can tell from the
> location of the gene (and double checking in the genome browser) that
> it is not the closest gene to the Feature position. See output example
> below the code.
> 
> How should I interpret this output if it is not the nearest gene?
> Am I giving the wrong input to the function?
> 
> Any help to solve this mystery is appreciated.
> 
> thanks,
> 
> -Fiona-
> 
> *** code snip ***
>    my $BPslice =
> $slice_adaptor->fetch_by_region('chromosome',$chr,$BP1pos,$BP1pos+1);
>    my $feat = new Bio::EnsEMBL::Feature(
>                                         -start  => $BP1pos,
>                                         -end    => $BP1pos,
>                                         -strand => 1,
>                                         -slice  => $BPslice,
>                                        );
>    print "searching for nearest gene to: $BP1chr $BP1pos\n";
> 
>    my $BP1genelist =
> $gene_adaptor->fetch_nearest_Gene_by_Feature($feat); #searching
> downstream (-1)
>    while( $BP1gene = pop(@{$BP1genelist})) {
>      print $BP1gene->stable_id(). ":". $BP1gene->external_name().":".
> $BP1gene->seq_region_name().":". $BP1gene->seq_region_start().":".
> $BP1gene->seq_region_end().":". $BP1gene->strand()."\n";
>    }
> *** end code snip ***
> 
> *** output example ***
> searching for nearest gene to: chr14 1085552
> ENSG00000215398:RP11-754I20.1:14:19109939:19118336:1
> searching for nearest gene to: chr14 20085552
> ENSG00000258526:RP11-111A21.1:14:39944044:39982984:1
> searching for nearest gene to: chr14 20182101
> ENSG00000258526:RP11-111A21.1:14:39944044:39982984:1
> searching for nearest gene to: chr14 52832186
> ENSG00000185024:BRF1:14:105675623:105781926:-1
> searching for nearest gene to: chr14 82832186
> ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
> searching for nearest gene to: chr14 101536162
> ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
> searching for nearest gene to: chr14 112907975
> ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
> searching for nearest gene to: chr14 252907975
> ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
> *** end output example ***
> 
> 
> -- 
> http://www.cmbi.ru.nl/~fnielsen
> 
> CMBI 260
> Radboud University Nijmegen-Medical Centre / NCMLS
> Geert-Grooteplein 28
> 6525 GA Nijmegen
> 
> Direct Phone: +31(0)24 36 19774
> CMBI Phone: +31(0)24 36 19390
> CMBI Fax: +31 (0)24 36 19395
> www.ncmls.ru.nl/cmbi
> 
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/





More information about the Dev mailing list