[ensembl-dev] fetch_nearest_Gene_by_Feature
Andy Yates
ayates at ebi.ac.uk
Wed Oct 3 15:08:55 BST 2012
Hi Fiona,
As Nathan has already mentioned fetch_nearest_Gene_by_Feature() is a method with a number of known issues however I believe it is working for you but you have mis-constructed your feature object. Ensembl features should be mapped to the whole Slice and not the single BP location as you have used. I've pasted a modified version of your script into the bottom of this email along with the output; this should be closer to what you are expecting to find from the website
Best regards,
Andy
###################
use strict;
use warnings;
use Bio::EnsEMBL::Registry;
Bio::EnsEMBL::Registry->load_registry_from_db(
-HOST => 'ensembldb.ensembl.org',
-PORT => 5306,
-USER => 'anonymous',
);
my $dba = Bio::EnsEMBL::Registry->get_DBAdaptor('human', 'core');
my $slice_adaptor = $dba->get_SliceAdaptor();
my $gene_adaptor = $dba->get_GeneAdaptor();
my @locations = (
[14,1085552,1085552],
[14,20085552,20085552],
[14,20182101,20182101],
[14,52832186,52832186],
[14,82832186,82832186],
[14,101536162,101536162],
[14,112907975,112907975],
[14,252907975,252907975]
);
foreach my $loc (@locations) {
my ($chr, $start, $end) = @{$loc};
my $slice = $slice_adaptor->fetch_by_region('toplevel', $chr);
printf("Processing %s:%d..%d\n", $chr, $start, $end);
my $feature = Bio::EnsEMBL::Feature->new(-start => $start, -end => $end, -strand => 1, -slice => $slice);
my $genes = $gene_adaptor->fetch_nearest_Gene_by_Feature($feature);
while( my $gene = shift @{$genes}) {
print $slice->name() . ":\t".
$gene->stable_id(). ":". $gene->external_name().":".
$gene->seq_region_name().":". $gene->seq_region_start().":".
$gene->seq_region_end().":". $gene->strand()."\n";
}
}
###################
Processing 14:1085552..1085552
chromosome:GRCh37:14:1:107349540:1: ENSG00000215398:RP11-754I20.1:14:19109939:19118336:1
Processing 14:20085552..20085552
chromosome:GRCh37:14:1:107349540:1: ENSG00000258027:RP11-597A11.1:14:20078033:20146082:1
Processing 14:20182101..20182101
chromosome:GRCh37:14:1:107349540:1: ENSG00000258453:OR11H2:14:20181104:20182079:-1
Processing 14:52832186..52832186
chromosome:GRCh37:14:1:107349540:1: ENSG00000125384:PTGER2:14:52781023:52795324:1
Processing 14:82832186..82832186
chromosome:GRCh37:14:1:107349540:1: ENSG00000238978:snoU13:14:82928432:82928532:1
Processing 14:101536162..101536162
chromosome:GRCh37:14:1:107349540:1: ENSG00000223403:MEG9:14:101536248:101539274:1
Processing 14:112907975..112907975
chromosome:GRCh37:14:1:107349540:1: ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
Processing 14:252907975..252907975
chromosome:GRCh37:14:1:107349540:1: ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
###################
Andrew Yates Ensembl Core Software Project Leader
EMBL-EBI Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK http://www.ensembl.org/
On 2 Oct 2012, at 16:58, Fiona Nielsen wrote:
> Hi Ensembl team,
>
> I am implementing a function using
> gene_adaptor->fetch_nearest_Gene_by_Feature, but I am getting some
> strange results.
>
> After generating a Feature object for the position I am interested in,
> I call fetch_nearest_Gene_by_Feature. Looking at the output I do
> indeed retrieve a Gene object per Feature, but I can tell from the
> location of the gene (and double checking in the genome browser) that
> it is not the closest gene to the Feature position. See output example
> below the code.
>
> How should I interpret this output if it is not the nearest gene?
> Am I giving the wrong input to the function?
>
> Any help to solve this mystery is appreciated.
>
> thanks,
>
> -Fiona-
>
> *** code snip ***
> my $BPslice =
> $slice_adaptor->fetch_by_region('chromosome',$chr,$BP1pos,$BP1pos+1);
> my $feat = new Bio::EnsEMBL::Feature(
> -start => $BP1pos,
> -end => $BP1pos,
> -strand => 1,
> -slice => $BPslice,
> );
> print "searching for nearest gene to: $BP1chr $BP1pos\n";
>
> my $BP1genelist =
> $gene_adaptor->fetch_nearest_Gene_by_Feature($feat); #searching
> downstream (-1)
> while( $BP1gene = pop(@{$BP1genelist})) {
> print $BP1gene->stable_id(). ":". $BP1gene->external_name().":".
> $BP1gene->seq_region_name().":". $BP1gene->seq_region_start().":".
> $BP1gene->seq_region_end().":". $BP1gene->strand()."\n";
> }
> *** end code snip ***
>
> *** output example ***
> searching for nearest gene to: chr14 1085552
> ENSG00000215398:RP11-754I20.1:14:19109939:19118336:1
> searching for nearest gene to: chr14 20085552
> ENSG00000258526:RP11-111A21.1:14:39944044:39982984:1
> searching for nearest gene to: chr14 20182101
> ENSG00000258526:RP11-111A21.1:14:39944044:39982984:1
> searching for nearest gene to: chr14 52832186
> ENSG00000185024:BRF1:14:105675623:105781926:-1
> searching for nearest gene to: chr14 82832186
> ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
> searching for nearest gene to: chr14 101536162
> ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
> searching for nearest gene to: chr14 112907975
> ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
> searching for nearest gene to: chr14 252907975
> ENSG00000253303:IGHVIII-82:14:107287770:107288019:-1
> *** end output example ***
>
>
> --
> http://www.cmbi.ru.nl/~fnielsen
>
> CMBI 260
> Radboud University Nijmegen-Medical Centre / NCMLS
> Geert-Grooteplein 28
> 6525 GA Nijmegen
>
> Direct Phone: +31(0)24 36 19774
> CMBI Phone: +31(0)24 36 19390
> CMBI Fax: +31 (0)24 36 19395
> www.ncmls.ru.nl/cmbi
>
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
More information about the Dev
mailing list