[ensembl-dev] gene to coords code issue

Sean O'Keeffe so2346 at columbia.edu
Fri Apr 20 22:31:35 BST 2012


Hi Andy,
Ok. It seems the 53 API doesn't load the species databases when I use the
useastdb.ensembl.org host. See below.
When I switch to ensembldb.ensembl.org, I get the species loaded.

The 66 API works for both hosts.

$registry->load_registry_from_db(-host => 'ensembldb.ensembl.org', -user => 'anonymous', -verbose => 1);
$registry->load_registry_from_db(-host => 'useastdb.ensembl.org', -port => '5306', -user => 'anonymous', -verbose => 1);

Here's the ens 53 output (useastdb):
>./gene2coords.pl WT_up_genes.txt
Will only load v53 databases
Bio::EnsEMBL::Variation::DBSQL::DBAdaptor module not found so variation
databases will be ignored if found
Bio::EnsEMBL::Funcgen::DBSQL::DBAdaptor module not found so functional
genomics databases will be ignored if found
No Compara databases found
No ancestral database found
No GO database found

And with ensembldb.ensembl.org:
>./gene2coords.pl WT_up_genes.txt
Will only load v53 databases
Species 'saccharomyces_cerevisiae' loaded from database
'saccharomyces_cerevisiae_core_53_1i'
Species 'oryctolagus_cuniculus' loaded from database
'oryctolagus_cuniculus_core_53_1h'
Species 'gorilla_gorilla' loaded from database 'gorilla_gorilla_core_53_1'
Species 'ciona_savignyi' loaded from database 'ciona_savignyi_core_53_2h'
Species 'echinops_telfairi' loaded from database
'echinops_telfairi_core_53_1g'
Species 'myotis_lucifugus' loaded from database
'myotis_lucifugus_core_53_1g'
Species 'taeniopygia_guttata' loaded from database
'taeniopygia_guttata_core_53_1'
Species 'homo_sapiens' loaded from database 'homo_sapiens_core_53_36o'
Species 'dipodomys_ordii' loaded from database 'dipodomys_ordii_core_53_1b'
Species 'sorex_araneus' loaded from database 'sorex_araneus_core_53_1e'
Species 'otolemur_garnettii' loaded from database
'otolemur_garnettii_core_53_1e'
Species 'erinaceus_europaeus' loaded from database
'erinaceus_europaeus_core_53_1e'
Species 'anolis_carolinensis' loaded from database
'anolis_carolinensis_core_53_1'
Species 'canis_familiaris' loaded from database
'canis_familiaris_core_53_2k'
Species 'dasypus_novemcinctus' loaded from database
'dasypus_novemcinctus_core_53_2'
Species 'ornithorhynchus_anatinus' loaded from database
'ornithorhynchus_anatinus_core_53_1j'
Species 'tetraodon_nigroviridis' loaded from database
'tetraodon_nigroviridis_core_53_8b'
Species 'tursiops_truncatus' loaded from database
'tursiops_truncatus_core_53_1b'
Species 'tarsius_syrichta' loaded from database
'tarsius_syrichta_core_53_1b'
Species 'vicugna_pacos' loaded from database 'vicugna_pacos_core_53_1b'
Species 'xenopus_tropicalis' loaded from database
'xenopus_tropicalis_core_53_41m'
Species 'mus_musculus' loaded from database 'mus_musculus_core_53_37f'
Species 'bos_taurus' loaded from database 'bos_taurus_core_53_4c'
Species 'aedes_aegypti' loaded from database 'aedes_aegypti_core_53_1d'
Species 'monodelphis_domestica' loaded from database
'monodelphis_domestica_core_53_5h'
Species 'choloepus_hoffmanni' loaded from database
'choloepus_hoffmanni_core_53_1'
Species 'cavia_porcellus' loaded from database 'cavia_porcellus_core_53_3a'
Species 'anopheles_gambiae' loaded from database
'anopheles_gambiae_core_53_3k'
Species 'rattus_norvegicus' loaded from database
'rattus_norvegicus_core_53_34u'
Species 'takifugu_rubripes' loaded from database
'takifugu_rubripes_core_53_4k'
Species 'caenorhabditis_elegans' loaded from database
'caenorhabditis_elegans_core_53_190'
Species 'pteropus_vampyrus' loaded from database
'pteropus_vampyrus_core_53_1b'
Species 'microcebus_murinus' loaded from database
'microcebus_murinus_core_53_1b'
Species 'ochotona_princeps' loaded from database
'ochotona_princeps_core_53_1c'
Species 'pan_troglodytes' loaded from database 'pan_troglodytes_core_53_21j'
Species 'felis_catus' loaded from database 'felis_catus_core_53_1f'
Species 'equus_caballus' loaded from database 'equus_caballus_core_53_2c'
Species 'procavia_capensis' loaded from database
'procavia_capensis_core_53_1b'
Species 'oryzias_latipes' loaded from database 'oryzias_latipes_core_53_1i'
Species 'macaca_mulatta' loaded from database 'macaca_mulatta_core_53_10k'
Species 'danio_rerio' loaded from database 'danio_rerio_core_53_7e'
Species 'gallus_gallus' loaded from database 'gallus_gallus_core_53_2k'
Species 'tupaia_belangeri' loaded from database
'tupaia_belangeri_core_53_1f'
Species 'ciona_intestinalis' loaded from database
'ciona_intestinalis_core_53_2l'
Species 'loxodonta_africana' loaded from database
'loxodonta_africana_core_53_2'
Species 'spermophilus_tridecemlineatus' loaded from database
'spermophilus_tridecemlineatus_core_53_1g'
Species 'pongo_pygmaeus' loaded from database 'pongo_pygmaeus_core_53_1c'
Species 'drosophila_melanogaster' loaded from database
'drosophila_melanogaster_core_53_54a'
Species 'gasterosteus_aculeatus' loaded from database
'gasterosteus_aculeatus_core_53_1j'
homo_sapiens_cdna_53_36o loaded
mus_musculus_cdna_53_37f loaded
mus_musculus_vega_53_37f loaded
homo_sapiens_vega_53_36o loaded
takifugu_rubripes_otherfeatures_53_4k loaded
danio_rerio_otherfeatures_53_7e loaded
pan_troglodytes_otherfeatures_53_21j loaded
taeniopygia_guttata_otherfeatures_53_1 loaded
rattus_norvegicus_otherfeatures_53_34u loaded
oryzias_latipes_otherfeatures_53_1i loaded
drosophila_melanogaster_otherfeatures_53_54a loaded
saccharomyces_cerevisiae_otherfeatures_53_1i loaded
gallus_gallus_otherfeatures_53_2k loaded
homo_sapiens_otherfeatures_53_36o loaded
xenopus_tropicalis_otherfeatures_53_41m loaded
pongo_pygmaeus_otherfeatures_53_1c loaded
gasterosteus_aculeatus_otherfeatures_53_1j loaded
bos_taurus_otherfeatures_53_4c loaded
tetraodon_nigroviridis_otherfeatures_53_8b loaded
anolis_carolinensis_otherfeatures_53_1 loaded
cavia_porcellus_otherfeatures_53_3a loaded
equus_caballus_otherfeatures_53_2c loaded
macaca_mulatta_otherfeatures_53_10k loaded
canis_familiaris_otherfeatures_53_2k loaded
ciona_savignyi_otherfeatures_53_2h loaded
ornithorhynchus_anatinus_otherfeatures_53_1j loaded
mus_musculus_otherfeatures_53_37f loaded
ciona_intestinalis_otherfeatures_53_2l loaded
anopheles_gambiae_otherfeatures_53_3k loaded
Bio::EnsEMBL::Variation::DBSQL::DBAdaptor module not found so variation
databases will be ignored if found
Bio::EnsEMBL::Funcgen::DBSQL::DBAdaptor module not found so functional
genomics databases will be ignored if found
Bio::EnsEMBL::Compara::DBSQL::DBAdaptor not found so the following compara
databases will be ignored: ensembl_compara_53
ensembl_ancestral_53 loaded
GO software not installed so GO database ensembl_go_53 will be ignored
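
A minimal sketch of a guard for the useastdb case above, where the registry loads no species and get_adaptor() then hands back undef. It is not part of the original script. FakeRegistry and require_adaptor are hypothetical names used only so the snippet runs without the Ensembl API; in a real script the registry would be 'Bio::EnsEMBL::Registry'.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::EnsEMBL::Registry, used only so this
# sketch runs without the Ensembl API or a database connection.
package FakeRegistry;
sub new { my ($class, %species) = @_; return bless { %species }, $class; }
sub get_adaptor {
    my ($self, $species, $group, $type) = @_;
    return $self->{ lc $species };    # undef when the species was never loaded
}

package main;

# Return the adaptor, or die with a message that names the likely cause
# instead of the generic "Can't call method ... on an undefined value".
sub require_adaptor {
    my ($registry, $species, $group, $type) = @_;
    my $adaptor = $registry->get_adaptor($species, $group, $type);
    die "No $type adaptor for $species/$group: did the registry load any databases?\n"
        unless defined $adaptor;
    return $adaptor;
}

my $empty = FakeRegistry->new();    # simulates a registry that loaded nothing
my $ok = eval { require_adaptor($empty, 'Human', 'Core', 'gene'); 1 };
print "guard fired: $@" unless $ok;
```

With a check like this, the script stops at the point where the adaptor is requested, which makes an empty registry easier to spot than a failure further down in the lookup code.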


On 20 April 2012 15:51, Andy Yates <ayates at ebi.ac.uk> wrote:

> Hi Sean,
>
> That is odd. Using the 53 API is the best way to access v53 data. Could
> you change your code to the following:
>
> $registry->load_registry_from_db(-host => 'ensembldb.ensembl.org', -user
> => 'anonymous', -verbose => 1);
>
> This will emit a lot of debug information about the databases the registry
> can find. Please send that output back to us; we should be able to debug
> your problem then. Also, can you send the latest version of your script,
> please?
>
> Many thanks,
>
> Andy
>
> Andrew Yates                   Ensembl Core Software Project Leader
> EMBL-EBI                       Tel: +44-(0)1223-492538
> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK         http://www.ensembl.org/
>
> On 20 Apr 2012, at 20:34, Sean O'Keeffe wrote:
>
> > Hi Andy,
> > You are indeed spot on. I am using the Ensembl 53 API. Switching to
> > Ensembl 66 solves the issue.
> > However, I'm trying to extract hg18 coordinates, not hg19; this was why
> > I used ensembl_53.
> > What should I do to get these coords?
> >
> > Sean.
> >
> > On 20 April 2012 12:40, Andy Yates <ayates at ebi.ac.uk> wrote:
> > Hi Sean,
> >
> > Normally, a response saying "can't call method on undefined value"
> > points to you using an unreleased API version. Can you confirm the
> > version of Ensembl you are using, please? Also, can you run the program
> > ensembl/misc-scripts/ping_ensembl.pl, which will attempt to diagnose
> > your connection/setup?
> >
> > All the best,
> >
> > Andy
> >
> >
> > On 20 Apr 2012, at 16:16, Sean O'Keeffe wrote:
> >
> > > Thanks for the response Javier.
> > >
> > > I see the reference to an array of objects and I've implemented this.
> > > However, I don't get it. The script dies at the call to
> > > fetch_all_by_external_name(): Can't call method
> > > "fetch_all_by_external_name" on an undefined value.
> > > It never gets to the loop over the array of objects. The variable $id
> > > is valid and prints out prior to the script dying.
> > >
> > > ...
> > > print $id,"\n";
> > > my $adaptor = $registry->get_adaptor( 'Human', 'Core', 'gene' );
> > >
> > > my $gene = $adaptor->fetch_all_by_external_name($id);
> > >
> > >   foreach $g(@$gene){
> > >     $chr = $g->seq_region_name();
> > >     $start = $g->seq_region_start();
> > >     $end = $g->seq_region_end();
> > >     print OUT join("\t", $chr,$start,$end,$id),"\n";
> > >   }
> > >
> > >
> > > On 20 April 2012 00:49, Javier Herrero <jherrero at ebi.ac.uk> wrote:
> > > Dear Sean
> > >
> > > The method fetch_all_by_external_name returns a reference to an array
> > > of Bio::EnsEMBL::Gene objects. All the methods named "fetch_all_by..."
> > > return a reference to an array. The array might be empty or contain
> > > just one entry, but you will always get a reference to an array. In
> > > contrast, all the methods named "fetch_by..." return either undef or
> > > a single object.
> > >
> > > Typically, you would use a foreach loop to go through all the
> > > returned objects:
> > >
> > >
> > > open OUT, ">$gene_file.coords";
> > > for my $geneid ( @unique ) {
> > >     chomp $geneid;
> > >     ensembl_coords($geneid);
> > > }
> > >
> > > sub ensembl_coords {
> > >   my ($id) = @_;
> > >
> > >   my $adaptor = $registry->get_adaptor( 'Human', 'Core', 'gene' );
> > >
> > >   my $all_genes = $adaptor->fetch_all_by_external_name($id);
> > >
> > >   foreach my $gene (@$all_genes) {
> > >
> > >     my $chr   = $gene->seq_region_name();
> > >     my $start = $gene->seq_region_start();
> > >     my $end   = $gene->seq_region_end();
> > >     # I have added the original $id here
> > >     print OUT join("\t", $chr, $start, $end, $id), "\n";
> > >   }
> > >
> > > }
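
The two error messages in the original post follow from the return types described above. The sketch below reproduces both without the Ensembl API. FakeGene and error_of are hypothetical names introduced for illustration, and the chromosome value is a placeholder, not a lookup result.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::EnsEMBL::Gene, so the sketch runs
# without the Ensembl API.
package FakeGene;
sub new             { my ($class, $chr) = @_; return bless { chr => $chr }, $class; }
sub seq_region_name { return $_[0]{chr}; }

package main;

# Run a code block and return the error it dies with, minus the
# "at FILE line N." suffix; returns '' if it does not die.
sub error_of {
    my ($code) = @_;
    my $ok = eval { $code->(); 1 };
    return '' if $ok;
    (my $msg = $@) =~ s/ at .*//s;
    return $msg;
}

my $all_genes = [ FakeGene->new('17') ];   # shape of a fetch_all_by_* result (placeholder value)
my $no_gene   = undef;                     # what a fetch_by_* call may return

# Calling a method on the array reference itself: the first error in the thread.
print error_of(sub { $all_genes->seq_region_name() }), "\n";

# Calling a method on undef: the second error in the thread.
print error_of(sub { $no_gene->seq_region_name() }), "\n";

# Dereferencing the array first works, and is a no-op for an empty array.
print $_->seq_region_name(), "\n" for @$all_genes;
```

The first print gives the "unblessed reference" message and the second the "undefined value" message, so the wording of the error already tells you whether you were handed an array reference or nothing at all.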
> > >
> > >
> > > I hope this helps
> > >
> > > Javier
> > >
> > >
> > >
> > >
> > > On 20/04/12 04:49, Sean O'Keeffe wrote:
> > >> Hi,
> > >> I've used the code below on multiple occasions to convert external
> > >> gene names to chromosome coords and it worked fine.
> > >> However when I tried it just now I get the error for the very first
> > >> gene DNAI2 and the script crashes:
> > >>
> > >> Can't call method "seq_region_name" on unblessed reference
> > >>
> > >> When I tried fetch_by_display_label($id) - I get:
> > >>
> > >> Can't call method "seq_region_name" on an undefined value
> > >>
> > >> Have I missed something?
> > >> Thanks for any help,
> > >> Sean.
> > >>
> > >> p.s. I tried connecting to useastdb.ensembl.org, as I'm in the
> > >> States, but it gave the following (maybe the 2 issues are related):
> > >>
> > >> DBI connect('host=useastdb.ensembl.org;port=3306','anonymous',...) failed: Can't connect to MySQL server on 'useastdb.ensembl.org' (111) at /home/sean/tools/ensembl_53/modules/Bio/EnsEMBL/Registry.pm line 1329
> > >> Can't call method "selectall_arrayref" on an undefined value at /home/sean/tools/ensembl_53/modules/Bio/EnsEMBL/Registry.pm line 1332.
> > >>
> > >> ==============
> > >>
> > >> #!/usr/bin/perl
> > >>
> > >> use strict;
> > >> use lib '/home/sean/tools/ensembl_53/modules';
> > >>
> > >> use Bio::SeqIO;
> > >> use Bio::Root::IO;
> > >> use Bio::EnsEMBL::DBSQL::BaseAdaptor;
> > >> use Bio::EnsEMBL::Registry;
> > >>
> > >> my $registry = 'Bio::EnsEMBL::Registry';
> > >> #$registry->load_registry_from_db(-host => 'useastdb.ensembl.org', -user => 'anonymous');
> > >> $registry->load_registry_from_db(-host => 'ensembldb.ensembl.org', -user => 'anonymous');
> > >>
> > >> open OUT, ">$gene_file.coords";
> > >> for my $geneid ( @unique ) {
> > >>     chomp $geneid;
> > >>     ($chr,$start, $end) = ensembl_coords($geneid);
> > >>     print OUT join("\t", $chr,$start,$end,$geneid),"\n";
> > >> }
> > >>
> > >> sub ensembl_coords {
> > >>   my ($id) = @_;
> > >>
> > >>   my $adaptor = $registry->get_adaptor( 'Human', 'Core', 'gene' );
> > >>
> > >>   my $gene = $adaptor->fetch_all_by_external_name($id);
> > >>   # my $gene = $adaptor->fetch_by_display_label($id);
> > >>
> > >>   $chr = $gene->seq_region_name();
> > >>   $start = $gene->seq_region_start();
> > >>   $end = $gene->seq_region_end();
> > >>   return ($chr,$start,$end);
> > >>
> > >> }
> > >>
> > >>
> > >> _______________________________________________
> > >> Dev mailing list
> > >> Dev at ensembl.org
> > >>
> > >> List admin (including subscribe/unsubscribe):
> > >> http://lists.ensembl.org/mailman/listinfo/dev
> > >>
> > >> Ensembl Blog:
> > >> http://www.ensembl.info/
> > >
> > > --
> > > Javier Herrero, PhD
> > > Ensembl Coordinator and Ensembl Compara Project Leader
> > > European Bioinformatics Institute (EMBL-EBI)
> > > Wellcome Trust Genome Campus, Hinxton
> > > Cambridge - CB10 1SD - UK
> > >
> > >
> > > _______________________________________________
> > > Dev mailing list    Dev at ensembl.org
> > > List admin (including subscribe/unsubscribe):
> http://lists.ensembl.org/mailman/listinfo/dev
> > > Ensembl Blog: http://www.ensembl.info/
> > >
> > >
> > > _______________________________________________
> > > Dev mailing list    Dev at ensembl.org
> > > List admin (including subscribe/unsubscribe):
> http://lists.ensembl.org/mailman/listinfo/dev
> > > Ensembl Blog: http://www.ensembl.info/
> >
> >
> > _______________________________________________
> > Dev mailing list    Dev at ensembl.org
> > List admin (including subscribe/unsubscribe):
> http://lists.ensembl.org/mailman/listinfo/dev
> > Ensembl Blog: http://www.ensembl.info/
> >
> >
> > _______________________________________________
> > Dev mailing list    Dev at ensembl.org
> > List admin (including subscribe/unsubscribe):
> http://lists.ensembl.org/mailman/listinfo/dev
> > Ensembl Blog: http://www.ensembl.info/
>
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> List admin (including subscribe/unsubscribe):
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>
>