[ensembl-dev] Inquiry: Cafe.v3 Input

Mike Montague mmontagu at genome.wustl.edu
Mon Jul 29 23:16:26 BST 2013


Hello, 

I submitted the inquiry below to Miguel Pignatelli, and he suggested that I forward the question to you. Judging from our error messages, he suspects either a problem connecting to the database or a general API issue. I've attached the script that we are attempting to use.

Thank you for your assistance -
Mike Montague
____________

Michael J. Montague, Ph.D.
Post-Doctoral Research Associate
The Genome Institute, Washington University School of Medicine
4444 Forest Park Avenue, Campus Box 8501
St. Louis, MO 63108
T. 718.809.4093

Hello Miguel,

With the release of Cafe.v3, we intend to use your script to obtain data from Ensembl for input into Cafe. We are interested in gene expansion outliers for the cat lineage (ID 9685) relative to Carnivora (ID 33554).

We made a copy of get_expansions.pl (attached below) and added line 10 to call on the Ensembl API here at the Genome Institute. It seems to load fine (line 28).

L10: use lib '/gscmnt/sata206/techd/ensembl_api/v70/ensembl/modules';

The human adaptor loads fine (line 33), but we're wondering what it has to do with cat.

L33: my $human_gene_adaptor = $reg->get_adaptor("Homo sapiens", "core", "Gene");

The problem is with line 35, which returns an undefined value, so the script crashes at lines 38, 39 and 40.

L35: my $comparaDBA = Bio::EnsEMBL::Registry->get_DBAdaptor('Multi', 'compara');
L38: my $gene_member_adaptor   = $comparaDBA->get_GeneMemberAdaptor;

Error:

test_get_expansions.pl 9685 > cat_expansions.txt

Can't call method "get_GeneMemberAdaptor" on an undefined value at /gscuser/mmontagu/bin/test_get_expansions.pl line 38.
cafe_output$ 

or

perl ~/bin/test_get_expansions.pl 9685 > cat_expansions.txt

DBD::mysql::db selectall_arrayref failed: Can't read dir of '.' (errno: 24) at /gscmnt/sata206/techd/ensembl_api/v70/ensembl/modules/Bio/EnsEMBL/Registry.pm line 1600.

-------------------- WARNING ----------------------
MSG: Homo sapiens is not a valid species name (check DB and API version)
FILE: Bio/EnsEMBL/Registry.pm LINE: 1187
CALLED BY: Bio/EnsEMBL/Registry.pm  LINE: 972
Date (localtime)    = Wed Jul 24 14:14:00 2013
Ensembl API version = 70
---------------------------------------------------

-------------------- EXCEPTION --------------------
MSG: Can not find internal name for species 'Homo sapiens'
STACK Bio::EnsEMBL::Registry::get_adaptor /gscmnt/sata206/techd/ensembl_api/v70/ensembl/modules/Bio/EnsEMBL/Registry.pm:974
STACK toplevel /gscuser/mmontagu/bin/test_get_expansions.pl:33
Date (localtime)    = Wed Jul 24 14:14:00 2013
Ensembl API version = 70
---------------------------------------------------
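
The numeric `errno: 24` in the DBD::mysql message above can be decoded with the standard library. A minimal sketch, assuming the code was raised on a POSIX-like host (where 24 normally maps to EMFILE, "too many open files"); if so, it would point at a resource limit on the database side rather than at the species name. The helper name `describe_errno` is hypothetical.

```python
import errno
import os

def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric OS error code."""
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"

# 24 is the code reported in the DBD::mysql error message.
print(describe_errno(24))
```

The symbolic name is the useful part here; the accompanying message text varies between platforms.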

Mainly, we are wondering whether the inputs on lines 33 and 35 are correct for cat.

Or alternatively, are we calling on an incorrect version of the API? Or perhaps we're overlooking something else altogether?



> From: mp at ebi.ac.uk
> Subject: Re: Inquiry: Cafe.v3 Input
> Date: July 27, 2013 9:28:02 AM CDT
> To: "Mike Montague" <mmontagu at genome.wustl.edu>
> Cc: mp at ebi.ac.uk, "Wes Warren" <wwarren at genome.wustl.edu>


> Dear Mike,
> 
> I am out of the office for some weeks, with very limited access to the
> internet at the moment. Judging by your error, the problem is either in
> connecting to the DB or a general Ensembl API problem. I don't have the
> script at hand, but if you send the relevant parts to
> ensembl-dev at ebi.ac.uk you may get a useful answer quickly.
> I will take a look as soon as I can once I am back.
> 
> Cheers,
> 
> M;

> 
> 
> From: Wes Warren <wwarren at genome.wustl.edu>
> Subject: Fwd: Re: Fwd: dN/dS
> Date: July 9, 2013 1:48:47 PM CDT
> To: Mike Montague <mmontagu at genome.wustl.edu>
> 
> Mike,
> 
> These are the scripts you will need to retrieve cat data for cafe3. For cafe3 you need 1) a data file containing
> gene family sizes for the taxa included in the tree and 2) a Newick-formatted tree, including branch lengths for each gene family.
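
The first of the two inputs mentioned above, the family-size table, could be written out roughly as follows. This is only a sketch: it assumes a tab-delimited layout with a `Description` column, an `ID` column, and then one count column per taxon, which should be checked against the CAFE manual. The function name, family IDs and counts are hypothetical and are not Ensembl data.

```python
import csv
import io

def write_family_table(families, taxa, handle):
    """Write gene family sizes as a tab-delimited table.

    `families` maps a family ID to (description, {taxon: count}).
    Assumed layout: Description, ID, then one column per taxon.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["Description", "ID"] + list(taxa))
    for family_id, (description, counts) in families.items():
        # A taxon with no members in the family is written as 0.
        row = [description, family_id] + [counts.get(t, 0) for t in taxa]
        writer.writerow(row)

# Hypothetical example values, not real Ensembl data.
taxa = ["cat", "dog", "panda"]
families = {
    "FAM1": ("example family", {"cat": 5, "dog": 2, "panda": 2}),
    "FAM2": ("another family", {"cat": 1, "dog": 1}),
}
buffer = io.StringIO()
write_family_table(families, taxa, buffer)
print(buffer.getvalue(), end="")
```

The taxon names in the header would need to match the leaf names in the Newick tree supplied as the second input.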
> 
> Wes
> 
> -------- Original Message -------
> Subject:	Re: Fwd: dN/dS
> Date:	Fri, 08 Mar 2013 17:24:06 +0000
> From:	Miguel Pignatelli <mp at ebi.ac.uk>
> To:	Wes Warren <wwarren at genome.wustl.edu>
> CC:	Javier Herrero <jherrero at ebi.ac.uk>, Matthieu Muffato <muffato at ebi.ac.uk>
> 
> 
> Hi Wes,
> 
> The CAFE data is stored in the compara database and is accessible via 
> the compara API.
> 
> I am sending you a file with the 43 genes expanded in cat with respect 
> to the carnivora taxon.
> 
> The columns in the file are:
> 
> gene_tree_root_id, gene names, number of members in cat, number of 
> members in the parent (carnivora), newick format of the species tree.
> 
> A couple of comments about this data:
> 
> - The newick tree contains the species names and the taxon_ids for the 
> internal nodes. The numbers are the number of members for each taxon.
> - I have only considered significant expansions as reported by our CAFE 
> analysis. It would be possible to include genes in the "grey" zone (i.e. 
> genes with more members in cat vs carnivora, but the change not being 
> significant).
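
The five columns described above can be read with a few lines of code. This is a minimal sketch, assuming the file is tab-separated with the columns in the order listed and gene names separated by commas. The delimiters, the `ExpansionRow` name and the sample line are assumptions for illustration and are not taken from the actual file.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ExpansionRow:
    gene_tree_root_id: str
    gene_names: List[str]
    members_in_cat: int
    members_in_parent: int
    species_tree_newick: str

    @property
    def gain(self) -> int:
        """Members in cat minus members in the parent (carnivora) node."""
        return self.members_in_cat - self.members_in_parent

def parse_row(line: str) -> ExpansionRow:
    """Parse one line; assumes exactly five tab-separated fields."""
    root_id, names, n_cat, n_parent, newick = line.rstrip("\n").split("\t")
    return ExpansionRow(root_id, [n for n in names.split(",") if n],
                        int(n_cat), int(n_parent), newick)

# Made-up sample line (hypothetical IDs, names and tree).
sample = "12345\tGENE_A,GENE_B\t5\t2\t(cat_5,dog_2)33554_2;\n"
row = parse_row(sample)
print(row.gene_tree_root_id, row.gain)
```

Sorting the parsed rows by `gain` would be one simple way to rank the reported expansions.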
> 
> I am also sending you the script I have written to retrieve the data. It 
> accepts a taxon_id as input. I called it as follows:
> 
> $ perl get_expansions.pl 9685 > cat_expansions.txt
> 
> If you run the script with other taxon_ids, be aware that it also 
> reports genes for which the lowest common ancestor is the taxon_id of 
> interest (i.e. births at the taxon_id of interest).
> 
> Let me know if you have any doubts or need further help in getting the 
> data you want.
> 
> Cheers,
> 
> M;
> 
> 
> On 07/03/13 20:15, Wes Warren wrote:
> > Hi Miguel,
> >
> > Is this data available for cat? All the gene trees are available so I
> > assume so but I am not sure where to find the CAFE files.
> >
> > Thanks,
> > Wes
> >
> > On 3/7/13 12:15 PM, Javier Herrero wrote:
> >> Hi Wes
> >>
> >> This is a question for Miguel, our CAFE expert.
> >>
> >> Javier
> >>
> >> On 06/03/13 17:00, Wes Warren wrote:
> >>> Is the CAFE output available for cat? We are interested in gene
> >>> expansion outliers for the cat lineage, perhaps correlating these
> >>> with carnivora.
> >>>
> >>> Wes
> >>>


-------------- next part --------------
A non-text attachment was scrubbed...
Name: test_get_expansions.pl
Type: text/x-perl-script
Size: 3356 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20130729/26a69c7b/attachment.bin>