[ensembl-dev] Variant Effect predictor error message (Mark Aquino)

Thu Oct 27 10:51:56 BST 2011

Regarding this issue...

I believe that you are having the same problem I was a week ago.

You installed the latest version of the Ensembl database (v64), but you are querying it with v63 of the API.

A week ago I was told that there had been a problem when the website was migrated to new hardware, and that the API available for download on the Ensembl website was actually v63 rather than v64. So if you installed that API together with the v64 database, the API would query v63 and, because it could not find any tables, it would complain about the species not being found.

I believe they have now corrected the API on the website, so download it again and use it with your scripts.
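
As a rough illustration of the version check described above, here is a minimal shell sketch. It assumes the usual Ensembl database naming pattern <species>_<group>_<release>_<assembly> (for example homo_sapiens_core_64_37). The helper names db_release and releases_match are made up for this example and are not part of the Ensembl API.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Ensembl): compare the release number
# embedded in a database name with the release of the installed API.

# Print the release number from a name such as homo_sapiens_core_64_37.
# Assumes the pattern <species>_<group>_<release>_<assembly>; prints
# nothing if the name does not fit that pattern.
db_release() {
    printf '%s\n' "$1" | sed -n 's/^.*_\([0-9][0-9]*\)_[0-9][0-9a-z]*$/\1/p'
}

# Succeed only when the database release equals the given API release.
releases_match() {
    [ "$(db_release "$1")" = "$2" ]
}

if releases_match homo_sapiens_core_64_37 63; then
    echo "API and database releases agree"
else
    echo "mismatch: database is release $(db_release homo_sapiens_core_64_37), API is 63"
fi
```

The API's own release number can be printed with perl -MBio::EnsEMBL::ApiVersion -e 'print software_version(), "\n"', provided the ensembl modules are on PERL5LIB.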

Cheers

Duarte

----------------------------------------------------------------------

Message: 1
Date: Wed, 26 Oct 2011 08:17:35 -0400
From: Mark Aquino <aquinom85 at me.com>
Subject: Re: [ensembl-dev] Variant Effect predictor error message
To: Will McLaren <wm2 at ebi.ac.uk>
Cc: "dev at ensembl.org" <dev at ensembl.org>
Message-ID: <1CBD4EC6-F6F5-4B86-8F96-00A89FCD7535 at me.com>
Content-Type: text/plain; CHARSET=US-ASCII

Luca--

That's actually the first thing I tried, but the error just changed to: "human" is not a valid species name for this instance.

Will --

I'll have to go back and double-check this, but I was using the ensembl and ensembl-variation APIs and BioPerl 1.2.3, all from the Ensembl downloads page that essentially lists the "pre-requisite files" for the VEP (re-downloaded yesterday, so whatever is most up to date via those links). Could it be a versioning conflict with BioPerl? 1.2.3 is quite old; isn't it on 1.6+ now?

I was receiving this error while trying to connect to the Ensembl databases rather than my local DB, since I haven't had time to update the local one yet. I was using only sift/pph2/condel, format vcf, --no-whole-genome, and a port flag (3306).

In the meantime, as a quick fix, I just ran my files using VEP 2.0, the older APIs I had from before, and my locally installed Ensembl database.

PS. On updating the databases locally: is there a smarter way to do this than deleting the entire previous build, reloading the new schema, and reinserting all the tables fresh? That is what I have been doing.

Sent from my iPhone

On Oct 26, 2011, at 4:52 AM, Will McLaren <wm2 at ebi.ac.uk> wrote:

> Hi Mark,
>
> Are you using the cache or connecting to the database?
>
> In either case, is your API up to date and/or in sync with the version
> number of the database/cache you are using?
>
> Will
>
> On 26 October 2011 08:51, Venturini Luca <vntlcu41 at univr.it> wrote:
>> I would try using "human" as the species name; that usually works. Also,
>> set the verbose flag: if the connection to the database works, you
>> should see a list of all available species/datasets loaded from MySQL.
>>
>> - Luca
>>
>>
>> On 25/10/2011 16:38, Mark Aquino wrote:
>>>
>>> Hi,
>>>
>>> I'm getting this error:
>>>
>>> homo_sapiens is not a valid species name for this instance
>>> homo_sapiens is not a valid species name for this instance
>>> Use of uninitialized value $species in hash element at
>>> /users/maquino/src/ensembl/modules/Bio/EnsEMBL/Registry.pm line 958.
>>> Use of uninitialized value $species in hash element at
>>> /users/maquino/src/ensembl/modules/Bio/EnsEMBL/Registry.pm line 969.
>>> ERROR: Could not connect to core database
>>>
>>> When I try to run the variant effect predictor.
>>>
>>> Is the core database still down and that's what's causing this error or is
>>> there something else going wrong here?
>>>
>>> -Mark
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> List admin (including subscribe/unsubscribe):
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/

------------------------------

Message: 2
Date: Wed, 26 Oct 2011 13:47:33 +0100
From: Andreas Kusalananda Kähäri <ak at ebi.ac.uk>
Subject: Re: [ensembl-dev] Variant Effect predictor error message
To: Mark Aquino <aquinom85 at me.com>
Cc: dev at ensembl.org
Message-ID:
        <CAD08GDaVva=MFjWh5gkS=CMN6tJt4uJYDUFSLvG9xw_J_f13CA at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi Mark,

Sorry for not spotting your message until now.

Did you get the Ensembl Core API from the tar archive here?:
http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl.tar.gz?root=ensembl&only_with_tag=branch-ensembl-64&view=tar
In that case, I don't know why you're seeing this problem unless
you're pointing the VEP at a server with no release 64 database on it.

Alternatively, if you have checked out the API through CVS, I'm
guessing that this is because you have checked out the HEAD revision
of the Ensembl Core API. This revision of the API will currently look
for release 65 databases on the server that you point it at
(ensembldb.ensembl.org, for example), but since these have not been
released yet it will fail to find any. In this case, try checking out
the 'branch-ensembl-64' branch instead.

That's my initial diagnosis.

AKK

--
Andreas Kusalananda Kähäri, Ensembl Software Developer
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge CB10 1SD, United Kingdom

------------------------------

Message: 3
Date: Wed, 26 Oct 2011 13:22:49 +0100
From: Will McLaren <wm2 at ebi.ac.uk>
Subject: Re: [ensembl-dev] Variant Effect predictor error message
To: Mark Aquino <aquinom85 at me.com>
Cc: "dev at ensembl.org" <dev at ensembl.org>
Message-ID:
        <CAMVEDX3C=KF8hf_YMs-=XR5_hb5gZtKvdE90Xw13j9djYznHvw at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi Mark,

If you're using the public Ensembl DBs, the default port for
ensembldb.ensembl.org is 5306, not 3306.

This might be your problem.
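
To make the port difference concrete, a small shell sketch follows. The helper name ensembl_port is invented for this example. The value 5306 for ensembldb.ensembl.org is the one given above, and 3306 is the MySQL default that a local server would typically use.

```shell
#!/bin/sh
# Hypothetical helper: choose the MySQL port for a given database host.
ensembl_port() {
    case "$1" in
        ensembldb.ensembl.org) echo 5306 ;;  # public Ensembl server (per this thread)
        *)                     echo 3306 ;;  # MySQL default, e.g. a local install
    esac
}

host=ensembldb.ensembl.org
echo "connect to $host on port $(ensembl_port "$host")"
```

The connection itself can be tested independently of the VEP with something like: mysql -h ensembldb.ensembl.org -P 5306 -u anonymous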

Will

------------------------------

Message: 4
Date: Wed, 26 Oct 2011 21:15:31 +0800
From: Zhang Di <aureliano.jz at gmail.com>
Subject: [ensembl-dev] problems for running lowcoverage annotation
        pipeline
To: dev at ensembl.org
Message-ID:
        <CAMHeD-d2pwK1ynK3KA0Qzo9azOMi3Dzfas366TJ9NeivydyOAQ at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi
  I'm trying to use the lowcoverage annotation pipeline. After reading the
docs in ensembl-doc/pipeline-docs, I found that I first have to construct
a compara database containing the whole-genome alignment results. So I
checked out ensembl-compara and ensembl-hive and tried to run the WGA pipeline
with the following steps:
  1. Downloaded the reference genome MySQL files from Ensembl, which worked
fine.
  2. Prepared my genome of interest according to the guide in
ensembl-doc/loading_sequence_into_ensembl.txt. This still seemed fine.
  3. Prepared the conf file and ran comparaLoadGenomes.pl and
loadPairAlignerSystem.pl. The tables in the compara database seem fine.
  4. Ran beekeeper.pl. The problem occurs in the second analysis,
ChunkAndGroupDNA. There are two jobs to run, one for the reference genome
and the other for my genome of interest. The reference genome loaded
correctly, but no data for my genome of interest was loaded into the dnafrag,
dnafrag_chunk and dna_collection tables. One line in the job_id_3.out file
(the job for ChunkAndGroupDNA on my genome of interest) reads:
number of seq_regions 0

  The problem seems to be that the pipeline fails to see my genome data. Should
I add some special info to the meta table of the database for my genome of interest?

  Thank you.

--
Zhang Di

------------------------------

Message: 5
Date: Wed, 26 Oct 2011 14:44:42 +0100
From: Kathryn Beal <kbeal at ebi.ac.uk>
Subject: Re: [ensembl-dev] problems for running lowcoverage annotation
        pipeline
To: Zhang Di <aureliano.jz at gmail.com>
Cc: dev at ensembl.org
Message-ID: <4EA80ECA.8050206 at ebi.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1

Hi,
Could you send me the conf file you used? Can you check in the genome_db table that the locator field points
to the location of the core database?
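
For reference, a minimal shell sketch of that check. It assumes the locator is stored in the form Module/key=value;key=value;... which should be verified against your own genome_db rows. The helper name locator_field and the sample values are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key (host, port, dbname, ...)
# from a locator string of the assumed form
#   Some::Module/host=...;port=...;user=...;dbname=...
locator_field() {
    # $1 = locator string, $2 = key to look up
    printf '%s\n' "$1" | sed 's|^[^/]*/||' | tr ';' '\n' | sed -n "s/^$2=//p"
}

# Sample value only; a real one would come from a query such as
#   mysql -h host -P port -u user -p compara_db \
#         -e 'SELECT genome_db_id, name, locator FROM genome_db'
sample='Bio::EnsEMBL::DBSQL::DBAdaptor/host=myhost;port=3306;user=ro;dbname=my_species_core_64_1'

echo "host:   $(locator_field "$sample" host)"
echo "port:   $(locator_field "$sample" port)"
echo "dbname: $(locator_field "$sample" dbname)"
```

If the host, port or dbname printed here do not match the core database that holds the new assembly, the pipeline would be looking in the wrong place.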

Cheers
Kathryn

--
Dr Kathryn Beal
EnsEMBL
EMBL-European Bioinformatics Institute       Tel. +44 (0)1223 494458
Wellcome Trust Genome Campus, Hinxton        Fax. +44 (0)1223 494468
Cambridge CB10 1SD, UK

------------------------------

Message: 6
Date: Thu, 27 Oct 2011 10:30:19 +0100
From: Bronwen Aken <ba1 at sanger.ac.uk>
Subject: Re: [ensembl-dev] problems for running lowcoverage annotation
        pipeline
To: Zhang Di <aureliano.jz at gmail.com>
Cc: dev at ensembl.org
Message-ID: <7BE97079-3E98-467E-800C-73C3B1732C25 at sanger.ac.uk>
Content-Type: text/plain; charset="us-ascii"

Hi there,

Did the set_toplevel.pl script run successfully when you were loading your assembly? This script will add seq_region_attributes to your toplevel sequence regions:

cd ensembl-pipeline/scripts
perl ./set_toplevel.pl -dbhost host -dbuser user -dbname my_db -dbpass ****  -dbport port
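
One way to see whether the script took effect is to count the toplevel attributes afterwards. The sketch below only prints the SQL. It assumes the standard core schema tables seq_region_attrib and attrib_type, and the function name toplevel_count_sql is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: emit SQL that counts seq_regions flagged as toplevel
# in an Ensembl core database (assumes the standard core schema).
toplevel_count_sql() {
    cat <<'SQL'
SELECT COUNT(*) AS toplevel_regions
FROM seq_region_attrib sra
JOIN attrib_type t ON t.attrib_type_id = sra.attrib_type_id
WHERE t.code = 'toplevel';
SQL
}

# Print the query; pipe it into the mysql client to run it, e.g.
#   toplevel_count_sql | mysql -h host -P port -u user -p my_db
toplevel_count_sql
```

A count of 0 would be consistent with the "number of seq_regions 0" line in the job output; a non-zero count suggests the toplevel attributes are in place and the cause lies elsewhere.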

Cheers,
Bronwen

------------------------------

End of Dev Digest, Vol 16, Issue 25
***********************************