[ensembl-dev] downloading old Ensembl version using Perl API

Thomas Juettemann juettemann at gmail.com
Mon Feb 21 16:03:03 GMT 2011


Thanks again Ian, very helpful pointers.
I'm downloading all genes, transcripts and exons for each chromosome.
Against release 54 it seems to run quite fast. I did not time it,
but I think fetching the same information for the current release might
come close to the server's time limit, so I'll add a reconnect halfway
through.
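
For reference, a minimal sketch of the one-chromosome-per-run approach
Ian suggests, built only from the calls already used in this thread plus
the standard gene/transcript/exon accessors. Taking the chromosome name
from the command line and the tab-separated output columns are
assumptions for illustration, and it assumes PERL5LIB points at the
release 54 API checkout. Treat it as a starting point rather than a
finished script.

```perl
#!/usr/bin/env perl
# Sketch: dump genes, transcripts and exons for ONE chromosome per run,
# so that no single connection comes near the server's time limit.
use strict;
use warnings;

use Bio::EnsEMBL::Registry;
use Bio::EnsEMBL::ApiVersion;    # exports software_version()

# Assumption: the chromosome name is passed as the first argument.
my $chr = shift @ARGV or die "Usage: $0 <chromosome>\n";

# The API found via PERL5LIB should match the database release.
my $release = 54;
warn sprintf( "API version %s, database release %s\n",
    software_version(), $release );

my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host       => 'ensembldb.ensembl.org',
    -user       => 'anonymous',
    -port       => 5306,
    -db_version => $release,
    -verbose    => 1,    # prints the databases that get loaded
);

my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );
my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $chr )
    or die "No slice found for chromosome $chr\n";

# One tab-separated line per exon: gene, transcript, exon, start, end.
foreach my $gene ( @{ $slice->get_all_Genes() } ) {
    foreach my $transcript ( @{ $gene->get_all_Transcripts() } ) {
        foreach my $exon ( @{ $transcript->get_all_Exons() } ) {
            print join( "\t",
                $gene->stable_id, $transcript->stable_id,
                $exon->stable_id, $exon->start, $exon->end ), "\n";
        }
    }
}
```

Saved under a made-up name such as dump_chr.pl, it would be run once per
chromosome, one run after the other and never in parallel, for example
"perl dump_chr.pl 1 > chr1.tsv". With -verbose set, the databases listed
at start-up should carry the release number 54 in their names.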

Thanks again,
Thomas

On Mon, Feb 21, 2011 at 16:37, ian Longden <ianl at ebi.ac.uk> wrote:
> On Mon, Feb 21, 2011 at 3:07 PM, Thomas Juettemann <juettemann at gmail.com> wrote:
>> Thank you Ian.
>> To clarify, were you suggesting that I should install a local version
>> of the 54 MySQL DB or only download the 54 Perl API?
>
> Yep just the API should be okay.
>
>> I did the latter, pointed it to db_version 54, and it seems to work. The
>> query is quite complex and will take several hours to complete, so it
>> would be great to know early if I have a bug and am querying the wrong DB.
>
> How long is long? The Ensembl MySQL server will kick you off after a
> certain amount of time, I think around 8 hours, so if it takes longer
> than that you may run into trouble.
>
> If you add "-verbose => 1," to the load_registry_from_db call you will
> know straight away what databases you are connecting to.
>
> Depending on what you are doing, you may want to split your job into
> smaller ones to make sure you do not time out (i.e. one job per
> chromosome, running the script multiple times serially).
>
> It is best not to split the job up and run multiple scripts at the
> same time, as this might cause problems for the server, which many
> people use, and might be seen as unfair usage.
>
> -Ian.
>
>>
>> Cheers,
>> Thomas
>>
>>
>> On Mon, Feb 21, 2011 at 11:45, ian Longden <ianl at ebi.ac.uk> wrote:
>>> You should install the 54 version of the database if you want to use
>>> the Perl API on release 54 databases. You can have multiple APIs
>>> installed; you will just need to set the PERL5LIB path for whichever
>>> one you want to use.
>>>
>>> Sometimes you can use different versions of the API and databases but
>>> if there are any schema changes between these versions then the API
>>> may fail, which sounds like the case you have here.
>>>
>>>
>>> -Ian Longden.
>>>
>>> On Mon, Feb 21, 2011 at 8:23 AM, Thomas Juettemann <juettemann at gmail.com> wrote:
>>>> Dear all,
>>>>
>>>> another newbie question.
>>>>
>>>> I'd like to use the Perl API to fetch data from release 54 (NCBI36/hg18).
>>>> I tried to connect with db_version 54 (I read the disclaimer in the API docs):
>>>>
>>>> <code>
>>>>  use Bio::EnsEMBL::Registry;
>>>>
>>>>  my $registry = 'Bio::EnsEMBL::Registry';
>>>>  $registry->load_registry_from_db(
>>>>      -host       => 'ensembldb.ensembl.org',
>>>>      -user       => 'anonymous',
>>>>      -port       => 5306,
>>>>      -db_version => 54,    # release 54 (NCBI36/hg18)
>>>>      );
>>>>  return($registry);
>>>> </code>
>>>>
>>>> but fetching a slice fails (2nd line):
>>>>
>>>> <code>
>>>>  my $slice_adaptor  = $registry->get_adaptor( 'Human', 'Core', 'Slice' );
>>>>  my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $chr);
>>>> </code>
>>>>
>>>> Any help/pointers much appreciated.
>>>>
>>>> Best wishes,
>>>> Thomas
>>>>
>>>> _______________________________________________
>>>> Dev mailing list
>>>> Dev at ensembl.org
>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>
>>>
>>
>



