[ensembl-dev] BioMart Ensembl version

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Wed Sep 24 12:20:45 BST 2014


Sorry, I forgot to paste the error:

Processing Cached Registry: 
/share/apps/local/biomart-perl/conf/cachedRegistries/martURLLocation_ensembl_76_hg38.xml.cached

Attribute 'external_gene_id' not found in dataset 
default.hsapiens_gene_ensembl

Trace begun at /share/apps/local/biomart-perl/lib/BioMart/Registry.pm 
line 490
BioMart::Registry::getAttribute('BioMart::Registry=HASH(0x12746720)', 
'hsapiens_gene_ensembl', 'external_gene_id', 'default', 'default') 
called at /share/apps/local/biomart-perl/lib/BioMart/Query.pm line 1243
BioMart::Query::addAttribute('BioMart::Query=HASH(0x2af76807f600)', 
'external_gene_id') called at local.pl line 892
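
To see every attribute name that a release rejects in one run, rather than stopping at the first failure, each addAttribute call could be wrapped in eval. This is only a minimal sketch, assuming addAttribute dies on an unknown name as the trace above suggests. MockQuery and add_attributes_safely are hypothetical names, made up here so the snippet runs without a BioMart installation:

```perl
use strict;
use warnings;

# Hypothetical stand-in for BioMart::Query: it accepts only the attribute
# names it was constructed with and dies on anything else.
package MockQuery;

sub new {
    my ($class, @known) = @_;
    my %known = map { $_ => 1 } @known;
    return bless { known => \%known, added => [] }, $class;
}

sub addAttribute {
    my ($self, $name) = @_;
    die "Attribute '$name' not found\n" unless $self->{known}{$name};
    push @{ $self->{added} }, $name;
    return;
}

package main;

# Try every attribute and return the names that were rejected.
sub add_attributes_safely {
    my ($query, @names) = @_;
    my @missing;
    for my $name (@names) {
        # eval traps the exception; the trailing 1 marks success.
        eval { $query->addAttribute($name); 1 } or push @missing, $name;
    }
    return @missing;
}

my $query   = MockQuery->new(qw(ensembl_gene_id chromosome_name));
my @missing = add_attributes_safely($query,
    qw(ensembl_gene_id chromosome_name external_gene_id));
print "Missing attributes: @missing\n";
```

With a real query object in place of the mock, the returned list would show which attribute names need to be looked up again for the new release.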

I've tried both BioMart 0.6 and 0.7 without luck on Ensembl 76.
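
One way the attribute names offered by a given mart could be checked is sketched below. It assumes the martservice endpoint accepts type=attributes&dataset=<name> and answers with one tab-separated line per attribute, internal name first. The helper names attributes_url and parse_attribute_names are hypothetical, and the sample text is made up for illustration, not real output from the Ensembl mart:

```perl
use strict;
use warnings;

# Build the martservice URL that lists the attributes of one dataset.
# Assumption: the service understands type=attributes&dataset=<name>.
sub attributes_url {
    my ($base, $dataset) = @_;
    return "$base?type=attributes&dataset=$dataset";
}

# Collect the first tab-separated column (the internal attribute name)
# of every non-empty line into a hash used as a set.
sub parse_attribute_names {
    my ($text) = @_;
    my %names;
    for my $line (split /\r?\n/, $text) {
        next unless $line =~ /\S/;
        my ($name) = split /\t/, $line;
        $names{$name} = 1;
    }
    return \%names;
}

# Hypothetical sample response, NOT real output from the Ensembl mart.
my $sample = join "\n",
    "ensembl_gene_id\tEnsembl Gene ID",
    "chromosome_name\tChromosome Name",
    "";

my $available = parse_attribute_names($sample);
for my $wanted (qw(ensembl_gene_id external_gene_id)) {
    printf "%s: %s\n", $wanted,
        $available->{$wanted} ? "available" : "not found";
}
```

With network access, the text could instead come from fetching attributes_url("http://www.ensembl.org/biomart/martservice", "hsapiens_gene_ensembl"), for example with get() from LWP::Simple.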

On 24/09/14 13:13, Guillermo Marco Puche wrote:
> Dear Thomas,
>
> I'm getting an error when querying the BioMart Ensembl 76 registry,
> with both the clean and cached actions.
> Pointing the registry at the older Ensembl 75 archive works with both
> clean and cached.
>
>     my $action = 'clean';
>
>     my $initializer = BioMart::Initializer->new(
>         'registryFile' => $confFile,
>         'action'       => $action,
>     );
>     my $registry = $initializer->getRegistry;
>
>     my $query = BioMart::Query->new(
>         'registry'          => $registry,
>         'virtualSchemaName' => 'default',
>     );
>
>     $query->setDataset("hsapiens_gene_ensembl");
>     if ( length($gene_name) == 15 && substr($gene_name, 0, 4) eq "ENSG" ) {
>         $query->addFilter("ensembl_gene_id", @{$gene});
>     }
>     else {
>         $query->addFilter("hgnc_symbol", @{$gene});
>     }
>     if ( $query_type eq "gene_info" ) {
>         #$query->addFilter("biotype", ["protein_coding"]);
>         $query->addAttribute("ensembl_gene_id");
>         $query->addAttribute("ensembl_transcript_id");
>         $query->addAttribute("chromosome_name");
>         $query->addAttribute("external_gene_id");    # the call that fails
>     }
>
> Was this attribute removed in Ensembl 76? Is there any documentation
> listing the BioMart attributes and filters available in each Ensembl
> release?
>
> Thank you, and sorry for the spam; I never had these issues when
> changing Ensembl versions before.
>
> Best regards,
> Guillermo.
>
>
> On 23/09/14 15:10, Guillermo Marco Puche wrote:
>> Dear Thomas,
>>
>> OK, I had misunderstood; now I get it.
>> As you say, the BioMart website registry is still showing Ensembl 75.
>>
>> Thank you.
>>
>> Best regards,
>> Guillermo.
>>
>> On 23/09/14 14:01, Thomas Maurel wrote:
>>> Dear Guillermo,
>>>
>>> The biomart.org website seems to be very slow at the moment, and I am
>>> afraid it is still displaying our release 75 marts on hg19 (GRCh37).
>>> According to the biomart.org mart registry page,
>>> http://www.biomart.org/biomart/martservice?type=registry, port '80'
>>> is still valid.
>>> If you want to use hg38 (GRCh38), the best way is to point your mart
>>> XML config to the ensembl.org website in order to access our release
>>> 76 mart databases on hg38 (GRCh38).
>>> You can follow the previous instructions, changing only the mart
>>> registry URL to the one that serves the release 76 registry
>>> information:
>>>
>>>>> 2) A BioMart perl script
>>>>> a) You first need to edit your configuration file in 
>>>>> "biomart-perl/conf/martURLLocation.xml" and paste the content of 
>>>>> the mart registry page for the
>>> Ensembl website on release 76 (hg38): 
>>> http://www.ensembl.org/biomart/martservice?type=registry
>>>>> b) Then edit your script and make sure that "my $confFile" 
>>>>> variable is looking at the martURLLocation.xml configuration file 
>>>>> in biomart-perl/conf
>>>>> c) Finally, make sure to update the following line in your script:
>>>>>
>>>>> my $action='cached';
>>>>>
>>>>> with:
>>>>>
>>>>> my $action='clean';
>>>>>
>>>>> The first run of your script on Ensembl 75 might be a bit slow as 
>>>>> BioMart will cache some data from the BioMart website.
>>>>> Once you have run your script with the action variable set to 
>>>>> "clean", you can set the variable to "cached" again.
>>>
>>> I am afraid $action=clean is quite resource-intensive, but you will
>>> only need to run your script with this setting when you change the
>>> registry information.
>>>
>>> Hope this helps,
>>> Best regards,
>>> Thomas
>>> On 23 Sep 2014, at 12:43, Guillermo Marco Puche
>>> <guillermo.marco at sistemasgenomicos.com> wrote:
>>>
>>>> Dear Thomas,
>>>>
>>>> My BioMart perl script is using the following mart XML config file:
>>>>
>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>> <!DOCTYPE MartRegistry>
>>>> <MartRegistry>
>>>>      <MartURLLocation
>>>>          name         = "ensembl"
>>>>          displayName  = "ensembl"
>>>>          host         = "www.biomart.org"
>>>>          port         = "80"
>>>>          visible      = "1"
>>>>          default      = ""
>>>>          includeDatasets = "hsapiens_gene_ensembl"
>>>>          martUser     = ""
>>>>      />
>>>> </MartRegistry>
>>>>
>>>> However, I'm getting BioMart results from hg19 and not hg38. Is the
>>>> default port '80' still serving hg19? How can I specify that I
>>>> would like to use the hg38 BioMart service?
>>>>
>>>> On the other hand, it's always useful to know how to query older
>>>> Ensembl versions with BioMart. However, using $action=clean with the
>>>> conf file for the Ensembl 75 archive leads to massive RAM
>>>> consumption in my Perl script.
>>>>
>>>> Thanks.
>>>>
>>>> Best regards,
>>>> Guillermo.
>>>> On 19/09/14 15:48, Thomas Maurel wrote:
>>>>> Dear Guillermo,
>>>>>
>>>>> If you are using:
>>>>> 1) the biomart-perl/scripts/webExample.pl script and an xml file
>>>>> You can change the path to the biomart website in the following line:
>>>>>
>>>>>     my $path="http://www.biomart.org/biomart/martservice?";
>>>>>
>>>>> With the path to our Ensembl release 75 archive:
>>>>>
>>>>>     my $path="http://feb2014.archive.ensembl.org/biomart/martservice?";
>>>>>
>>>>>
>>>>> 2) A BioMart perl script
>>>>> a) You first need to edit your configuration file in 
>>>>> "biomart-perl/conf/martURLLocation.xml" and paste the content of 
>>>>> the mart registry page for the Ensembl release 75 archive website: 
>>>>> http://feb2014.archive.ensembl.org/biomart/martservice?type=registry
>>>>> b) Then edit your script and make sure that "my $confFile" 
>>>>> variable is looking at the martURLLocation.xml configuration file 
>>>>> in biomart-perl/conf
>>>>> c) Finally, make sure to update the following line in your script:
>>>>>
>>>>> my $action='cached';
>>>>>
>>>>> with:
>>>>>
>>>>> my $action='clean';
>>>>>
>>>>> The first run of your script on Ensembl 75 might be a bit slow as 
>>>>> BioMart will cache some data from the BioMart website.
>>>>> Once you have run your script with the action variable set to 
>>>>> "clean", you can set the variable to "cached" again.
>>>>>
>>>>> Hope this helps,
>>>>> Best regards,
>>>>> Thomas
>>>>> On 19 Sep 2014, at 13:28, Guillermo Marco Puche
>>>>> <guillermo.marco at sistemasgenomicos.com> wrote:
>>>>>
>>>>>> Dear developers,
>>>>>>
>>>>>> I would like to know how to make my BioMart Perl code query an
>>>>>> older Ensembl version (e.g. 75) rather than the latest one (which
>>>>>> I believe is used by default).
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> Best regards,
>>>>>> Guillermo.
>>>>>> _______________________________________________
>>>>>> Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>>>>>> Posting guidelines and subscribe/unsubscribe info: 
>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>
>>>>> --
>>>>> Thomas Maurel
>>>>> Bioinformatician - Ensembl Production Team
>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>> European Molecular Biology Laboratory
>>>>> Wellcome Trust Genome Campus
>>>>> Hinxton
>>>>> Cambridge CB10 1SD
>>>>> United Kingdom