[ensembl-dev] BioMart Ensembl version

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Wed Sep 24 13:26:43 BST 2014


Dear Thomas,

Thank you very much for your fast answer. This was driving me mad!
Is there a changelog available for this kind of modification? I would 
like to read it for future releases so I don't have to disturb you if 
a method stops working.
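
In the meantime, here is a minimal sketch of one way a script could cope with the rename described in the reply quoted below, by choosing the attribute name from the Ensembl release it targets. The helper gene_name_attribute and the variable $ensembl_release are hypothetical names of my own, not part of the BioMart API; the release 76 cutoff is the only fact taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the BioMart API): returns the internal
# attribute name for the external gene name in a given Ensembl release.
# Per the reply quoted below, "external_gene_id" was renamed to
# "external_gene_name" in release 76.
sub gene_name_attribute {
    my ($release) = @_;
    return $release >= 76 ? 'external_gene_name' : 'external_gene_id';
}

# Assumed variable: the release the registry file points at.
my $ensembl_release = 76;
my $attribute = gene_name_attribute($ensembl_release);
print "$attribute\n";

# In the quoted script, the hard-coded call would then become:
#   $query->addAttribute($attribute);
```

This only covers the one rename mentioned here; any other renamed attribute would need its own entry.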

Thank you again.

Best regards,
Guillermo.

On 24/09/14 13:40, Thomas Maurel wrote:
> Dear Guillermo,
>
> I am afraid we have renamed the internal name of the 
> "external_gene_id" attribute to "external_gene_name" in release 76. 
> Renaming the attribute in your script should resolve your problem.
> We rarely rename attribute internal names, but I am afraid we don't 
> declare these changes at the moment since they don't affect the 
> BioMart visual interface. If an important filter/attribute is removed 
> from the interface or renamed on the interface, then we always 
> declare it.
>
> Apologies for any inconvenience caused,
> Regards,
> Thomas
> On 24 Sep 2014, at 12:13, Guillermo Marco Puche 
> <guillermo.marco at sistemasgenomicos.com> wrote:
>
>> Dear Thomas,
>>
>> I'm getting an error when querying the BioMart Ensembl 76 registry 
>> with both the clean and cached actions.
>> Using the older Ensembl 75 archive registry works with both clean 
>> and cached.
>>
>>     my $action = 'clean';
>>
>>     my $initializer = BioMart::Initializer->new(
>>         'registryFile' => $confFile,
>>         'action'       => $action,
>>     );
>>     my $registry = $initializer->getRegistry;
>>
>>     my $query = BioMart::Query->new(
>>         'registry'          => $registry,
>>         'virtualSchemaName' => 'default',
>>     );
>>
>>     $query->setDataset("hsapiens_gene_ensembl");
>>     if ( length($gene_name) == 15 && substr($gene_name, 0, 4) eq "ENSG" ) {
>>         $query->addFilter("ensembl_gene_id", @{$gene});
>>     }
>>     else {
>>         $query->addFilter("hgnc_symbol", @{$gene});
>>     }
>>     if ( $query_type eq "gene_info" ) {
>>         #$query->addFilter("biotype", ["protein_coding"]);
>>         $query->addAttribute("ensembl_gene_id");
>>         $query->addAttribute("ensembl_transcript_id");
>>         $query->addAttribute("chromosome_name");
>>         $query->addAttribute("external_gene_id");    # the attribute in question
>>     }
>>
>> Was this removed in Ensembl 76? Is there any documentation of the 
>> queries available in BioMart for each Ensembl release?
>>
>> Thank you, and sorry for the spam; I never had these issues when 
>> changing Ensembl versions previously.
>>
>> Best regards,
>> Guillermo.
>>
>>
>> On 23/09/14 15:10, Guillermo Marco Puche wrote:
>>> Dear Thomas,
>>>
>>> OK, I didn't understand correctly before; now I do.
>>> As you say, the biomart.org website registry is still showing Ensembl 75.
>>>
>>> Thank you.
>>>
>>> Best regards,
>>> Guillermo.
>>>
>>> On 23/09/14 14:01, Thomas Maurel wrote:
>>>> Dear Guillermo,
>>>>
>>>> The biomart.org website seems to be very slow at the moment, and I 
>>>> am afraid it is still displaying our release 75 marts on hg19 
>>>> (GRCh37). According to the biomart.org mart registry page, 
>>>> http://www.biomart.org/biomart/martservice?type=registry, port '80' 
>>>> is still valid.
>>>> If you want to use hg38 (GRCh38), the best way would be to point 
>>>> your mart XML config to the ensembl.org website in order to access 
>>>> our release 76 mart databases on hg38 (GRCh38).
>>>> You can follow the previous instructions but just change the mart 
>>>> registry URL page to get the mart release 76 registry information:
>>>>
>>>>>> 2) A BioMart perl script
>>>>>> a) You first need to edit your configuration file in 
>>>>>> "biomart-perl/conf/martURLLocation.xml" and paste the content of 
>>>>>> the mart registry page for the
>>>> Ensembl website on release 76 (hg38): 
>>>> http://www.ensembl.org/biomart/martservice?type=registry
>>>>>> b) Then edit your script and make sure that "my $confFile" 
>>>>>> variable is looking at the martURLLocation.xml configuration file 
>>>>>> in biomart-perl/conf
>>>>>> c) Finally, make sure to update the following line in your script:
>>>>>>
>>>>>> my $action='cached';
>>>>>>
>>>>>> with:
>>>>>>
>>>>>> my $action='clean';
>>>>>>
>>>>>> The first run of your script on Ensembl 75 might be a bit slow as 
>>>>>> BioMart will cache some data from the BioMart website.
>>>>>> Once you have run your script with the action variable set to 
>>>>>> "clean", you can set the variable to "cached" again.
>>>>
>>>> I am afraid $action=clean is quite resource consuming but you will 
>>>> only need to run your script with this setting when you change the 
>>>> registry information.
>>>>
>>>> Hope this helps,
>>>> Best regards,
>>>> Thomas
>>>> On 23 Sep 2014, at 12:43, Guillermo Marco Puche 
>>>> <guillermo.marco at sistemasgenomicos.com> wrote:
>>>>
>>>>> Dear Thomas,
>>>>>
>>>>> My BioMart perl script is using the following mart XML config file:
>>>>>
>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>> <!DOCTYPE MartRegistry>
>>>>> <MartRegistry>
>>>>>      <MartURLLocation
>>>>>          name         = "ensembl"
>>>>>          displayName  = "ensembl"
>>>>>          host         = "www.biomart.org"
>>>>>          port         = "80"
>>>>>          visible      = "1"
>>>>>          default      = ""
>>>>>          includeDatasets = "hsapiens_gene_ensembl"
>>>>>          martUser     = ""
>>>>>      />
>>>>> </MartRegistry>
>>>>>
>>>>> However, I'm getting BioMart results from hg19 and not hg38. Is the 
>>>>> default port '80' still serving hg19? How can I specify that I 
>>>>> would like to use the hg38 BioMart service?
>>>>>
>>>>> On the other hand, it's always useful to know how to query older 
>>>>> Ensembl versions with BioMart. However, using $action=clean with 
>>>>> the conf file of the Ensembl 75 archive leads to massive RAM 
>>>>> consumption in my Perl script.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Best regards,
>>>>> Guillermo.
>>>>> On 19/09/14 15:48, Thomas Maurel wrote:
>>>>>> Dear Guillermo,
>>>>>>
>>>>>> If you are using:
>>>>>> 1) the biomart-perl/scripts/webExample.pl script and an xml file
>>>>>> You can change the path to the biomart website in the following line:
>>>>>>
>>>>>>     my $path="http://www.biomart.org/biomart/martservice?";
>>>>>>
>>>>>> With the path to our Ensembl release 75 archive:
>>>>>>
>>>>>>     my $path="http://feb2014.archive.ensembl.org/biomart/martservice?";
>>>>>>
>>>>>>
>>>>>> 2) A BioMart perl script
>>>>>> a) You first need to edit your configuration file in 
>>>>>> "biomart-perl/conf/martURLLocation.xml" and paste the content of 
>>>>>> the mart registry page for the Ensembl release 75 archive 
>>>>>> website: 
>>>>>> http://feb2014.archive.ensembl.org/biomart/martservice?type=registry
>>>>>> b) Then edit your script and make sure that "my $confFile" 
>>>>>> variable is looking at the martURLLocation.xml configuration file 
>>>>>> in biomart-perl/conf
>>>>>> c) Finally, make sure to update the following line in your script:
>>>>>>
>>>>>> my $action='cached';
>>>>>>
>>>>>> with:
>>>>>>
>>>>>> my $action='clean';
>>>>>>
>>>>>> The first run of your script on Ensembl 75 might be a bit slow as 
>>>>>> BioMart will cache some data from the BioMart website.
>>>>>> Once you have run your script with the action variable set to 
>>>>>> "clean", you can set the variable to "cached" again.
>>>>>>
>>>>>> Hope this helps,
>>>>>> Best regards,
>>>>>> Thomas
>>>>>> On 19 Sep 2014, at 13:28, Guillermo Marco Puche 
>>>>>> <guillermo.marco at sistemasgenomicos.com> wrote:
>>>>>>
>>>>>>> Dear developers,
>>>>>>>
>>>>>>> I would like to know how to make BioMart Perl code query an 
>>>>>>> older Ensembl version (e.g. 75) rather than the latest one (which 
>>>>>>> I believe is used by default).
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Guillermo.
>>>>>>> _______________________________________________
>>>>>>> Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>>>>>>> Posting guidelines and subscribe/unsubscribe info: 
>>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>>
>>>>>> --
>>>>>> Thomas Maurel
>>>>>> Bioinformatician - Ensembl Production Team
>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>> European Molecular Biology Laboratory
>>>>>> Wellcome Trust Genome Campus
>>>>>> Hinxton
>>>>>> Cambridge CB10 1SD
>>>>>> United Kingdom
>>>>>>
>>>>
>>>
>