[ensembl-dev] BioMart Ensembl version

Thomas Maurel maurel at ebi.ac.uk
Wed Sep 24 14:43:45 BST 2014


Dear Guillermo,

No worries. I am afraid we don't declare mart internal name modifications at the moment, but we will start doing so from release 77 onward. These modifications will be listed on the following page: http://www.ensembl.org/info/website/news.html#cat-other
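
For a script that has to run against both the release 75 archive and release 76, one option is to pick the attribute name from the release number. The following is only a minimal sketch: gene_name_attribute is a hypothetical helper, not part of the BioMart Perl API, and it assumes the only relevant change is the release 76 rename of "external_gene_id" to "external_gene_name" discussed in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the BioMart Perl API: returns the
# internal name of the gene-name attribute for a given Ensembl release.
# Assumes the only relevant change is the release 76 rename of
# "external_gene_id" to "external_gene_name" mentioned in this thread.
sub gene_name_attribute {
    my ($release) = @_;
    return $release >= 76 ? 'external_gene_name' : 'external_gene_id';
}

# Print the attribute name chosen for releases 75 and 76.
for my $release (75, 76) {
    printf "release %d: %s\n", $release, gene_name_attribute($release);
}
```

In a script like the one quoted in this thread, this could be used as $query->addAttribute(gene_name_attribute($release)), where $release is a variable you set yourself to match the registry in your configuration file.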

Hope this helps,
Regards,
Thomas
On 24 Sep 2014, at 13:26, Guillermo Marco Puche <guillermo.marco at sistemasgenomicos.com> wrote:

> Dear Thomas,
> 
> Thank you very much for your fast answer. This was driving me mad!
> Is there a changelog available for this kind of modification? I would like to read it for future releases so I don't have to disturb you whenever a method stops working.
> 
> Thank you again.
> 
> Best regards,
> Guillermo.
> 
> On 24/09/14 13:40, Thomas Maurel wrote:
>> Dear Guillermo,
>> 
>> I am afraid we renamed the internal name of the "external_gene_id" attribute to "external_gene_name" in release 76. Renaming the attribute in your script should resolve your problem.
>> We rarely rename attribute internal names, but I am afraid we don't declare these changes at the moment since they don't impact the BioMart visual interface. If an important filter/attribute is removed from the interface or renamed on the interface, then we always declare it.
>> 
>> Apologies for any inconvenience caused,
>> Regards,
>> Thomas
>> On 24 Sep 2014, at 12:13, Guillermo Marco Puche <guillermo.marco at sistemasgenomicos.com> wrote:
>> 
>>> Dear Thomas,
>>> 
>>> I'm experiencing an error when querying the BioMart Ensembl 76 registry, with both the clean and cached actions.
>>> Using the older mart registry from the Ensembl 75 archive works with both clean and cached.
>>> 
>>>     my $action='clean';
>>>     my $initializer = BioMart::Initializer->new('registryFile'=>$confFile, 'action'=>$action);
>>>     my $registry = $initializer->getRegistry;
>>> 
>>>     my $query = BioMart::Query->new('registry'=>$registry,'virtualSchemaName'=>'default');
>>> 
>>>     $query->setDataset("hsapiens_gene_ensembl");
>>>     if ( length($gene_name)==15 && substr($gene_name,0,4) eq "ENSG" ) {
>>>         $query->addFilter("ensembl_gene_id", @{$gene});
>>>     }
>>>     else {
>>>         $query->addFilter("hgnc_symbol", @{$gene});
>>>     }
>>>     if ($query_type eq "gene_info"){
>>>         #$query->addFilter("biotype", ["protein_coding"]);
>>>         $query->addAttribute("ensembl_gene_id");
>>>         $query->addAttribute("ensembl_transcript_id");
>>>         $query->addAttribute("chromosome_name");
>>>         $query->addAttribute("external_gene_id");
>>>     }
>>> 
>>> Was this removed in Ensembl 76? Is there any documentation of the BioMart queries available in each Ensembl release?
>>> 
>>> Thank you, and sorry for the spam; I never had these issues when changing Ensembl versions previously.
>>> 
>>> Best regards,
>>> Guillermo.
>>> 
>>> 
>>> On 23/09/14 15:10, Guillermo Marco Puche wrote:
>>>> Dear Thomas,
>>>> 
>>>> OK, I had misunderstood, but now I see.
>>>> As you say, the biomart.org website registry is still showing Ensembl 75.
>>>> 
>>>> Thank you.
>>>> 
>>>> Best regards,
>>>> Guillermo.
>>>> 
>>>> On 23/09/14 14:01, Thomas Maurel wrote:
>>>>> Dear Guillermo,
>>>>> 
>>>>> The biomart.org website seems to be very slow at the moment and I am afraid the website is still displaying our release 75 marts on hg19 (GRCh37). According to the biomart.org mart registry page: http://www.biomart.org/biomart/martservice?type=registry, port '80' is still valid.
>>>>> If you want to use hg38 (GRCh38), the best way would be to point your mart XML config to the ensembl.org website in order to access our release 76 mart databases on hg38 (GRCh38).
>>>>> You can follow the previous instructions, changing only the mart registry URL so that you get the release 76 mart registry information:
>>>>> 
>>>>>>> 2) A BioMart perl script
>>>>>>> a) You first need to edit your configuration file in "biomart-perl/conf/martURLLocation.xml" and paste the content of the mart registry page for the Ensembl website on release 76 (hg38): http://www.ensembl.org/biomart/martservice?type=registry
>>>>>>> b) Then edit your script and make sure that "my $confFile" variable is looking at the martURLLocation.xml configuration file in biomart-perl/conf
>>>>>>> c) Finally, make sure to update the following line in your script:
>>>>>>> 
>>>>>>> my $action='cached';
>>>>>>> 
>>>>>>> with:
>>>>>>> 
>>>>>>> my $action='clean';
>>>>>>> 
>>>>>>> The first run of your script on Ensembl 75 might be a bit slow as BioMart will cache some data from the BioMart website.
>>>>>>> Once you have run your script with the action variable set to "clean", you can set the variable to "cached" again.
>>>>> 
>>>>> 
>>>>> I am afraid $action=clean is quite resource-consuming, but you will only need to run your script with this setting when you change the registry information.
>>>>> 
>>>>> Hope this helps,
>>>>> Best regards,
>>>>> Thomas
>>>>> On 23 Sep 2014, at 12:43, Guillermo Marco Puche <guillermo.marco at sistemasgenomicos.com> wrote:
>>>>> 
>>>>>> Dear Thomas,
>>>>>> 
>>>>>> My BioMart perl script is using the following mart XML config file:
>>>>>> 
>>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>>> <!DOCTYPE MartRegistry>
>>>>>> <MartRegistry>
>>>>>>     <MartURLLocation
>>>>>>         name         = "ensembl"
>>>>>>         displayName  = "ensembl"
>>>>>>         host         = "www.biomart.org"
>>>>>>         port         = "80"
>>>>>>         visible      = "1"
>>>>>>         default      = ""
>>>>>>         includeDatasets = "hsapiens_gene_ensembl"
>>>>>>         martUser     = ""
>>>>>>     />
>>>>>> </MartRegistry>
>>>>>> 
>>>>>> However, I'm getting BioMart query results from hg19 and not hg38. Is the default port '80' still serving hg19? How can I specify that I would like to use the hg38 BioMart service?
>>>>>> 
>>>>>> On the other hand, it's always useful to know how to query older Ensembl versions with BioMart. However, using $action=clean with the conf file for the Ensembl 75 archive leads to massive RAM consumption in my Perl script.
>>>>>> 
>>>>>> Thanks.
>>>>>> 
>>>>>> Best regards,
>>>>>> Guillermo.
>>>>>> On 19/09/14 15:48, Thomas Maurel wrote:
>>>>>>> Dear Guillermo,
>>>>>>> 
>>>>>>> If you are using:
>>>>>>> 1) the biomart-perl/scripts/webExample.pl script and an xml file
>>>>>>> You can change the path to the biomart website in the following line:
>>>>>>> my $path="http://www.biomart.org/biomart/martservice?";
>>>>>>> With the path to our Ensembl release 75 archive:
>>>>>>> my $path="http://feb2014.archive.ensembl.org/biomart/martservice?";
>>>>>>> 
>>>>>>> 2) A BioMart perl script
>>>>>>> a) You first need to edit your configuration file in "biomart-perl/conf/martURLLocation.xml" and paste the content of the mart registry page for the Ensembl release 75 archive website: http://feb2014.archive.ensembl.org/biomart/martservice?type=registry
>>>>>>> b) Then edit your script and make sure that "my $confFile" variable is looking at the martURLLocation.xml configuration file in biomart-perl/conf
>>>>>>> c) Finally, make sure to update the following line in your script:
>>>>>>> 
>>>>>>> my $action='cached';
>>>>>>> 
>>>>>>> with:
>>>>>>> 
>>>>>>> my $action='clean';
>>>>>>> 
>>>>>>> The first run of your script on Ensembl 75 might be a bit slow as BioMart will cache some data from the BioMart website.
>>>>>>> Once you have run your script with the action variable set to "clean", you can set the variable to "cached" again.
>>>>>>> 
>>>>>>> Hope this helps,
>>>>>>> Best regards,
>>>>>>> Thomas
>>>>>>> On 19 Sep 2014, at 13:28, Guillermo Marco Puche <guillermo.marco at sistemasgenomicos.com> wrote:
>>>>>>> 
>>>>>>>> Dear developers,
>>>>>>>> 
>>>>>>>> I would like to know how to make BioMart Perl code query an older Ensembl version (i.e. 75) rather than the latest one (which I believe is used by default).
>>>>>>>> 
>>>>>>>> Thank you.
>>>>>>>> 
>>>>>>>> Best regards,
>>>>>>>> Guillermo.
>>>>>>>> _______________________________________________
>>>>>>>> Dev mailing list    Dev at ensembl.org
>>>>>>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>>> 
>>>>>>> --
>>>>>>> Thomas Maurel
>>>>>>> Bioinformatician - Ensembl Production Team
>>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>>> European Molecular Biology Laboratory
>>>>>>> Wellcome Trust Genome Campus
>>>>>>> Hinxton
>>>>>>> Cambridge CB10 1SD
>>>>>>> United Kingdom

--
Thomas Maurel
Bioinformatician - Ensembl Production Team
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
United Kingdom
