[ensembl-dev] BioMart Ensembl version
Guillermo Marco Puche
guillermo.marco at sistemasgenomicos.com
Wed Sep 24 13:26:43 BST 2014
Dear Thomas,
Thank you very much for your quick reply. This was driving me mad!
Is there a changelog available for this kind of modification? I would
like to read it before future releases so that I don't have to bother
you whenever a method stops working.
Thank you again.
Best regards,
Guillermo.
On 24/09/14 13:40, Thomas Maurel wrote:
> Dear Guillermo,
>
> I am afraid we renamed the internal name of the "external_gene_id"
> attribute to "external_gene_name" in release 76. Renaming the
> attribute in your script should resolve your problem.
> We rarely rename attribute internal names, but I am afraid we don't
> declare these changes at the moment since they don't impact the
> BioMart visual interface. If an important filter/attribute is removed
> from the interface or renamed on the interface, then we always
> declare it.
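
A minimal sketch of how a script could cope with the rename described above when it has to run against both release 75 and release 76. The helper name gene_name_attribute and the variable $ensembl_release are hypothetical and not part of the BioMart API; the only fact taken from this thread is that "external_gene_id" became "external_gene_name" in release 76.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the BioMart API): return the internal
# attribute name that carries the gene name for a given Ensembl release.
# Per the message above, "external_gene_id" was renamed to
# "external_gene_name" in release 76.
sub gene_name_attribute {
    my ($release) = @_;
    return $release >= 76 ? 'external_gene_name' : 'external_gene_id';
}

# With an existing BioMart::Query object ($query), usage would look like:
#   $query->addAttribute( gene_name_attribute($ensembl_release) );
print gene_name_attribute(75), "\n";    # prints external_gene_id
print gene_name_attribute(76), "\n";    # prints external_gene_name
```

Keeping the release-dependent name in one place means a future rename only needs a change in this helper rather than in every query.
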
>
> Apologies for any inconvenience caused,
> Regards,
> Thomas
> On 24 Sep 2014, at 12:13, Guillermo Marco Puche
> <guillermo.marco at sistemasgenomicos.com> wrote:
>
>> Dear Thomas,
>>
>> I'm getting an error when querying BioMart with the Ensembl 76
>> registry, with both the 'clean' and 'cached' actions.
>> Using the older mart registry from the Ensembl 75 archive works with
>> both actions.
>>
>> my $action='clean';
>>
>> my $initializer = BioMart::Initializer->new('registryFile'=>$confFile,
>>                                             'action'=>$action);
>> my $registry = $initializer->getRegistry;
>>
>> my $query = BioMart::Query->new('registry'=>$registry,
>>                                 'virtualSchemaName'=>'default');
>>
>> $query->setDataset("hsapiens_gene_ensembl");
>> if ( length($gene_name)==15 && substr($gene_name,0,4) eq "ENSG" ) {
>>     $query->addFilter("ensembl_gene_id", @{$gene});
>> }
>> else {
>>     $query->addFilter("hgnc_symbol", @{$gene});
>> }
>> if ($query_type eq "gene_info"){
>>     #$query->addFilter("biotype", ["protein_coding"]);
>>     $query->addAttribute("ensembl_gene_id");
>>     $query->addAttribute("ensembl_transcript_id");
>>     $query->addAttribute("chromosome_name");
>>     $query->addAttribute("external_gene_id");   # <-- the call in question
>> }
>>
>> Was this attribute removed in Ensembl 76? Is there any documentation
>> listing the BioMart filters and attributes available in each Ensembl
>> release?
>>
>> Thank you, and sorry for the spam; I never had these issues when
>> changing Ensembl versions previously.
>>
>> Best regards,
>> Guillermo.
>>
>>
>> On 23/09/14 15:10, Guillermo Marco Puche wrote:
>>> Dear Thomas,
>>>
>>> OK, I had misunderstood; now I see.
>>> As you say, the biomart.org registry is still showing Ensembl 75.
>>>
>>> Thank you.
>>>
>>> Best regards,
>>> Guillermo.
>>>
>>> On 23/09/14 14:01, Thomas Maurel wrote:
>>>> Dear Guillermo,
>>>>
>>>> The biomart.org website seems to be very slow at the moment, and I
>>>> am afraid it is still displaying our release 75 marts on hg19
>>>> (GRCh37). According to the biomart.org mart registry page,
>>>> http://www.biomart.org/biomart/martservice?type=registry, port '80'
>>>> is still valid.
>>>> If you want to use hg38 (GRCh38), the best way would be to point
>>>> your mart XML config to the ensembl.org website in order to access
>>>> our release 76 mart databases on hg38 (GRCh38).
>>>> You can follow the previous instructions, changing only the mart
>>>> registry URL so that you get the release 76 registry information:
>>>>
>>>>>> 2) A BioMart perl script
>>>>>> a) You first need to edit your configuration file in
>>>>>> "biomart-perl/conf/martURLLocation.xml" and paste the content of
>>>>>> the mart registry page for the
>>>> Ensembl website on release 76 (hg38):
>>>> http://www.ensembl.org/biomart/martservice?type=registry
>>>>>> b) Then edit your script and make sure that "my $confFile"
>>>>>> variable is looking at the martURLLocation.xml configuration file
>>>>>> in biomart-perl/conf
>>>>>> c) Finally, make sure to update the following line in your script:
>>>>>>
>>>>>> my $action='cached';
>>>>>>
>>>>>> with:
>>>>>>
>>>>>> my $action='clean';
>>>>>>
>>>>>> The first run of your script on Ensembl 75 might be a bit slow as
>>>>>> BioMart will cache some data from the BioMart website.
>>>>>> Once you have run your script with the action variable set to
>>>>>> "clean", you can set the variable to "cached" again.
>>>>
>>>> I am afraid $action=clean is quite resource-intensive, but you
>>>> only need to run your script with this setting when you change the
>>>> registry information.
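
The rule above (use 'clean' only after the registry information changes, 'cached' otherwise) could be automated along these lines. This is only a sketch under stated assumptions: choose_action is a hypothetical helper, not part of BioMart, and the "<registry file>.md5" sidecar file used to remember the last-seen registry is an invented convention. Digest::MD5 ships with core Perl.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper (not part of BioMart): return 'clean' when the
# registry file differs from the copy seen on the previous run, and
# 'cached' otherwise. The digest of the last-seen registry is stored in
# a sidecar file named "<registry file>.md5" (an assumed convention).
sub choose_action {
    my ($conf_file) = @_;
    my $stamp_file = "$conf_file.md5";

    # Digest of the registry file as it is now.
    open my $in, '<', $conf_file or die "Cannot read $conf_file: $!";
    binmode $in;
    my $current = Digest::MD5->new->addfile($in)->hexdigest;
    close $in;

    # Digest recorded on the previous run, if there was one.
    my $previous = '';
    if ( open my $old, '<', $stamp_file ) {
        $previous = <$old> // '';
        chomp $previous;
        close $old;
    }
    return 'cached' if $current eq $previous;

    # Registry is new or changed: record its digest and ask for a rebuild.
    open my $out, '>', $stamp_file or die "Cannot write $stamp_file: $!";
    print {$out} "$current\n";
    close $out;
    return 'clean';
}
```

A script would then call my $action = choose_action($confFile); before creating the BioMart::Initializer. Note the simplification: the digest is recorded before the 'clean' run has actually succeeded, so if that run fails, the sidecar file has to be deleted by hand to force another rebuild.
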
>>>>
>>>> Hope this helps,
>>>> Best regards,
>>>> Thomas
>>>> On 23 Sep 2014, at 12:43, Guillermo Marco Puche
>>>> <guillermo.marco at sistemasgenomicos.com> wrote:
>>>>
>>>>> Dear Thomas,
>>>>>
>>>>> My BioMart perl script is using the following mart XML config file:
>>>>>
>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>> <!DOCTYPE MartRegistry>
>>>>> <MartRegistry>
>>>>> <MartURLLocation
>>>>> name = "ensembl"
>>>>> displayName = "ensembl"
>>>>> host = "www.biomart.org"
>>>>> port = "80"
>>>>> visible = "1"
>>>>> default = ""
>>>>> includeDatasets = "hsapiens_gene_ensembl"
>>>>> martUser = ""
>>>>> />
>>>>> </MartRegistry>
>>>>>
>>>>> However, my BioMart queries return hg19 data, not hg38. Is the
>>>>> default port '80' still serving hg19? How can I specify that I
>>>>> would like to use the hg38 BioMart service?
>>>>>
>>>>> On the other hand, it's always useful to know how to query older
>>>>> Ensembl versions with BioMart. However, using $action=clean with
>>>>> the conf file for the Ensembl 75 archive leads to massive RAM
>>>>> consumption in my Perl script.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Best regards,
>>>>> Guillermo.
>>>>> On 19/09/14 15:48, Thomas Maurel wrote:
>>>>>> Dear Guillermo,
>>>>>>
>>>>>> If you are using:
>>>>>> 1) the biomart-perl/scripts/webExample.pl script and an xml file
>>>>>> You can change the path to the biomart website in the following line:
>>>>>>
>>>>>> my $path="http://www.biomart.org/biomart/martservice?";
>>>>>>
>>>>>> With the path to our Ensembl release 75 archive:
>>>>>>
>>>>>> my $path="http://feb2014.archive.ensembl.org/biomart/martservice?";
>>>>>>
>>>>>>
>>>>>> 2) A BioMart perl script
>>>>>> a) You first need to edit your configuration file in
>>>>>> "biomart-perl/conf/martURLLocation.xml" and paste the content of
>>>>>> the mart registry page for the Ensembl release 75 archive
>>>>>> website:
>>>>>> http://feb2014.archive.ensembl.org/biomart/martservice?type=registry
>>>>>> b) Then edit your script and make sure that "my $confFile"
>>>>>> variable is looking at the martURLLocation.xml configuration file
>>>>>> in biomart-perl/conf
>>>>>> c) Finally, make sure to update the following line in your script:
>>>>>>
>>>>>> my $action='cached';
>>>>>>
>>>>>> with:
>>>>>>
>>>>>> my $action='clean';
>>>>>>
>>>>>> The first run of your script on Ensembl 75 might be a bit slow as
>>>>>> BioMart will cache some data from the BioMart website.
>>>>>> Once you have run your script with the action variable set to
>>>>>> "clean", you can set the variable to "cached" again.
>>>>>>
>>>>>> Hope this helps,
>>>>>> Best regards,
>>>>>> Thomas
>>>>>> On 19 Sep 2014, at 13:28, Guillermo Marco Puche
>>>>>> <guillermo.marco at sistemasgenomicos.com> wrote:
>>>>>>
>>>>>>> Dear developers,
>>>>>>>
>>>>>>> I would like to know how to make BioMart Perl code query an
>>>>>>> older Ensembl version (e.g. 75) rather than the latest one
>>>>>>> (which I believe is used by default).
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Guillermo.
>>>>>>> _______________________________________________
>>>>>>> Dev mailing list Dev at ensembl.org <mailto:Dev at ensembl.org>
>>>>>>> Posting guidelines and subscribe/unsubscribe info:
>>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>>
>>>>>> --
>>>>>> Thomas Maurel
>>>>>> Bioinformatician - Ensembl Production Team
>>>>>> European Bioinformatics Institute (EMBL-EBI)
>>>>>> European Molecular Biology Laboratory
>>>>>> Wellcome Trust Genome Campus
>>>>>> Hinxton
>>>>>> Cambridge CB10 1SD
>>>>>> United Kingdom