[ensembl-dev] versioning of ensembl & biomart

vincent ranwez vincent.ranwez at univ-montp2.fr
Wed Dec 14 12:37:08 GMT 2011


Hi,

Thanks for your answer. Do you think that the canonical transcript stable ID will be available through BioMart in upcoming Ensembl releases? If so, I will continue to work with release 64 and wait until release 66. Or do you think that this removal is definitive, in which case I will have to find an alternative solution such as using the Perl API, as you suggest (though it bothers me to have a mix of BioMart and Perl API scripts)?

Vincent
 
On 14 Dec 2011, at 13:22, rhoda at ebi.ac.uk wrote:

> Hi Vincent
> The canonical transcript stable id is no longer available in BioMart as
> the changes made to the core schema (merging of the stable_id tables with
> their parent tables) made it impossible to add this data using the
> martbuilder tool. Apologies, this information was accidentally omitted
> from the mart news. You can obtain this information using the perl API if
> you still require it.
> Regards
> Rhoda
> 
> 
>> Hi,
>> 
>> I adapted my code to handle the special case of the devil, so my first
>> script now works. But I encounter a problem when I try to collect the
>> canonical transcript ID. This attribute was available in release 64:
>> <Dataset name = "hsapiens_gene_ensembl" interface = "default" >
>> 		<Attribute name = "ensembl_gene_id" />
>> 		<Attribute name = "canonical_transcript_stable_id" />
>> 	</Dataset>
>> 
>> but it seems to have disappeared in release 65 (my script returns an
>> error). I checked the Ensembl web interface of BioMart: this attribute
>> was present in the graphical interface of v64 but has disappeared from
>> the interface of release 65. Is it a bug, or is the same information
>> available somewhere else under a new name?
>> 
>> Thank you for your help,
>> 
>> Vincent
>> 
>> On 14 Dec 2011, at 11:24, rhoda at ebi.ac.uk wrote:
>> 
>>> Hi Vincent
>>> I am glad that your script now works. With regard to the attribute name
>>> for the Tasmanian devil, this is just an internal naming of this
>>> attribute in the configuration for the website and does not make the
>>> data retrieved from this column in the mart database any less reliable.
>>> We have recently had a discussion with the Ensembl Compara team and are
>>> planning to tidy up the configuration for release 66, and I will make
>>> sure that the internal naming is more consistent from release 66 onward.
>>> Thank you for your feedback.
>>> Regards
>>> Rhoda
>>> 
>>> 
>>>> Hi,
>>>> 
>>>> I launched my script this morning on your server and it works fine,
>>>> except for the Tasmanian devil. The orthology field for this species
>>>> does not use the same naming convention as the other species:
>>>> <Dataset name = "hsapiens_gene_ensembl" interface = "default" >
>>>> 		<Attribute name = "ensembl_gene_id" />
>>>> 		<Attribute name = "ensembl_transcript_id" />
>>>> 		<Attribute name = "devil_ensembl_gene" />
>>>> 		<Attribute name = "homolog_shar__dm_description_4014" />
>>>> 		<Attribute name = "cat_ensembl_gene" />
>>>> 		<Attribute name = "cat_orthology_type" />
>>>> 		<Attribute name = "chimp_ensembl_gene" />
>>>> 		<Attribute name = "chimp_orthology_type" />
>>>> 	</Dataset>
>>>> So I have to make a special case for this species. This is a minor
>>>> problem, but I was wondering whether it has a special meaning (e.g. a
>>>> less reliable homology prediction?).
>>>> 
>>>> thanks for your help,
>>>> 
>>>> Vincent
>>>> 
>>>> On 14 Dec 2011, at 09:05, rhoda at ebi.ac.uk wrote:
>>>> 
>>>>> Hi Vincent
>>>>> Unfortunately, someone has been hitting our mart servers with a lot of
>>>>> queries over the past few days and we are trying to resolve the
>>>>> connectivity issue. Can you try your query again today and let me know
>>>>> if you can retrieve your results? You will need to keep an eye on
>>>>> www.biomart.org to determine when this is updated with the new release
>>>>> 65 databases, or perhaps you could subscribe to the biomart users
>>>>> mailing list (users at biomart.org). The BioMart team are generally
>>>>> quite quick to update these databases once we have added them to the
>>>>> public mysql server and they usually let me know when they have been
>>>>> added to the biomart central portal. I then email the biomart users
>>>>> mailing list and bioconductor mailing list to let everyone know about
>>>>> the fixes and additions in the new marts. I hope that helps.
>>>>> Regards
>>>>> Rhoda
>>>>> 
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> Thank you very much for this answer. I tried to set the path to
>>>>>> "http://www.ensembl.org/biomart/martservice?" but I got the error
>>>>>> message "too many connections". It is thus probably wiser to wait a
>>>>>> couple of days until "http://www.biomart.org/biomart/martservice?" is
>>>>>> updated and the Ensembl server is less loaded. Is there a way to know
>>>>>> when this server is updated (apart from launching a request that gives
>>>>>> different results on Ensembl v64 and v65)?
>>>>>> 
>>>>>> Thank you again. I really appreciate the responsiveness of the Ensembl
>>>>>> team on this forum; I think this is part of Ensembl's success.
>>>>>> 
>>>>>> sincerely,
>>>>>> 
>>>>>> Vincent
>>>>>> 
>>>>>> 
>>>>>> On 13 Dec 2011, at 13:53, Rhoda Kinsella wrote:
>>>>>> 
>>>>>>> Hi Vincent
>>>>>>> In your webExample.pl script you are pointing to
>>>>>>> "http://www.biomart.org/biomart/martservice?" which should always
>>>>>>> point to the most recent Ensembl release. As it has only been a few
>>>>>>> days since the Ensembl release 65, the www.biomart.org central portal
>>>>>>> has not yet been updated to include the new databases. I expect that
>>>>>>> these will be updated some time this week. If you would like to use
>>>>>>> the Ensembl release 65 mart databases, you should set your path to
>>>>>>> "http://www.ensembl.org/biomart/martservice?" and then run your query.
>>>>>>> To obtain various archive releases, first determine the URL for the
>>>>>>> archive you wish to access using the following link:
>>>>>>> 
>>>>>>> http://www.ensembl.org/info/website/archives/index.html
>>>>>>> 
>>>>>>> If you select Ensembl release 63 from the list on the right hand side
>>>>>>> of the screen, and click on the BioMart link at the top of the page,
>>>>>>> this will bring you to this URL:
>>>>>>> 
>>>>>>> http://jun2011.archive.ensembl.org/biomart/martview/
>>>>>>> 
>>>>>>> You can use this URL in the webExample.pl script if you modify the URL
>>>>>>> to have "/martservice?" at the end of the URL, like this:
>>>>>>> 
>>>>>>> http://jun2011.archive.ensembl.org/biomart/martservice?
>>>>>>> 
>>>>>>> This will allow you to query the Ensembl release 63 marts. I hope this
>>>>>>> helps but please don't hesitate to contact me if you have further
>>>>>>> questions.
>>>>>>> Regards
>>>>>>> Rhoda
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 13 Dec 2011, at 12:37, vincent ranwez wrote:
>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> We are using XML BioMart queries (and a small Perl script to launch
>>>>>>>> them) to collect some Ensembl information. I understand that Ensembl
>>>>>>>> and BioMart are versioned separately, but I would like to know which
>>>>>>>> Ensembl version is queried when using an XML query, and how to query
>>>>>>>> a particular version of Ensembl. It seems that my XML queries are run
>>>>>>>> against Ensembl v64. Is there a way to query v65 by modifying either
>>>>>>>> the XML file (with the virtualSchemaName attribute?) or the Perl
>>>>>>>> script (provided at the end of this mail)?
>>>>>>>> 
>>>>>>>> The BioMart web site provides an example for checking the default
>>>>>>>> configuration:
>>>>>>>> http://www.biomart.org/biomart/martservice?type=configuration&dataset=hsapiens_gene_ensembl
>>>>>>>> but this only gives the genome assembly version, not the Ensembl
>>>>>>>> version. For instance, this web page indicates that Homo sapiens genes
>>>>>>>> (GRCh37.p5) is used, but this is common to both versions 64 and 65 of
>>>>>>>> Ensembl, which give different results for a simple query such as the
>>>>>>>> list of human gene IDs and transcript IDs (v64: 178,538 results; v65:
>>>>>>>> 181,745). Moreover, this does not explain how to use a specific
>>>>>>>> configuration pointing to a given Ensembl release.
>>>>>>>> 
>>>>>>>> I hope you can help me to solve this problem.
>>>>>>>> 
>>>>>>>> sincerely,
>>>>>>>> 
>>>>>>>> Vincent Ranwez
>>>>>>>> 
>>>>>>>> 
>>>>>>>> ###################################
>>>>>>>> Perl script used to run XML query files generated via the Ensembl web
>>>>>>>> interface of BioMart
>>>>>>>> ###################################
>>>>>>>> 
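>>>>>>>> use strict;
>>>>>>>> use warnings;
>>>>>>>> use LWP::UserAgent;
>>>>>>>> use HTTP::Request;
>>>>>>>> use HTTP::Headers;
>>>>>>>> 
>>>>>>>> my $usage = "\nUsage: perl webExample.pl Query.xml outputFile\n\n";
>>>>>>>> die $usage unless @ARGV == 2;
>>>>>>>> my ($queryFile, $fileRes) = @ARGV;
>>>>>>>> 
>>>>>>>> # Read the whole XML query.
>>>>>>>> open(my $in, '<', $queryFile) or die "Cannot read $queryFile: $!\n";
>>>>>>>> my $xml = do { local $/; <$in> };
>>>>>>>> close($in);
>>>>>>>> 
>>>>>>>> # Check up front that the output file can be opened for appending.
>>>>>>>> open(my $out, '>>', $fileRes) or die "Cannot write $fileRes: $!\n";
>>>>>>>> 
>>>>>>>> # POST the query; change $path to target another server or release.
>>>>>>>> my $path = "http://www.biomart.org/biomart/martservice?";
>>>>>>>> my $request = HTTP::Request->new("POST", $path, HTTP::Headers->new(),
>>>>>>>>     'query=' . $xml . "\n");
>>>>>>>> my $ua = LWP::UserAgent->new;
>>>>>>>> 
>>>>>>>> # Stream the response to a temporary file, then append it to the output.
>>>>>>>> my $tmp = "${fileRes}_tmp";
>>>>>>>> my $response = $ua->request($request, $tmp);
>>>>>>>> die "Query failed: " . $response->status_line . "\n"
>>>>>>>>     unless $response->is_success;
>>>>>>>> open(my $res, '<', $tmp) or die "Cannot read $tmp: $!\n";
>>>>>>>> print {$out} $_ while <$res>;
>>>>>>>> close($res);
>>>>>>>> close($out);
>>>>>>>> unlink $tmp;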
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> Dev mailing list    Dev at ensembl.org
>>>>>>>> List admin (including subscribe/unsubscribe):
>>>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>>> Ensembl Blog: http://www.ensembl.info/
>>>>>>> 
>>>>>>> Rhoda Kinsella Ph.D.
>>>>>>> Ensembl Production Project Leader,
>>>>>>> European Bioinformatics Institute (EMBL-EBI),
>>>>>>> Wellcome Trust Genome Campus,
>>>>>>> Hinxton
>>>>>>> Cambridge CB10 1SD,
>>>>>>> UK.
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 
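Rhoda's reply of 13 Dec describes turning an archive "martview" page URL into the matching "martservice?" endpoint. A minimal sketch of that rewrite, assuming archive URLs always have the form ".../biomart/martview/" as in the release 63 example; the helper name martservice_url is hypothetical and not part of any BioMart tool:

```perl
use strict;
use warnings;

# Hypothetical helper: turn an archive BioMart page URL
# (".../biomart/martview/") into the web-service endpoint
# (".../biomart/martservice?") expected by webExample.pl.
sub martservice_url {
    my ($url) = @_;
    $url =~ s{/biomart/martview/?.*\z}{/biomart/martservice?}s
        or die "Not a martview URL: $url\n";
    return $url;
}

# Run only when URLs are given on the command line.
if (@ARGV) {
    print martservice_url($_), "\n" for @ARGV;
}
```

The returned string can be assigned to $path in the script quoted above to query that archive release.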

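The Tasmanian devil special case discussed above can be kept in one place by generating the attribute list instead of hand-editing each XML file. This is only a sketch: the override table holds the single release 65 exception reported in this thread, the sub names are hypothetical, and the <Query> wrapper attributes are assumed to match what the web interface typically exports, so they should be checked against a query file exported from the release being used.

```perl
use strict;
use warnings;

# Release 65 exception reported in this thread: the devil's orthology type
# is exposed under an internal name rather than "devil_orthology_type".
my %ORTHOLOGY_TYPE_OVERRIDE = (
    devil => 'homolog_shar__dm_description_4014',
);

# Return the gene and orthology-type attribute names for each species prefix.
sub orthology_attributes {
    my (@species) = @_;
    return map {
        ( "${_}_ensembl_gene",
          $ORTHOLOGY_TYPE_OVERRIDE{$_} // "${_}_orthology_type" )
    } @species;
}

# Build a BioMart XML query for one dataset and a list of attribute names.
# Assumption: the <Query> attributes below mirror a typical exported query.
sub build_query {
    my ($dataset, @attributes) = @_;
    my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n}
            . qq{<Query virtualSchemaName = "default" formatter = "TSV" }
            . qq{header = "0" uniqueRows = "0" datasetConfigVersion = "0.6" >\n}
            . qq{\t<Dataset name = "$dataset" interface = "default" >\n};
    $xml .= qq{\t\t<Attribute name = "$_" />\n} for @attributes;
    $xml .= qq{\t</Dataset>\n</Query>\n};
    return $xml;
}

# Run only when species prefixes (e.g. devil cat chimp) are given as arguments.
if (@ARGV) {
    print build_query('hsapiens_gene_ensembl',
        'ensembl_gene_id', 'ensembl_transcript_id',
        orthology_attributes(@ARGV));
}
```

If the naming is made consistent in release 66 as planned, the override entry can simply be removed.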

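For the canonical transcript stable ID that is no longer in the release 65 mart, Rhoda suggests the Perl API. A rough sketch of what that could look like, assuming the Ensembl core API is installed and that the public server ensembldb.ensembl.org with anonymous access is used; canonical_transcript_ids is a hypothetical helper, and the release queried by default follows the installed API version:

```perl
use strict;
use warnings;

# Hypothetical helper: map gene stable IDs to canonical transcript stable IDs.
# $gene_adaptor is expected to behave like an Ensembl core gene adaptor.
sub canonical_transcript_ids {
    my ($gene_adaptor, @gene_ids) = @_;
    my %canonical;
    for my $gene_id (@gene_ids) {
        my $gene = $gene_adaptor->fetch_by_stable_id($gene_id);
        next unless defined $gene;    # ID not found in this release: skip it
        $canonical{$gene_id} = $gene->canonical_transcript->stable_id;
    }
    return %canonical;
}

# Run only when gene stable IDs are given on the command line.
if (@ARGV) {
    # Loaded at run time so the helper above compiles without the Ensembl API.
    require Bio::EnsEMBL::Registry;

    # Assumption: public Ensembl MySQL server, anonymous read-only access.
    Bio::EnsEMBL::Registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );
    my $gene_adaptor =
        Bio::EnsEMBL::Registry->get_adaptor('Human', 'Core', 'Gene');

    my %canonical = canonical_transcript_ids($gene_adaptor, @ARGV);
    print "$_\t$canonical{$_}\n" for sort keys %canonical;
}
```

The tab-separated output has the same two columns as the release 64 BioMart query (gene ID, canonical transcript ID), so it could be joined with the other BioMart results on the gene ID.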


