[ensembl-dev] error retrieving annotated features from Ensembl Regulation 61

Rhoda Kinsella rhoda at ebi.ac.uk
Wed Apr 13 11:00:41 BST 2011


Hi Fiona
Unfortunately these NULL rows are a side effect of the way we are  
currently forced to create the regulation mart due to the complexity  
of the regulation schema and BioMart software limitations. I agree  
that we need to make users more aware that they need to select a  
filter and we will work to make any improvements we can for release  
63. Many thanks for your feedback about this.
Regards
Rhoda


On 13 Apr 2011, at 10:40, Fiona Nielsen wrote:

> Hi Nathan,
>
> In BioMart web interface:
> - The default behaviour I see when making a query to the Regulatory
> Features, when not selecting any filters, is to return all rows, e.g.
> all regulatory features.
> - When making a new query choosing Annotated Features, not selecting
> any filters, you get a set of empty rows back.
>
> This is what I found very confusing, and I wonder why it was made like
> this. If the Annotated Features cannot be queried without setting any
> filter, it would be nice to have at least a warning message saying
> so.
>
> Thanks,
>
> -Fiona-
>
>
> On Wed, Apr 13, 2011 at 11:33 AM, Nathan Johnson  
> <njohnson at ebi.ac.uk> wrote:
>> I'm not sure how the annotated and reg feat behaviour is  
>> different.  Can you explain?
>>
>> We are actively thinking about how to fix this, so any user input  
>> is very valuable.
>>
>> Thanks
>>
>> Nath
>>
>>
>> On 12 Apr 2011, at 10:44, Fiona Nielsen wrote:
>>
>>> Hi Nathan,
>>>
>>> So the BioMart problem is solved: it was not a bug but a
>>> non-intuitive user interface (in my opinion, since the default
>>> behaviour is different for regulatory vs annotated features).
>>>
>>> The API problem:
>>> I re-downloaded the funct-genomics API for version 53 to go with the
>>> rest of my code made for version 53, and it fixed the weird error
>>> message from before. I guess I must have messed up the package
>>> versions before, e.g. something like using the 61 version of
>>> funct-genomics with version 53 of the core API.
>>>
>>> Thanks for your help.
>>>
>>> -Fiona-
>>>
>>> On Fri, Apr 8, 2011 at 4:17 PM, Nathan Johnson  
>>> <njohnson at ebi.ac.uk> wrote:
>>>> Hi Fiona
>>>>
>>>> The mart query is not returning anything as you have not selected
>>>> any feature filters.  You need to pick some from exactly one of the
>>>> three feature sections, not from more than one. If you want
>>>> some annotated features, select your region and the feature sets  
>>>> or feature types you require. There are nearly 300 different sets  
>>>> in there, so be mindful that it may take a while if you use  
>>>> permissive filters.
>>>>
>>>> The API bug is curious.  Can you send me the code which loads the  
>>>> registry?  I suspect you are looking at old DBs with a more  
>>>> recent API.
>>>>
>>>> Nath
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 8 Apr 2011, at 14:43, Fiona Nielsen wrote:
>>>>
>>>>> Hi Nathan,
>>>>>
>>>>> The output from adding '-verbose => 1':
>>>>> "Will only load v53 databases Species 'saccharomyces_cerevisiae'
>>>>> loaded from database 'saccharomyces_cerevisiae_core_53_1i'  ..."  
>>>>> etc
>>>>>
>>>>> Maybe I have a mess-up of files between the 53 and 59 version?
>>>>>
>>>>>
>>>>> To reproduce the result of empty rows from BioMart:
>>>>> 1. make a new query
>>>>> 2. Choose Ensembl Regulation 61
>>>>> 3. Choose Homo Sapiens
>>>>> (3a. click Attributes to see that Annotated Features is selected,
>>>>> but don't change anything)
>>>>> 4. Click 'Results'
>>>>>
>>>>> The left hand panel from BioMart that returns empty rows:
>>>>> Dataset
>>>>> Homo sapiens features (GRCh37.p2)
>>>>> Homo sapiens features (GRCh37.p2)
>>>>> Filters
>>>>> [None selected]
>>>>> Attributes
>>>>> Feature Set
>>>>> Feature Type
>>>>> Chromosome Name
>>>>> Start (bp)
>>>>> End (bp)
>>>>> Cell Type
>>>>> Feature Type
>>>>> Chromosome Name
>>>>> Start (bp)
>>>>> End (bp)
>>>>> Regulatory Stable ID
>>>>> Cell Type
>>>>> [6 enabled][6 enabled]
>>>>> Feature Set
>>>>> Feature Type
>>>>> Display Label
>>>>> Chromosome Name
>>>>> Start (bp)
>>>>> End (bp)
>>>>> [6 enabled][6 enabled]
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Fiona-
>>>>>
>>>>>
>>>>> On Fri, Apr 8, 2011 at 3:27 PM, Nathan Johnson  
>>>>> <njohnson at ebi.ac.uk> wrote:
>>>>>> We added gender support at the start of last year, for v60 I  
>>>>>> think.
>>>>>>
>>>>>> I'm not sure you are using the v53 API. I have just rerun a
>>>>>> snippet from your code using v53 and it gives a different error,
>>>>>> as the fetch_all_by_feature_class method was added after v53
>>>>>> (v59 I think).
>>>>>>
>>>>>> To find out, add '-verbose => 1' to your registry load method.
>>>>>> This will list the DBs being loaded, which should match your
>>>>>> API version, if you have not overridden it.
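
The check described above can be sketched in plain Perl. This is only an illustration, not part of the Ensembl API: the helper names release_from_dbname and release_matches_api are hypothetical, and the sketch assumes database names follow the species_group_release_assembly pattern seen elsewhere in this thread (e.g. 'saccharomyces_cerevisiae_core_53_1i'). In a real script the API version would normally come from software_version() in Bio::EnsEMBL::ApiVersion; here it is passed in by hand so the sketch runs without the Ensembl modules.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the release number out of a database name
# such as 'saccharomyces_cerevisiae_core_53_1i'. Returns undef if the
# name does not end in _<release>_<assembly>.
sub release_from_dbname {
    my ($dbname) = @_;
    return $dbname =~ /_(\d+)_[0-9a-z]+\z/i ? $1 : undef;
}

# Hypothetical helper: true if the database release equals the API version.
sub release_matches_api {
    my ($dbname, $api_version) = @_;
    my $release = release_from_dbname($dbname);
    return defined $release && $release == $api_version;
}

# Compare one database name (as listed by '-verbose => 1') against two
# candidate API versions.
my $db = 'saccharomyces_cerevisiae_core_53_1i';
for my $api (53, 61) {
    printf "%s with API v%d: %s\n", $db, $api,
        release_matches_api($db, $api) ? 'match' : 'MISMATCH';
}
```

A mismatch reported by a check like this would suggest that the API checkout and the databases being loaded come from different releases.
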
>>>>>>
>>>>>>
>>>>>> Regarding the BioMart query: the link doesn't work for me. Can
>>>>>> you copy and paste the filters and attributes from the left hand
>>>>>> panel?
>>>>>>
>>>>>> I suspect you are trying to select filters from more than one  
>>>>>> 'Feature' section, which will (at present) always return no  
>>>>>> data. This is because the mart filters use 'AND' logic e.g.
>>>>>>        'RegulatoryFeature filter' AND 'AnnotatedFeature filter'.
>>>>>>
>>>>>> As any one result cannot be both a RegulatoryFeature and an
>>>>>> AnnotatedFeature, there will be no results.  We have put in a
>>>>>> request with the BioMart developers to change this, so
>>>>>> hopefully we can make this a little easier to use in future.
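
The 'AND' behaviour described above can be illustrated with a small Perl sketch. It does not use BioMart at all: the rows, the passes_all helper and the two filter closures are made up for illustration, under the assumption that every result row belongs to exactly one feature class.

```perl
use strict;
use warnings;

# Hypothetical result rows: each row belongs to exactly one feature class.
my @rows = (
    { class => 'RegulatoryFeature', name => 'reg_1' },
    { class => 'AnnotatedFeature',  name => 'ann_1' },
    { class => 'AnnotatedFeature',  name => 'ann_2' },
);

# A row is kept only if every filter accepts it (AND logic).
sub passes_all {
    my ($row, @filters) = @_;
    for my $filter (@filters) {
        return 0 unless $filter->($row);
    }
    return 1;
}

# One filter per 'Feature' section.
my $is_regulatory = sub { $_[0]{class} eq 'RegulatoryFeature' };
my $is_annotated  = sub { $_[0]{class} eq 'AnnotatedFeature' };

# Filtering within one section returns rows; combining filters from two
# sections returns nothing, because no row is in both classes.
my @one_section  = grep { passes_all($_, $is_annotated) } @rows;
my @two_sections = grep { passes_all($_, $is_regulatory, $is_annotated) } @rows;

printf "AnnotatedFeature filter only: %d rows\n", scalar @one_section;
printf "Both sections combined:       %d rows\n", scalar @two_sections;
```

With these made-up rows the single-section query keeps two rows and the combined query keeps none, which mirrors the empty result described above.
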
>>>>>>
>>>>>> Nath
>>>>>>
>>>>>>
>>>>>> On 8 Apr 2011, at 13:48, Fiona Nielsen wrote:
>>>>>>
>>>>>>> Hi Nathan,
>>>>>>>
>>>>>>> My API version is not up-to-date:
>>>>>>> The example was run using the Ensembl 53 API and the DB at  
>>>>>>> ensembldb.ensembl.org
>>>>>>> I will try to run it again after updating all the API packages.
>>>>>>>
>>>>>>> Can you see from what version these functions are available?
>>>>>>>
>>>>>>> and do you have an explanation for the missing data in BioMart?
>>>>>>> http://www.ensembl.org/biomart/martview/e809b93c2b06f0bca4dfda066952fa56/e809b93c2b06f0bca4dfda066952fa56?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_feature_set.default.annotated_feature.fs_display_label_1048|hsapiens_feature_set.default.annotated_feature.feature_type_name_1048|hsapiens_feature_set.default.annotated_feature.seq_region_name_1048|hsapiens_feature_set.default.annotated_feature.seq_region_start_1048|hsapiens_feature_set.default.annotated_feature.seq_region_end_1048|hsapiens_feature_set.default.annotated_feature.cell_type_name_1048|hsapiens_feature_set.default.annotated_feature.feature_type_class_1048|hsapiens_feature_set.default.annotated_feature.feature_type_description_1048|hsapiens_feature_set.default.annotated_feature.cell_type_display_label_1048|hsapiens_feature_set.default.annotated_feature.cell_type_description_1048&FILTERS=&VISIBLEPANEL=resultspanel
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> -Fiona-
>>>>>>>
>>>>>>> On Fri, Apr 8, 2011 at 2:32 PM, Nathan Johnson
>>>>>>> <njohnson at ebi.ac.uk> wrote:
>>>>>>>> Hello Fiona
>>>>>>>>
>>>>>>>> I have tested this and can't recreate the error.  I suspect
>>>>>>>> this may be an API version issue, as the gender attribute was
>>>>>>>> added in the last 12 months I think.
>>>>>>>>
>>>>>>>> What version API and DBs are you using?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> Nath
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8 Apr 2011, at 12:47, Fiona Nielsen wrote:
>>>>>>>>
>>>>>>>>> I am trying to extract Ensembl data related to the annotated  
>>>>>>>>> features
>>>>>>>>> in the Ensembl Regulation database.
>>>>>>>>>
>>>>>>>>> The function to access regulatory features is executed with  
>>>>>>>>> no error:
>>>>>>>>>
>>>>>>>>> # set database adaptor
>>>>>>>>> my $efg_db       = $registry->get_DBAdaptor('Human', 'funcgen');
>>>>>>>>>
>>>>>>>>> # retrieve feature set from database
>>>>>>>>> my $fset_adaptor = $registry->get_adaptor('Human','funcgen','featureset');
>>>>>>>>>
>>>>>>>>> # retrieve all regulatory FeatureSets
>>>>>>>>> # 'annotated', 'regulatory' or 'supporting'
>>>>>>>>> my @rf_fsets = @{$fset_adaptor->fetch_all_by_feature_class('regulatory')};
>>>>>>>>>
>>>>>>>>> foreach my $rf_fset(@rf_fsets){
>>>>>>>>>    print $rf_fset->name.",";
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> but when trying to access the annotated features, I get an
>>>>>>>>> error message:
>>>>>>>>>
>>>>>>>>> # set database adaptor
>>>>>>>>> my $efg_db       = $registry->get_DBAdaptor('Human', 'funcgen');
>>>>>>>>>
>>>>>>>>> # retrieve feature set from database
>>>>>>>>> my $fset_adaptor = $registry->get_adaptor('Human','funcgen','featureset');
>>>>>>>>>
>>>>>>>>> # retrieve all annotated FeatureSets
>>>>>>>>> # 'annotated', 'regulatory' or 'supporting'
>>>>>>>>> my @rf_fsets = @{$fset_adaptor->fetch_all_by_feature_class('annotated')};
>>>>>>>>>
>>>>>>>>> DBD::mysql::st execute failed: Unknown column 'ct.gender' in
>>>>>>>>> 'field list' at ../ensembl/modules//Bio/EnsEMBL/DBSQL/BaseAdaptor.pm
>>>>>>>>> line 477.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I tried to have a look in BioMart, and if I select "Homo
>>>>>>>>> Sapiens features" and "annotated features" there, I am only
>>>>>>>>> served empty rows in the result.
>>>>>>>>> Are these data sets not available, or is there some bug in
>>>>>>>>> the API to retrieve them?
>>>>>>>>>
>>>>>>>>> Any comments and suggestions are welcome.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> -Fiona-
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Dev mailing list
>>>>>>>>> Dev at ensembl.org
>>>>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>>>>
>>>>>>>> Nathan Johnson
>>>>>>>> Senior Scientific Programmer
>>>>>>>> Ensembl Regulation
>>>>>>>> European Bioinformatics Institute
>>>>>>>> Wellcome Trust Genome Campus
>>>>>>>> Hinxton
>>>>>>>> Cambridge CB10 1SD
>>>>>>>>
>>>>>>>> http://www.ensembl.info/
>>>>>>>> http://twitter.com/#!/ensembl
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>

Rhoda Kinsella Ph.D.
Ensembl Bioinformatician,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.