[ensembl-dev] Gata1

Ian Dunham dunham at ebi.ac.uk
Tue Aug 9 09:55:53 BST 2011


Two points:

1. You can also get at any annotated features in the database (including
those not in the regulatory build) using the code that is given in the
tutorial.

my $fset_adaptor = $registry->get_adaptor("Homo sapiens", 'funcgen',
'featureset');

my @reg_fsets = @{$fset_adaptor->fetch_all_by_type('annotated')};

foreach my $reg_fset (@reg_fsets) {
	#Feature set name
	print $reg_fset->name."\n";
	#Feature class of the set
	print $reg_fset->feature_class."\n";
	#Logic name of the analysis that produced the set
	print $reg_fset->analysis->logic_name."\n";
	#Feature type (the assayed factor or mark)
	print $reg_fset->feature_type->name."\n";
	#Cell type the set was generated in
	print $reg_fset->cell_type->name."\n";
}

You can then match $reg_fset->name against Gata1 in this case.  This is
perhaps a more general approach if one wants to get at all the datasets,
regardless of whether they are in the regulatory build.
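
As a rough sketch of that matching step (not taken from the tutorial), the
names can be filtered with a case-insensitive grep. The helper
names_matching_factor and the two HYPOTHETICAL_ names are made up for
illustration; only the K562b_Gata1 name is a real set name, taken from
Daniel's mail below. In real use the list would be built from
$reg_fset->name in the loop above.

```perl
use strict;
use warnings;

# Return the set names that mention a given factor (case-insensitive).
# Hypothetical helper, not part of the Ensembl API.
sub names_matching_factor {
    my ($factor, @names) = @_;
    return grep { /\Q$factor\E/i } @names;
}

# Only the first name is a real set name (quoted later in this thread);
# the other two are made-up placeholders.
my @set_names = (
    'K562b_Gata1_ENCODE_Yale_SWEMBL_R015',
    'HYPOTHETICAL_CellA_FactorX_set',
    'HYPOTHETICAL_CellB_FactorY_set',
);

my @gata1_names = names_matching_factor('Gata1', @set_names);
print "$_\n" foreach @gata1_names;
```

Each name that comes back can then be passed to fetch_by_name on the
featureset adaptor, as in Daniel's example.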

2. Mainly for Daniel: ENCODE recently resolved the K562b issue (and the
other *b cell lines).  It turns out that the distinction between K562 and
K562b was largely the result of a misunderstanding between production
groups and the data coordination centre, and in fact there is no
difference in the cells.  So K562b designations will now be treated as
K562.  The original false distinction led to all sorts of other
divisions in the data, including K562b data not being analysed at
various stages. Thankfully we can now ignore that, and we should update
the database to reflect this - maybe too late now for e64, but it should
be done for e65. I apologise for not passing this on earlier - it slipped
my mind with all the other ENCODE-related activity.

Cheers
Ian

Ian Dunham M.A. D.Phil

European Bioinformatics Institute (EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge, CB10 1SD, UK
Tel:  01223 492636  FAX:  01223 494468
dunham at ebi.ac.uk

http://www.ensembl.info/
http://twitter.com/#!/ensembl


On 08/08/2011 18:50, Daniel Sobral wrote:
> Hi Thomas,
> 
> We have one dataset for Gata1 from ENCODE in Ensembl. We initially
> included this set, as it is for K562, but we later excluded it from our
> regulatory build for K562, because it uses an alternative K562
> cell line (K562b) that differs from the other K562. You can see more
> details on the ENCODE cell-lines here:
> http://genome.ucsc.edu/ENCODE/cellTypes.html
> 
> Since we do not include it in our regulatory build, you cannot access
> the Gata1 data the standard way (from the Regulatory Features).
> But the data is still valid, and we kept it in the database, so you can
> access it directly via its set name.
> (You can get a list of all available set names from the featureset
> adaptor.)
> 
> Here's some example code to access Gata1 sites. Be aware nonetheless
> that datasets not included in the regulatory build are considered
> deprecated and may be removed / changed in the future.
> 
> my $fset_adaptor = $registry->get_adaptor('Homo sapiens', 'funcgen',
> 'featureset');
> 
> my $gata1_fset =
> $fset_adaptor->fetch_by_name("K562b_Gata1_ENCODE_Yale_SWEMBL_R015");
> 
> my @gata1_feats = @{$gata1_fset->get_Features_by_Slice($slice)};
> 
> foreach my $gata_feat (@gata1_feats){
>     my @motifs = @{$gata_feat->get_associated_MotifFeatures()};
>     foreach my $motif (@motifs){
>         # One tab-separated line per motif
>         print join("\t",
>             $motif->seq_region_name, $motif->seq_region_start,
>             $motif->seq_region_end, $motif->seq_region_strand,
>             $motif->score, $motif->binding_matrix->name), "\n";
>     }
> }
> 
> 
> Hope it helps,
> Daniel
> 
> On 08/08/2011 09:35, David Thomas wrote:
>> Hi
>>
>> I'm trying to extract specific TFBS information using the funcgen API.
>> I can find most of the information I need but nothing on Gata1.
>>
>> I drill down as follows:
>>
>> @reg_feats = @{$regfeat_adaptor->fetch_all_by_Slice($slice)};
>> foreach my $rf (@reg_feats){
>>     …..   
>>     foreach my $feature (@{$rf->regulatory_attributes()}){
>>         ….
>>         my @motif_features = @{$rf->regulatory_attributes('motif')};
>>         foreach my $motif_feature (@motif_features) {
>>             ……
>>             my $afs = $motif_feature->associated_annotated_features();   
>>              foreach my $feat (@$afs){
>>                 .....
>>
>> Looking at the displaylabels in each of the levels I can find no
>> information on Gata1.
>>
>> Apologies if this is a naive question but any help gratefully received.
>>
>> David
>>
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> List admin (including subscribe/unsubscribe):
>> http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
> 
> 