[ensembl-dev] accessing the tilepath entries programmatically

Thibaut Hourlier thibaut at ebi.ac.uk
Tue Jun 30 14:12:02 BST 2015


My first question should have been: which assembly are you using?

So yes, this will work for GRCh37. Unfortunately it will not work for GRCh38, but this is something that we will fix for release 82.

In the case of GRCh38 it is still possible, but more complicated. It should work by getting the feature slice and then projecting it onto the clone coordinate system:

my $subSlice = $misc_clone->feature_Slice;
my $projectionSegments = $subSlice->project('clone');  # array ref of projection segments
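
A minimal sketch of that projection, assuming $misc_clone is a Bio::EnsEMBL::MiscFeature that has already been fetched. The helper name clone_segments, the output columns and the command-line handling are illustrative assumptions, not part of the Ensembl API, and whether the 'clone_name' lookup returns anything on GRCh38 has not been checked here:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): project a MiscFeature's
# feature slice onto the 'clone' coordinate system and return one
# tab-separated row per projection segment.
sub clone_segments {
    my ($misc_clone) = @_;

    my $sub_slice = $misc_clone->feature_Slice;
    my @rows;

    # project() returns an array ref of projection segments; to_Slice gives
    # the piece of the clone that each segment maps to.
    foreach my $segment ( @{ $sub_slice->project('clone') } ) {
        my $clone_slice = $segment->to_Slice;
        push @rows, join "\t",
            'chr' . $sub_slice->seq_region_name,
            $misc_clone->seq_region_start,
            $misc_clone->seq_region_end,
            $clone_slice->seq_region_name,
            $clone_slice->start,
            $clone_slice->end;
    }
    return @rows;
}

# Only runs when a clone name is given on the command line.
# Assumption: the public Ensembl MySQL server with anonymous access.
if ( my $name = shift @ARGV ) {
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );

    my $mfa = $registry->get_adaptor( 'Human', 'Core', 'MiscFeature' );
    my $misc_clones = $mfa->fetch_all_by_attribute_type_value( 'clone_name', $name );

    foreach my $misc_clone ( @{$misc_clones} ) {
        print "$_\t$name\n" for clone_segments($misc_clone);
    }
}
```

Saved under a name of your choice and run with a clone name such as RP11-100N21 as its only argument, this would print the chromosome coordinates of the feature followed by the name and coordinates of each clone segment it projects onto. A feature that spans several clones gives one row per segment.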

Cheers
Thibaut

> On 30 Jun 2015, at 13:56, Duarte Molha <duartemolha at gmail.com> wrote:
> 
> Never mind... After searching for MiscFeature information, I found the relevant part in the API tutorial.
> 
> Just for reference, for anyone who has the same difficulties, here is the relevant portion of the code I used:
> (Thibaut, please let me know if I did something wrong)
> 
> my $mf_adaptor         = $registry->get_adaptor( 'Human', 'Core', 'MiscFeature' );
> 
> open( my $in, '<', $options->{list} ) or die "Could not open " . $options->{list} . " for reading: $!\n";
> my @input_queries = <$in>;
> close $in;
> 
> foreach my $query (@input_queries){
> 	chomp $query;
> 	my $clones =  $mf_adaptor->fetch_all_by_attribute_type_value( 'clone_name', $query );
> 
> 	while ( my $clone = shift @{$clones} ) {
> 		my $slice = $clone->slice();
> 		print join "\t", ("chr".$slice->seq_region_name(), $clone->start(), $clone->end() , $query."\n"); 
> 	}
> }
> 
> 
> Best regards
> 
> Duarte
> 
> =========================
>      Duarte Miguel Paulo Molha      
>          http://about.me/duarte
> =========================
> 
> On 30 June 2015 at 13:26, Duarte Molha <duartemolha at gmail.com> wrote:
> Many thanks Thibaut
> 
> So... with regard to your question...
> 
> How can I query a specific clone and get its correct coordinates if I know the clone ID?
> 
> For example, assuming this clone:
>  RP11-100N21
> 
> In other words, how do I query the underlying clone dataset and output those clones in genomic coordinates?
> 
> Many thanks
> 
> Duarte
> 
> On 30 June 2015 at 13:15, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
> Hi Duarte,
> The clone names are stored in the misc_* tables, so you need to use the MiscFeatureAdaptor (http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1DBSQL_1_1MiscFeatureAdaptor.html):
> 
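> # $slice is the region of interest, e.g. from $slice_adaptor->fetch_by_region('chromosome', '4')
> my $misc_clones = $mfa->fetch_all_by_Slice_and_set_code($slice, 'tilepath');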
> foreach my $clone (@$misc_clones) {
>  print join("\t", $clone->slice->seq_region_name, $clone->start, $clone->end, @{$clone->get_all_attribute_values('name')}), "\n";
> }
> 
> A warning though: this is the tilepath, so the boundaries of the clones can differ from those of the contigs/clones in the assembly, because sometimes the entire clone was not used for the assembly.
> 
> Hope this helps
> 
> Thibaut
> 
> > On 30 Jun 2015, at 11:50, Duarte Molha <duartemolha at gmail.com> wrote:
> >
> > I used this code to get the genomic coordinates of all your subcontigs:
> >
> >
> > my @slices = @{ $slice_adaptor->fetch_all('clone') };
> > foreach my $slice (@slices){
> >       $progress->update();
> >       my $clone_name =  $slice->seq_region_name();
> >       my $projection = $slice->project('toplevel');
> >       foreach my $segment ( @{$projection} ) {
> >               my $to_slice = $segment->to_Slice();
> >               print join "\t", ("chr".$to_slice->seq_region_name(), $to_slice->start(), $to_slice->end(), $clone_name."\n");
> >       }
> > }
> >
> > However, this does not return the original clone name.
> >
> > For example, using this script I get:
> > chr4    47567235        47733411        AC092597.1
> >
> > However I would like to get :
> >
> > chr4    47567235        47733411        RP11-100N21
> >
> > Can someone explain what I am doing wrong?
> >
> > Thanks
> >
> > Duarte
> >
> > On 30 June 2015 at 09:45, Duarte Molha <duartemolha at gmail.com> wrote:
> > Dear devs
> >
> > How can I search for a specific clone ID present in your tilepath?
> >
> > For example: RP5-892C22
> >
> > I would like to use the perl API if possible
> >
> > Many thanks
> >
> > Duarte
> >
> > _______________________________________________
> > Dev mailing list    Dev at ensembl.org
> > Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> > Ensembl Blog: http://www.ensembl.info/
> 
