[ensembl-dev] accessing the tilepath entries programmatically

Duarte Molha duartemolha at gmail.com
Wed Jul 1 18:15:28 BST 2015


I would still appreciate some help with this query, if possible.
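
For anyone reading the thread below, here is a minimal, untested sketch of how the projection suggestion (feature_Slice followed by project('clone')) might be wrapped in a helper. The helper name clone_projection_rows is hypothetical and not part of the Ensembl API, and the coordinate arithmetic assumes a forward-strand feature. As seen further down the thread, the seq_region_name of a clone slice is an accession such as AC092597.1 rather than the RP11-style name, so this may not be the whole answer.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): project one MiscFeature
# onto the 'clone' coordinate system and return one tab-separated row per
# projected segment.
sub clone_projection_rows {
    my ($misc_feature) = @_;

    # Slice covering exactly the feature, in its current coordinate system
    my $feature_slice = $misc_feature->feature_Slice;

    my @rows;

    # project() returns a reference to a list of ProjectionSegment objects
    for my $segment ( @{ $feature_slice->project('clone') } ) {
        my $clone_slice = $segment->to_Slice;

        # from_start/from_end are relative to the start of $feature_slice;
        # this conversion assumes the feature is on the forward strand
        my $genomic_start = $feature_slice->start + $segment->from_start - 1;
        my $genomic_end   = $feature_slice->start + $segment->from_end - 1;

        # The clone slice name is an accession, not the clone_name attribute
        push @rows, join "\t",
            'chr' . $feature_slice->seq_region_name,
            $genomic_start, $genomic_end,
            $clone_slice->seq_region_name;
    }
    return @rows;
}
```

With a live registry, each feature returned by fetch_all_by_attribute_type_value could be passed to this helper and the returned rows printed one per line.
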
On 30 Jun 2015 16:29, "Duarte Molha" <duartemolha at gmail.com> wrote:

> Thibaut... Could you expand on how I can change my script to make it work
> with the new assembly?
> I have just realised that the reason I am not getting 60 BAC entries is
> that they are only present in GRCh38 and not in GRCh37.
>
> Can you tell me how I can modify my script to work with the new assembly?
>
> I don't seem to understand the projection method you are using.
> Here is the relevant part of my script
>
> my $mf_adaptor = $registry->get_adaptor( 'Human', 'Core', 'MiscFeature' );
>
> # Read the list of clone names, one per line
> open( my $in, '<', $options->{list} )
>     or die "Could not open $options->{list} for reading: $!\n";
> my @input_queries = <$in>;
> close $in;
>
> foreach my $query (@input_queries) {
>     chomp $query;
>     # Fetch every MiscFeature whose 'clone_name' attribute matches the query
>     my $clones = $mf_adaptor->fetch_all_by_attribute_type_value( 'clone_name', $query );
>
>     foreach my $clone ( @{$clones} ) {
>         # start/end are relative to the slice the feature sits on
>         print join( "\t", 'chr' . $clone->slice->seq_region_name, $clone->start, $clone->end, $query ), "\n";
>     }
> }
>
>
> Best regards
>
> Duarte
>
> =========================
>      Duarte Miguel Paulo Molha
>          http://about.me/duarte
> =========================
>
> On 30 June 2015 at 15:46, Duarte Molha <duartemolha at gmail.com> wrote:
>
>> No. That does not fetch anything.
>>
>>
>> On 30 June 2015 at 14:50, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
>>
>>> If you use name instead of clone_name, does it fetch the missing one?
>>>
>>> Cheers
>>> Thibaut
>>>
>>> On 30 Jun 2015, at 14:27, Duarte Molha <duartemolha at gmail.com> wrote:
>>>
>>> Yes, I am using GRCh37, Thibaut, so I am OK for now, but it is
>>> good to know this does not work with the latest assembly.
>>> However, can you please answer my question regarding the missing
>>> clones like RP11-155D3? Why can I not fetch this one when it is clearly
>>> in the database?
>>>
>>> Thanks
>>>
>>> Duarte
>>>
>>>
>>> On 30 June 2015 at 14:12, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
>>>
>>>> My first question should have been: which assembly are you using?
>>>>
>>>> So yes, this will work for GRCh37. Unfortunately it will not work for
>>>> GRCh38, but this is something that we will fix for release 82.
>>>>
>>>> In the case of GRCh38, it is still possible but more complicated. It
>>>> should work by getting the slice and then projecting it onto the clone
>>>> coordinate system:
>>>>
>>>> # Slice covering just the misc feature
>>>> my $subSlice = $misc_clone->feature_Slice;
>>>> # project() returns a reference to a list of ProjectionSegment objects
>>>> my $projectionSegment = $subSlice->project('clone');
>>>>
>>>> Cheers
>>>> Thibaut
>>>>
>>>> On 30 Jun 2015, at 13:56, Duarte Molha <duartemolha at gmail.com> wrote:
>>>>
>>>> Never mind... after searching for MiscFeature information I found the
>>>> relevant part in the API tutorial.
>>>>
>>>> Just for reference, for anyone who has the same difficulties, here is the
>>>> relevant portion of the code I used
>>>> (please let me know if there is something I did wrong, Thibaut):
>>>>
>>>> my $mf_adaptor = $registry->get_adaptor( 'Human', 'Core', 'MiscFeature' );
>>>>
>>>> # Read the list of clone names, one per line
>>>> open( my $in, '<', $options->{list} )
>>>>     or die "Could not open $options->{list} for reading: $!\n";
>>>> my @input_queries = <$in>;
>>>> close $in;
>>>>
>>>> foreach my $query (@input_queries) {
>>>>     chomp $query;
>>>>     # Fetch every MiscFeature whose 'clone_name' attribute matches the query
>>>>     my $clones = $mf_adaptor->fetch_all_by_attribute_type_value( 'clone_name', $query );
>>>>
>>>>     foreach my $clone ( @{$clones} ) {
>>>>         # start/end are relative to the slice the feature sits on
>>>>         print join( "\t", 'chr' . $clone->slice->seq_region_name, $clone->start, $clone->end, $query ), "\n";
>>>>     }
>>>> }
>>>>
>>>>
>>>> Best regards
>>>>
>>>> Duarte
>>>>
>>>>
>>>> On 30 June 2015 at 13:26, Duarte Molha <duartemolha at gmail.com> wrote:
>>>>
>>>>> Many thanks Thibaut
>>>>>
>>>>> So, regarding your question:
>>>>>
>>>>> How can I query a specific clone and get its correct coordinates if I
>>>>> know the clone ID?
>>>>>
>>>>> For example, assuming this clone:
>>>>>  RP11-100N21
>>>>>
>>>>> In other words, how do I query the underlying clone dataset and
>>>>> output those clones in genomic coordinates?
>>>>>
>>>>> Many thanks
>>>>>
>>>>> Duarte
>>>>>
>>>>> On 30 June 2015 at 13:15, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
>>>>>
>>>>>> Hi Duarte,
>>>>>> The clone names are stored in the misc_* tables, so you need to use
>>>>>> the MiscFeatureAdaptor
>>>>>> (http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1DBSQL_1_1MiscFeatureAdaptor.html):
>>>>>>
>>>>>> # $slice is the region of interest, $mfa a MiscFeatureAdaptor
>>>>>> my $misc_clones = $mfa->fetch_all_by_Slice_and_set_code($slice, 'tilepath');
>>>>>> foreach my $clone (@$misc_clones) {
>>>>>>   print join("\t", $clone->slice->seq_region_name, $clone->start, $clone->end,
>>>>>>              @{$clone->get_all_attribute_values('name')}), "\n";
>>>>>> }
>>>>>>
>>>>>> A warning though: this is the tilepath, so the boundaries of the
>>>>>> clones are different from those of the contigs/clones in the assembly, as
>>>>>> sometimes the entire clone was not used for the assembly.
>>>>>>
>>>>>> Hope this helps
>>>>>>
>>>>>> Thibaut
>>>>>>
>>>>>> > On 30 Jun 2015, at 11:50, Duarte Molha <duartemolha at gmail.com> wrote:
>>>>>> >
>>>>>> > I used this code to get all the genomic coordinates of your subcontigs:
>>>>>> >
>>>>>> >
>>>>>> > my @slices = @{ $slice_adaptor->fetch_all('clone') };
>>>>>> > foreach my $slice (@slices){
>>>>>> >       $progress->update();
>>>>>> >       my $clone_name =  $slice->seq_region_name();
>>>>>> >       my $projection = $slice->project('toplevel');
>>>>>> >       foreach my $segment ( @{$projection} ) {
>>>>>> >               my $to_slice = $segment->to_Slice();
>>>>>> >               print join "\t", ("chr".$to_slice->seq_region_name(), $to_slice->start(), $to_slice->end(), $clone_name."\n");
>>>>>> >       }
>>>>>> > }
>>>>>> >
>>>>>> > However, by doing this, the database does not fetch the original clone name.
>>>>>> >
>>>>>> > For example, using this script I get:
>>>>>> > chr4    47567235        47733411        AC092597.1
>>>>>> >
>>>>>> > However I would like to get :
>>>>>> >
>>>>>> > chr4    47567235        47733411        RP11-100N21
>>>>>> >
>>>>>> > Can someone explain what I am doing wrong?
>>>>>> >
>>>>>> > Thanks
>>>>>> >
>>>>>> > Duarte
>>>>>> >
>>>>>> >
>>>>>> > On 30 June 2015 at 09:45, Duarte Molha <duartemolha at gmail.com> wrote:
>>>>>> > Dear devs
>>>>>> >
>>>>>> > How can I search for a specific clone ID present on your tilepath?
>>>>>> >
>>>>>> > For example, this one: RP5-892C22
>>>>>> >
>>>>>> > I would like to use the Perl API if possible.
>>>>>> >
>>>>>> > Many thanks
>>>>>> >
>>>>>> > Duarte
>>>>>> >
>>>>>> >
>>>>>> > _______________________________________________
>>>>>> > Dev mailing list    Dev at ensembl.org
>>>>>> > Posting guidelines and subscribe/unsubscribe info:
>>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>>> > Ensembl Blog: http://www.ensembl.info/