[ensembl-dev] accessing the tilepath entries programatically

Duarte Molha duartemolha at gmail.com
Tue Jun 30 16:29:00 BST 2015


Thibaut, could you expand on how I can change my script to make it work
with the new assembly?
I have just realised that the reason I am not getting 60 of the BAC entries
is that they are only present in GRCh38 and not in GRCh37.

I don't seem to understand the projection method you are using.
Here is the relevant part of my script

my $mf_adaptor = $registry->get_adaptor( 'Human', 'Core', 'MiscFeature' );

# Read one clone name per line from the input list.
open( my $in, '<', $options->{list} )
    or die "Could not open $options->{list} for reading: $!\n";
chomp( my @input_queries = <$in> );
close $in;

foreach my $query (@input_queries) {
    # Fetch every MiscFeature whose clone_name attribute matches the query.
    my $clones = $mf_adaptor->fetch_all_by_attribute_type_value( 'clone_name', $query );

    foreach my $clone ( @{$clones} ) {
        my $slice = $clone->slice();
        print join( "\t", 'chr' . $slice->seq_region_name(),
            $clone->start(), $clone->end(), $query ), "\n";
    }
}
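
For reference, a minimal sketch of the projection approach Thibaut suggests in the quoted reply below, written against the Ensembl Perl API. It is untested against GRCh38 and rests on several assumptions: the public server ensembldb.ensembl.org with the anonymous user, a 'tilepath' set code being present in the database queried, and a 'name' attribute on the features. The helper names format_line and report_region are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build one tab-separated output line: "chr"-prefixed region, start, end, label.
sub format_line {
    my ($region, $start, $end, $label) = @_;
    return join("\t", "chr$region", $start, $end, $label) . "\n";
}

# Print every tilepath MiscFeature on a region, then the clone-level
# segments that each feature projects onto.
sub report_region {
    my ($chr, $start, $end) = @_;

    # Loaded at run time so format_line() works without the Ensembl API.
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',    # assumption: public server
        -user => 'anonymous',
    );

    my $slice_adaptor = $registry->get_adaptor('Human', 'Core', 'Slice');
    my $mf_adaptor    = $registry->get_adaptor('Human', 'Core', 'MiscFeature');

    my $slice = $slice_adaptor->fetch_by_region('chromosome', $chr, $start, $end);
    die "No slice found for chromosome ${chr}:${start}-${end}\n" unless defined $slice;
    my $features = $mf_adaptor->fetch_all_by_Slice_and_set_code($slice, 'tilepath');

    foreach my $feature (@{$features}) {
        # Use the first 'name' attribute if there is one (assumption: it may be absent).
        my ($name) = @{ $feature->get_all_attribute_values('name') };
        print format_line($feature->seq_region_name, $feature->seq_region_start,
                          $feature->seq_region_end, $name // 'unnamed');

        # Project the feature's own slice onto the clone coordinate system.
        foreach my $segment (@{ $feature->feature_Slice->project('clone') }) {
            my $clone = $segment->to_Slice;
            print "  clone\t",
                join("\t", $clone->seq_region_name, $clone->start, $clone->end), "\n";
        }
    }
    return;
}

# Only contact the database when a region is given on the command line.
report_region(@ARGV) if @ARGV == 3;
```

Saved under a hypothetical name such as tilepath_projection.pl, it could be run as: perl tilepath_projection.pl 4 47500000 47800000. The region is only an illustration based on the chr4 example in this thread. Nothing in the thread confirms that the projected clone names will match the names in an input list.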


Best regards

Duarte

=========================
     Duarte Miguel Paulo Molha
         http://about.me/duarte
=========================

On 30 June 2015 at 15:46, Duarte Molha <duartemolha at gmail.com> wrote:

> No, that does not fetch anything.
>
> On 30 June 2015 at 14:50, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
>
>> If you use name instead of clone_name, does it fetch the missing one?
>>
>> Cheers
>> Thibaut
>>
>> On 30 Jun 2015, at 14:27, Duarte Molha <duartemolha at gmail.com> wrote:
>>
>> Yes, I am using GRCh37, Thibaut, so I am OK for now, but it is
>> good to know this does not work with the latest assembly.
>> However, can you please answer my question regarding the missing clones
>> like RP11-155D3? Why can I not fetch this when it is clearly in the
>> database?
>>
>> Thanks
>>
>> Duarte
>>
>> On 30 June 2015 at 14:12, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
>>
>>> My first question should have been which assembly are you using...
>>>
>>> So yes this will work for GRCh37. Unfortunately it will not work for
>>> GRCh38 but this is something that we will fix for release 82.
>>>
>>> So in the case of GRCh38, it is still possible but more complicated. It
>>> should work by getting the feature's slice and then projecting it onto the
>>> clone coordinate system:
>>>
>>> my $sub_slice  = $misc_clone->feature_Slice;
>>> my $projection = $sub_slice->project('clone');  # array ref of projection segments
>>>
>>> Cheers
>>> Thibaut
>>>
>>> On 30 Jun 2015, at 13:56, Duarte Molha <duartemolha at gmail.com> wrote:
>>>
>>> Never mind... after searching for MiscFeature information I found the
>>> relevant part in the API tutorial.
>>>
>>> Just for reference, for anyone who has the same difficulties, here is the
>>> relevant portion of the code I used
>>> (please let me know if there is something I did wrong, Thibaut):
>>>
>>> my $mf_adaptor = $registry->get_adaptor( 'Human', 'Core', 'MiscFeature' );
>>>
>>> open( my $in, '<', $options->{list} )
>>>     or die "Could not open $options->{list} for reading: $!\n";
>>> chomp( my @input_queries = <$in> );
>>> close $in;
>>>
>>> foreach my $query (@input_queries) {
>>>     my $clones = $mf_adaptor->fetch_all_by_attribute_type_value( 'clone_name', $query );
>>>
>>>     foreach my $clone ( @{$clones} ) {
>>>         my $slice = $clone->slice();
>>>         print join( "\t", 'chr' . $slice->seq_region_name(),
>>>             $clone->start(), $clone->end(), $query ), "\n";
>>>     }
>>> }
>>>
>>>
>>> Best regards
>>>
>>> Duarte
>>>
>>> On 30 June 2015 at 13:26, Duarte Molha <duartemolha at gmail.com> wrote:
>>>
>>>> Many thanks Thibaut
>>>>
>>>> So, with regard to your question:
>>>>
>>>> How can I query a specific clone and get its correct coordinates if I know
>>>> the clone ID? For example, assuming this clone:
>>>>  RP11-100N21
>>>>
>>>> In other words, how do I query the underlying clone dataset and output
>>>> those clones in genomic coordinates?
>>>>
>>>> Many thanks
>>>>
>>>> Duarte
>>>>
>>>> On 30 June 2015 at 13:15, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
>>>>
>>>>> Hi Duarte,
>>>>> The clone names are stored in the misc_* tables, so you need to use
>>>>> the MiscFeatureAdaptor
>>>>> (http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1DBSQL_1_1MiscFeatureAdaptor.html):
>>>>>
>>>>> # $slice is the region to search, e.g. a whole chromosome
>>>>> my $misc_clones = $mfa->fetch_all_by_Slice_and_set_code($slice, 'tilepath');
>>>>> foreach my $clone (@$misc_clones) {
>>>>>   print join("\t", $clone->slice->seq_region_name, $clone->start, $clone->end,
>>>>>     @{$clone->get_all_attribute_values('name')}), "\n";
>>>>> }
>>>>>
>>>>> A warning though: this is the tilepath, so the boundaries of the clones
>>>>> differ from those of the contigs/clones in the assembly, because the
>>>>> entire clone was not always used for the assembly.
>>>>>
>>>>> Hope this helps
>>>>>
>>>>> Thibaut
>>>>>
>>>>> > On 30 Jun 2015, at 11:50, Duarte Molha <duartemolha at gmail.com> wrote:
>>>>> >
>>>>> > I used this code to get all the genomic coordinates of your subcontigs:
>>>>> >
>>>>> >
>>>>> > my @slices = @{ $slice_adaptor->fetch_all('clone') };
>>>>> > foreach my $slice (@slices){
>>>>> >       $progress->update();
>>>>> >       my $clone_name =  $slice->seq_region_name();
>>>>> >       my $projection = $slice->project('toplevel');
>>>>> >       foreach my $segment ( @{$projection} ) {
>>>>> >               my $to_slice = $segment->to_Slice();
>>>>> >               print join "\t", ("chr".$to_slice->seq_region_name(),
>>>>> >                   $to_slice->start(), $to_slice->end(), $clone_name."\n");
>>>>> >       }
>>>>> > }
>>>>> >
>>>>> > However, doing this does not return the original clone name.
>>>>> >
>>>>> > For example, using this script I get:
>>>>> > chr4    47567235        47733411        AC092597.1
>>>>> >
>>>>> > However, I would like to get:
>>>>> >
>>>>> > chr4    47567235        47733411        RP11-100N21
>>>>> >
>>>>> > Can someone explain what I am doing wrong?
>>>>> >
>>>>> > Thanks
>>>>> >
>>>>> > Duarte
>>>>> >
>>>>> > On 30 June 2015 at 09:45, Duarte Molha <duartemolha at gmail.com> wrote:
>>>>> > Dear devs
>>>>> >
>>>>> > How can I search for a specific clone ID present on your tilepath,
>>>>> > for example RP5-892C22?
>>>>> >
>>>>> > I would like to use the Perl API if possible.
>>>>> >
>>>>> > Many thanks
>>>>> >
>>>>> > Duarte
>>>>> >
>>>>> > _______________________________________________
>>>>> > Dev mailing list    Dev at ensembl.org
>>>>> > Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>>>> > Ensembl Blog: http://www.ensembl.info/