[ensembl-dev] accessing the tilepath entries programmatically
Duarte Molha
duartemolha at gmail.com
Tue Jun 30 16:29:00 BST 2015
Thibaut, could you expand on how I can change my script to make it work
with the new assembly?
I have just realised that the reason I am not getting 60 BAC entries is
that they are only present in GRCh38 and not in GRCh37.
Can you tell me how I can modify my script to work with the new assembly?
I don't understand the projection method you are using.
Here is the relevant part of my script:

my $mf_adaptor = $registry->get_adaptor( 'Human', 'Core', 'MiscFeature' );

# Read the list of clone names, one per line.
open( my $in, '<', $options->{list} )
    or die "Could not open $options->{list} for reading: $!\n";
my @input_queries = <$in>;
close $in;

foreach my $query (@input_queries) {
    chomp $query;
    # Fetch the misc features whose 'clone_name' attribute matches the query.
    my $clones =
        $mf_adaptor->fetch_all_by_attribute_type_value( 'clone_name', $query );
    foreach my $clone ( @{$clones} ) {
        my $slice = $clone->slice();
        print join( "\t", 'chr' . $slice->seq_region_name(),
            $clone->start(), $clone->end(), $query ), "\n";
    }
}
Best regards
Duarte
=========================
Duarte Miguel Paulo Molha
http://about.me/duarte
=========================
On 30 June 2015 at 15:46, Duarte Molha <duartemolha at gmail.com> wrote:
> No, that does not return anything.
>
>
>
> =========================
> Duarte Miguel Paulo Molha
> http://about.me/duarte
> =========================
>
> On 30 June 2015 at 14:50, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
>
>> If you use name instead of clone_name, does it fetch the missing one?
>>
>> Cheers
>> Thibaut
>>
>> On 30 Jun 2015, at 14:27, Duarte Molha <duartemolha at gmail.com> wrote:
>>
>> Yes, I am using GRCh37, Thibaut, so I am OK for now, but it is
>> good to know this does not work with the latest assembly.
>> However, can you please answer my question regarding the missing clones
>> like RP11-155D3? Why can I not fetch it when it is clearly in the
>> database?
>>
>> Thanks
>>
>> Duarte
>>
>>
>>
>> =========================
>> Duarte Miguel Paulo Molha
>> http://about.me/duarte
>> =========================
>>
>> On 30 June 2015 at 14:12, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
>>
>>> My first question should have been which assembly you are using.
>>>
>>> So yes, this will work for GRCh37. Unfortunately it will not work for
>>> GRCh38, but this is something that we will fix for release 82.
>>>
>>> For GRCh38 it is still possible, but more complicated. It should work
>>> by getting the feature slice and then projecting it onto the clone
>>> coordinate system:
>>>
>>> my $subSlice = $misc_clone->feature_Slice;
>>> # project() returns an array ref of projection segments
>>> my $projection = $subSlice->project('clone');
>>>
>>> Cheers
>>> Thibaut
>>>
>>> On 30 Jun 2015, at 13:56, Duarte Molha <duartemolha at gmail.com> wrote:
>>>
>>> Never mind... after searching for MiscFeature information I found the
>>> relevant part in the API tutorial.
>>>
>>> Just for reference, for anyone who has the same difficulties, here is the
>>> relevant portion of the code I used
>>> (please let me know if there is something I did wrong, Thibaut):
>>>
>>> my $mf_adaptor = $registry->get_adaptor( 'Human', 'Core', 'MiscFeature' );
>>>
>>> # Read the list of clone names, one per line.
>>> open( my $in, '<', $options->{list} )
>>>     or die "Could not open $options->{list} for reading: $!\n";
>>> my @input_queries = <$in>;
>>> close $in;
>>>
>>> foreach my $query (@input_queries) {
>>>     chomp $query;
>>>     my $clones =
>>>         $mf_adaptor->fetch_all_by_attribute_type_value( 'clone_name', $query );
>>>
>>>     foreach my $clone ( @{$clones} ) {
>>>         my $slice = $clone->slice();
>>>         print join( "\t", 'chr' . $slice->seq_region_name(),
>>>             $clone->start(), $clone->end(), $query ), "\n";
>>>     }
>>> }
>>>
>>>
>>> Best regards
>>>
>>> Duarte
>>>
>>> =========================
>>> Duarte Miguel Paulo Molha
>>> http://about.me/duarte
>>> =========================
>>>
>>> On 30 June 2015 at 13:26, Duarte Molha <duartemolha at gmail.com> wrote:
>>>
>>>> Many thanks Thibaut
>>>>
>>>> So... in regards to your question...
>>>>
>>>> How can I query a specific clone and its correct coordinates if I know
>>>> the clone ID?
>>>>
>>>> For example
>>>>
>>>> assuming this clone:
>>>> RP11-100N21
>>>>
>>>> In other words, how do I query the underlying clone dataset and output
>>>> those clones in genomic coordinates?
>>>>
>>>> Many thanks
>>>>
>>>> Duarte
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> =========================
>>>> Duarte Miguel Paulo Molha
>>>> http://about.me/duarte
>>>> =========================
>>>>
>>>> On 30 June 2015 at 13:15, Thibaut Hourlier <thibaut at ebi.ac.uk> wrote:
>>>>
>>>>> Hi Duarte,
>>>>> The clone names are stored in the misc_* tables, so you need to use
>>>>> the MiscFeatureAdaptor,
>>>>> http://www.ensembl.org/info/docs/Doxygen/core-api/classBio_1_1EnsEMBL_1_1DBSQL_1_1MiscFeatureAdaptor.html
>>>>> :
>>>>>
>>>>> # $slice is a Bio::EnsEMBL::Slice covering the region of interest
>>>>> my $misc_clones = $mfa->fetch_all_by_Slice_and_set_code($slice, 'tilepath');
>>>>> foreach my $clone (@$misc_clones) {
>>>>>     print join("\t", $clone->slice->seq_region_name, $clone->start,
>>>>>         $clone->end, @{$clone->get_all_attribute_values('name')}), "\n";
>>>>> }
>>>>>
>>>>> A warning though: this is the tilepath, so the boundaries of the clones
>>>>> can differ from those of the contigs/clones in the assembly, because
>>>>> the entire clone was not always used for the assembly.
>>>>>
>>>>> Hope this helps
>>>>>
>>>>> Thibaut
>>>>>
>>>>> > On 30 Jun 2015, at 11:50, Duarte Molha <duartemolha at gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > I used this code to get all the genomic coordinates of your
>>>>> subcontigs:
>>>>> >
>>>>> >
>>>>> > my @slices = @{ $slice_adaptor->fetch_all('clone') };
>>>>> > foreach my $slice (@slices) {
>>>>> >     $progress->update();
>>>>> >     my $clone_name = $slice->seq_region_name();
>>>>> >     my $projection = $slice->project('toplevel');
>>>>> >     foreach my $segment ( @{$projection} ) {
>>>>> >         my $to_slice = $segment->to_Slice();
>>>>> >         print join( "\t", "chr" . $to_slice->seq_region_name(),
>>>>> >             $to_slice->start(), $to_slice->end(), $clone_name ), "\n";
>>>>> >     }
>>>>> > }
>>>>> >
>>>>> > However, by doing this, the database does not fetch the original
>>>>> clone name
>>>>> >
>>>>> > for example.. using this script I get
>>>>> > chr4 47567235 47733411 AC092597.1
>>>>> >
>>>>> > However I would like to get :
>>>>> >
>>>>> > chr4 47567235 47733411 RP11-100N21
>>>>> >
>>>>> > Can someone explain what I am doing wrong?
>>>>> >
>>>>> > Thanks
>>>>> >
>>>>> > Duarte
>>>>> >
>>>>> >
>>>>> >
>>>>> > =========================
>>>>> > Duarte Miguel Paulo Molha
>>>>> > http://about.me/duarte
>>>>> > =========================
>>>>> >
>>>>> > On 30 June 2015 at 09:45, Duarte Molha <duartemolha at gmail.com>
>>>>> wrote:
>>>>> > Dear devs
>>>>> >
>>>>> > How can I search for a specific clone ID present in your tilepath?
>>>>> >
>>>>> > for example this: RP5-892C22
>>>>> >
>>>>> > I would like to use the perl API if possible
>>>>> >
>>>>> > Many thanks
>>>>> >
>>>>> > Duarte
>>>>> >
>>>>> >
>>>>> >
>>>>> > =========================
>>>>> > Duarte Miguel Paulo Molha
>>>>> > http://about.me/duarte
>>>>> > =========================
>>>>> >
>>>>> > _______________________________________________
>>>>> > Dev mailing list Dev at ensembl.org
>>>>> > Posting guidelines and subscribe/unsubscribe info:
>>>>> http://lists.ensembl.org/mailman/listinfo/dev
>>>>> > Ensembl Blog: http://www.ensembl.info/