[ensembl-dev] Ensembl API 74: Fetch repeats identified by repeatmasker

swaraj basu projectbasu at gmail.com
Thu Feb 20 17:26:31 GMT 2014


I would also like to point out another issue with the repeat annotation of
the *Tetraodon* genome (API 74). I fetch the repeat name, class and type
using the following code:



    my @repeats = @{ $slice->get_all_RepeatFeatures() };
    foreach my $repeat (@repeats) {
        my $consensus    = $repeat->repeat_consensus();
        my $repeat_name  = $consensus->name();
        my $repeat_class = $consensus->repeat_class();
        my $repeat_type  = $consensus->repeat_type();
    }


I found that the class and type are not properly defined for *Tetraodon*;
they hold only generic values. For example:

    Organism     Name       Class        Type
    Tetraodon    TNDIRS1    Tet_repeat   Tetraodon repeats
    Fugu         DrDIRS1    LTR/DIRS1    LTRs

Thus, to classify repeat families in *Tetraodon* I cannot directly use the
information fetched from Ensembl, whereas for Fugu I can.
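
As a minimal sketch of the logic-name check suggested in the quoted reply below, the two helpers here select RepeatMasker analyses by name and format a feature as a BED line. Both helper names are hypothetical and not part of the Ensembl API, and the case-insensitive match is an assumption. Ensembl coordinates are 1-based and inclusive, while BED starts are 0-based and BED strands are written as + or -, so the sketch converts both. The demonstration uses made-up plain data, so no database connection is involved.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): true when a logic name
# follows the 'repeatmask%' convention. Case-insensitive matching is an
# assumption made here, not something the thread confirms.
sub is_repeatmasker_logic_name {
    my ($logic_name) = @_;
    return 0 unless defined $logic_name;
    return $logic_name =~ /^repeatmask/i ? 1 : 0;
}

# Hypothetical helper: format one BED line from 1-based inclusive
# coordinates. BED starts are 0-based, so only the start shifts by one.
sub bed_line {
    my (%f) = @_;
    my $strand = $f{strand} > 0 ? '+' : $f{strand} < 0 ? '-' : '.';
    my $score  = defined $f{score} ? $f{score} : 0;
    return join "\t", $f{chrom}, $f{start} - 1, $f{end}, $f{id}, $score, $strand;
}

# Plain-data demonstration with made-up values; no database is contacted.
my @rows = (
    { logic_name => 'repeatmask', chrom => '1', start => 101, end => 200,
      id => 'rep1', score => 250, strand => -1 },
    { logic_name => 'trf', chrom => '1', start => 301, end => 340,
      id => 'rep2', score => 60, strand => 0 },
);
for my $row (@rows) {
    # Skip anything that is not a RepeatMasker-style analysis.
    next unless is_repeatmasker_logic_name( $row->{logic_name} );
    print bed_line(%$row), "\n";
}
```

Inside the loop over repeat features, the check on $analysis->program() could then become
"next unless is_repeatmasker_logic_name( $repeat->analysis()->logic_name() );",
which does not depend on the program field being populated.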

-Regards


On Thu, Feb 20, 2014 at 5:58 PM, swaraj basu <projectbasu at gmail.com> wrote:

> Thanks for the useful information.
>
>
> On Thu, Feb 20, 2014 at 5:13 PM, mag <mr6 at ebi.ac.uk> wrote:
>
>>  Hi Swaraj,
>>
>> Unfortunately, the 'software' column for tetraodon is not populated.
>>
>> If you want a more reliable way of finding repeats from RepeatMasker, I
>> would recommend using the analysis logic name ($analysis->logic_name).
>> That field is compulsory, and any repeatmask-based analysis will be named
>> 'repeatmask%'.
>>
>>
>> Hope that helps,
>> Magali
>>
>>
>> On 20/02/2014 15:53, swaraj basu wrote:
>>
>>  Dear All,
>>
>>  I want to fetch the coordinates of repeat elements identified by the
>> RepeatMasker program in multiple species (in BED format). I am using the
>> following code to get the desired results:
>>
>>     # $csn and $srn are predefined by me
>>     my $slice = $slice_adaptor->fetch_by_region( $csn, $srn );
>>
>>     my @repeats = @{ $slice->get_all_RepeatFeatures() };
>>     foreach my $repeat (@repeats) {
>>         my $id       = $repeat->display_id();
>>         my $start    = $repeat->start();
>>         my $end      = $repeat->end();
>>         my $strand   = $repeat->strand();
>>         my $score    = $repeat->score();
>>         my $name     = $csn.$srn;
>>         my $analysis = $repeat->analysis();
>>         my $program  = $analysis->program();
>>         next unless $program eq "RepeatMasker";
>>         print "$name\t$start\t$end\t$id\t$score\t$strand\n";
>>     }
>>
>>  My code returns results for human, mouse, zebrafish and cavefish,
>> but for *Tetraodon* the value returned by $analysis->program() is undefined.
>> Hence I am unable to extract the RepeatMasker predictions for the
>> *Tetraodon* genome. Can someone please help?
>>
>>  -Regards
>>
>> Swaraj Basu
>>
>>
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
>>
>>
>
>
> --
> Swaraj Basu
> PhD Student (Bioinformatics - Functional Genomics)
> Animal Physiology and Evolution
> Stazione Zoologica Anton Dohrn
> Naples
>



-- 
Swaraj Basu
PhD Student (Bioinformatics - Functional Genomics)
Animal Physiology and Evolution
Stazione Zoologica Anton Dohrn
Naples