[ensembl-dev] GenomicAlignBlock restriction

Benoît Ballester benoit.ballester at inserm.fr
Thu Jul 17 11:32:58 BST 2014


Thanks Stephen for the detailed reply. 
Happy to see that I finally got it right with GAB, GA, MLSS ;-)

Ben

On 17 Jul 2014, at 12:22, Stephen Fitzgerald wrote:

> Hi Ben, yes to all your questions.
> 
> fetch_all_by_MethodLinkSpeciesSet_Slice returns the full genomic_align_block.
> 
> restrict_between_reference_positions restricts the block using the reference genome coordinate positions supplied.
> 
> The $skip_empty_GenomicAligns=1 will skip empty alignments.
> 
> 
> Below is a small example script, the output of which is:
> 
> A---G
> -----
> AGAAG
> AGAAG
> A---G
> A---G
> 
> #####
> 
> A---G
> AGAAG
> AGAAG
> A---G
> A---G
> 
> 
> HTH,
> Stephen.
> 
> ####### example script
> use strict;
> use Bio::EnsEMBL::Registry;
> 
> Bio::EnsEMBL::Registry->load_registry_from_db(
>        -host=>"ensembldb.ensembl.org", -user=>"anonymous",
>        -port=>'5306');
> 
> my $genomic_align_block_adaptor =
>    Bio::EnsEMBL::Registry->get_adaptor(
>      "Multi", "compara", "GenomicAlignBlock");
> 
> my $method_link_species_set_adaptor =
>    Bio::EnsEMBL::Registry->get_adaptor(
>      "Multi", "compara", "MethodLinkSpeciesSet");
> 
> my $methodLinkSpeciesSet = $method_link_species_set_adaptor->
>        fetch_by_method_link_type_species_set_name("EPO", "primates");
> 
> my ($human_start, $human_end, $human_chr) = (30727319, 30727320, "X");
> 
> my $human_slice_adaptor =
>    Bio::EnsEMBL::Registry->get_adaptor(
>      "homo_sapiens", "core", "Slice");
> 
> my $slice = $human_slice_adaptor->fetch_by_region(
>    "chromosome", $human_chr, $human_start, $human_end);
> 
> my $all_genomic_align_blocks = $genomic_align_block_adaptor->
>    fetch_all_by_MethodLinkSpeciesSet_Slice(
>        $methodLinkSpeciesSet, $slice);
> 
> restrict(0);
> print "\n#####\n\n";
> restrict(1);
> 
> sub restrict{
> my($skip_empty) = shift;
> foreach my $genomic_align_block( @{ $all_genomic_align_blocks }) {
>  my $restricted_gab = $genomic_align_block->restrict_between_reference_positions($human_start, $human_end, undef, $skip_empty);
>  foreach my $genomic_align(@{ $restricted_gab->get_all_GenomicAligns }){
>   print $genomic_align->aligned_sequence, "\n";
>  }
> }
> }
> 
> #######
> 
> 
> 
> On Wed, 16 Jul 2014, Benoît Ballester wrote:
> 
>> Hi Compara/Ensembl,
>> 
>> I'd like to have confirmations about some of the GenomicAlignBlock API logic.
>> 
>> Given a human slice (300 to 400bp), I want to fetch a GAB to see if I could find an alignment across species. I want to know if that slice is aligneable in a couple of species.
>> 
>> 
>> I use $gab_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice to fetch the GenomicAlignBlock.
>> 
>> But this returns the entire GenomicAlign for the given block ! Right ?
>> 
>> 
>>   #-- Get GAB
>>   my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $seq_region_name, $seq_region_start, $seq_region_end );
>> 
>>   my $gabs = $gab_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($mlss, $slice);
>> 
>>   if (scalar @$gabs > 0){ #-- There is a GAB
>> 
>>       foreach my $gab (@$gabs) {
>>         #-- do something
>> 	  # but this will return the entire GenomicAlign
>> 	}
>>    }
>> 
>> 
>> So, correct me if I am wrong, I can restrict the GAB using the method restrict_between_reference_positions giving the start/end coordinates  of my initial slice, and the boolean $skip_empty_GenomicAligns ?
>> 
>> 
>> Once I have the GAB I restrict it :
>> 
>>        my $gab2 = $gab->restrict_between_reference_positions($rf->seq_region_start, $rf->seq_region_end, undef ,1);
>> 
>> 
>> Could you please confirm that what I am doing is correct.
>> I want to do that from a human slice, and find if there is an alignment for a couple of species. I want to skip empty alignments, or long gaps. If this the correct way ?
>> 
>> Thanks for the replies,
>> 
>> 
>> Ben
>> 
>> 
>> --
>> Benoît Ballester, PhD
>> Inserm U1090, TAGC
>> Campus de Luminy
>> 13288 Marseille Cedex 9
>> France
>> +33 4 91 82 87 39
>> 
>> 
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/

--
Benoît Ballester, PhD
Inserm U1090, TAGC
Campus de Luminy
13288 Marseille Cedex 9
France
+33 4 91 82 87 39





More information about the Dev mailing list