[ensembl-dev] GenomicAlignBlock restriction

Stephen Fitzgerald stephenf at ebi.ac.uk
Thu Jul 17 11:22:07 BST 2014


Hi Ben, yes to all your questions.

fetch_all_by_MethodLinkSpeciesSet_Slice returns the full 
genomic_align_block.

restrict_between_reference_positions restricts the block using the 
reference genome coordinate positions supplied.

The $skip_empty_GenomicAligns=1 will skip empty alignments.


Below is a small example script, the output of which is:

A---G
-----
AGAAG
AGAAG
A---G
A---G

#####

A---G
AGAAG
AGAAG
A---G
A---G


HTH,
Stephen.

####### example script
use strict;
use Bio::EnsEMBL::Registry;

Bio::EnsEMBL::Registry->load_registry_from_db(
         -host=>"ensembldb.ensembl.org", -user=>"anonymous",
         -port=>'5306');

my $genomic_align_block_adaptor =
     Bio::EnsEMBL::Registry->get_adaptor(
       "Multi", "compara", "GenomicAlignBlock");

my $method_link_species_set_adaptor =
     Bio::EnsEMBL::Registry->get_adaptor(
       "Multi", "compara", "MethodLinkSpeciesSet");

my $methodLinkSpeciesSet = $method_link_species_set_adaptor->
         fetch_by_method_link_type_species_set_name("EPO", "primates");

my ($human_start, $human_end, $human_chr) = (30727319, 30727320, "X");

my $human_slice_adaptor =
     Bio::EnsEMBL::Registry->get_adaptor(
       "homo_sapiens", "core", "Slice");

my $slice = $human_slice_adaptor->fetch_by_region(
     "chromosome", $human_chr, $human_start, $human_end);

my $all_genomic_align_blocks = $genomic_align_block_adaptor->
     fetch_all_by_MethodLinkSpeciesSet_Slice(
         $methodLinkSpeciesSet, $slice);

restrict(0);
print "\n#####\n\n";
restrict(1);

sub restrict{
  my($skip_empty) = shift;
  foreach my $genomic_align_block( @{ $all_genomic_align_blocks }) {
   my $restricted_gab = 
$genomic_align_block->restrict_between_reference_positions($human_start, 
$human_end, undef, $skip_empty);
   foreach my $genomic_align(@{ $restricted_gab->get_all_GenomicAligns }){
    print $genomic_align->aligned_sequence, "\n";
   }
  }
}

#######



On Wed, 16 Jul 2014, Benoît Ballester wrote:

> Hi Compara/Ensembl,
>
> I'd like to have confirmations about some of the GenomicAlignBlock API logic.
>
> Given a human slice (300 to 400bp), I want to fetch a GAB to see if I could find an alignment across species. I want to know if that slice is aligneable in a couple of species.
>
>
> I use $gab_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice to fetch the GenomicAlignBlock.
>
> But this returns the entire GenomicAlign for the given block ! Right ?
>
>
>    #-- Get GAB
>    my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $seq_region_name, $seq_region_start, $seq_region_end );
>
>    my $gabs = $gab_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($mlss, $slice);
>
>    if (scalar @$gabs > 0){ #-- There is a GAB
>
>        foreach my $gab (@$gabs) {
>          #-- do something
> 	  # but this will return the entire GenomicAlign
> 	}
>     }
>
>
> So, correct me if I am wrong, I can restrict the GAB using the method restrict_between_reference_positions giving the start/end coordinates  of my initial slice, and the boolean $skip_empty_GenomicAligns ?
>
>
> Once I have the GAB I restrict it :
>
>         my $gab2 = $gab->restrict_between_reference_positions($rf->seq_region_start, $rf->seq_region_end, undef ,1);
>
>
> Could you please confirm that what I am doing is correct.
> I want to do that from a human slice, and find if there is an alignment for a couple of species. I want to skip empty alignments, or long gaps. If this the correct way ?
>
> Thanks for the replies,
>
>
> Ben
>
>
> --
> Benoît Ballester, PhD
> Inserm U1090, TAGC
> Campus de Luminy
> 13288 Marseille Cedex 9
> France
> +33 4 91 82 87 39
>
>
> _______________________________________________
> Dev mailing list    Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>


More information about the Dev mailing list