[ensembl-dev] GenomicAlignBlock restriction
Benoît Ballester
benoit.ballester at inserm.fr
Thu Jul 17 11:32:58 BST 2014
Thanks Stephen for the detailed reply.
Happy to see that I finally got it right with GAB, GA, MLSS ;-)
Ben
On 17 Jul 2014, at 12:22, Stephen Fitzgerald wrote:
> Hi Ben, yes to all your questions.
>
> fetch_all_by_MethodLinkSpeciesSet_Slice returns the full genomic_align_block.
>
> restrict_between_reference_positions restricts the block using the reference genome coordinate positions supplied.
>
> The $skip_empty_GenomicAligns=1 will skip empty alignments.
>
>
> Below is a small example script, the output of which is:
>
> A---G
> -----
> AGAAG
> AGAAG
> A---G
> A---G
>
> #####
>
> A---G
> AGAAG
> AGAAG
> A---G
> A---G
>
>
> HTH,
> Stephen.
>
> ####### example script
> use strict;
> use Bio::EnsEMBL::Registry;
>
> Bio::EnsEMBL::Registry->load_registry_from_db(
> -host=>"ensembldb.ensembl.org", -user=>"anonymous",
> -port=>'5306');
>
> my $genomic_align_block_adaptor =
> Bio::EnsEMBL::Registry->get_adaptor(
> "Multi", "compara", "GenomicAlignBlock");
>
> my $method_link_species_set_adaptor =
> Bio::EnsEMBL::Registry->get_adaptor(
> "Multi", "compara", "MethodLinkSpeciesSet");
>
> my $methodLinkSpeciesSet = $method_link_species_set_adaptor->
> fetch_by_method_link_type_species_set_name("EPO", "primates");
>
> my ($human_start, $human_end, $human_chr) = (30727319, 30727320, "X");
>
> my $human_slice_adaptor =
> Bio::EnsEMBL::Registry->get_adaptor(
> "homo_sapiens", "core", "Slice");
>
> my $slice = $human_slice_adaptor->fetch_by_region(
> "chromosome", $human_chr, $human_start, $human_end);
>
> my $all_genomic_align_blocks = $genomic_align_block_adaptor->
> fetch_all_by_MethodLinkSpeciesSet_Slice(
> $methodLinkSpeciesSet, $slice);
>
> restrict(0);
> print "\n#####\n\n";
> restrict(1);
>
> sub restrict{
> my($skip_empty) = shift;
> foreach my $genomic_align_block( @{ $all_genomic_align_blocks }) {
> my $restricted_gab = $genomic_align_block->restrict_between_reference_positions($human_start, $human_end, undef, $skip_empty);
> foreach my $genomic_align(@{ $restricted_gab->get_all_GenomicAligns }){
> print $genomic_align->aligned_sequence, "\n";
> }
> }
> }
>
> #######
>
>
>
> On Wed, 16 Jul 2014, Benoît Ballester wrote:
>
>> Hi Compara/Ensembl,
>>
>> I'd like to have confirmations about some of the GenomicAlignBlock API logic.
>>
>> Given a human slice (300 to 400bp), I want to fetch a GAB to see if I could find an alignment across species. I want to know if that slice is aligneable in a couple of species.
>>
>>
>> I use $gab_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice to fetch the GenomicAlignBlock.
>>
>> But this returns the entire GenomicAlign for the given block ! Right ?
>>
>>
>> #-- Get GAB
>> my $slice = $slice_adaptor->fetch_by_region( 'chromosome', $seq_region_name, $seq_region_start, $seq_region_end );
>>
>> my $gabs = $gab_adaptor->fetch_all_by_MethodLinkSpeciesSet_Slice($mlss, $slice);
>>
>> if (scalar @$gabs > 0){ #-- There is a GAB
>>
>> foreach my $gab (@$gabs) {
>> #-- do something
>> # but this will return the entire GenomicAlign
>> }
>> }
>>
>>
>> So, correct me if I am wrong, I can restrict the GAB using the method restrict_between_reference_positions giving the start/end coordinates of my initial slice, and the boolean $skip_empty_GenomicAligns ?
>>
>>
>> Once I have the GAB I restrict it :
>>
>> my $gab2 = $gab->restrict_between_reference_positions($rf->seq_region_start, $rf->seq_region_end, undef ,1);
>>
>>
>> Could you please confirm that what I am doing is correct.
>> I want to do that from a human slice, and find if there is an alignment for a couple of species. I want to skip empty alignments, or long gaps. If this the correct way ?
>>
>> Thanks for the replies,
>>
>>
>> Ben
>>
>>
>> --
>> Benoît Ballester, PhD
>> Inserm U1090, TAGC
>> Campus de Luminy
>> 13288 Marseille Cedex 9
>> France
>> +33 4 91 82 87 39
>>
>>
>> _______________________________________________
>> Dev mailing list Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
--
Benoît Ballester, PhD
Inserm U1090, TAGC
Campus de Luminy
13288 Marseille Cedex 9
France
+33 4 91 82 87 39
More information about the Dev
mailing list