[ensembl-dev] Another API question about Alignments between mouse / human

Eduardo Andrés León eandres at cnio.es
Mon Jan 28 10:27:14 GMT 2013


Dear all,
	I'm trying to map the mouse region (chromosome 2, 28403186-28403879) onto the human genome using Ensembl 67.

Using the web interface, I get the following:

http://may2012.archive.ensembl.org/Mus_musculus/Location/Compara_Alignments?align=410&db=core&r=2%3A28403186-28403879

mus_musculus:2 > 	chromosome:NCBIM37:2:28403186:28403879:1
homo_sapiens:9 > 	chromosome:GRCh37:9:100149814:100149843:-1
supercontig:GRCh37:GL000220.1:11378:11383:-1
chromosome:GRCh37:9:135978315:135979072:-1
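
The location strings above follow the Ensembl slice-name layout coord_system:assembly:seq_region:start:end:strand. A minimal Perl sketch, assuming that layout, splits each string and reports the fragment length so the web result is easier to compare with API output. The helper name parse_location is hypothetical and not part of the Ensembl API.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): split a slice name of the
# form coord_system:assembly:seq_region:start:end:strand into a hash reference.
sub parse_location {
    my ($name) = @_;
    my ($coord_system, $assembly, $seq_region, $start, $end, $strand) =
        split /:/, $name;
    return {
        coord_system => $coord_system,
        assembly     => $assembly,
        seq_region   => $seq_region,
        start        => $start,
        end          => $end,
        strand       => $strand,
        length       => $end - $start + 1,    # coordinates are inclusive
    };
}

# The three human fragments reported by the web view above.
my @human_fragments = (
    'chromosome:GRCh37:9:100149814:100149843:-1',
    'supercontig:GRCh37:GL000220.1:11378:11383:-1',
    'chromosome:GRCh37:9:135978315:135979072:-1',
);

for my $name (@human_fragments) {
    my $loc = parse_location($name);
    printf "%s\t%d-%d\tstrand %s\t%d bp\n",
        @{$loc}{qw(seq_region start end strand length)};
}
```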



But when I use the API, I obtain more than 55 fragments (attached as a zip file):



The code I use to extract the data is the following:

use strict;
use warnings;
use Bio::EnsEMBL::Registry;
use Bio::AlignIO;

getAlignMent(2,28403186,28403879);

	sub getAlignMent{
		my ($source_org_chr,$source_org_start,$source_org_end)=@_;

		#Auto-configure the registry
		Bio::EnsEMBL::Registry->load_registry_from_db(
			-host=>"ensembldb.cnio.es",
			-user=>"ensembl");


		# Get the Compara Adaptor for MethodLinkSpeciesSets
		my $method_link_species_set_adaptor =
		    Bio::EnsEMBL::Registry->get_adaptor(
		      "Multi", "compara", "MethodLinkSpeciesSet");

		# Get the MethodLinkSpeciesSet for mouse-human BLASTZ_NET alignments
		my $methodLinkSpeciesSet = $method_link_species_set_adaptor->
			fetch_by_method_link_type_registry_aliases("BLASTZ_NET", ["mouse", "human"]);

		# Get the source_org *core* Adaptor for Slices
		# (chromosome, start and end come from the sub's arguments)
		my $source_org_slice_adaptor =
		    Bio::EnsEMBL::Registry->get_adaptor(
		      "mouse", "core", "Slice");

		# Get the slice corresponding to the region of interest
		my $source_org_slice = $source_org_slice_adaptor->fetch_by_region(
		    "chromosome", $source_org_chr, $source_org_start, $source_org_end);

		# Get the Compara Adaptor for GenomicAlignBlocks
		my $genomic_align_block_adaptor =
		    Bio::EnsEMBL::Registry->get_adaptor(
		      "Multi", "compara", "GenomicAlignBlock");

		# fetch_all_by_MethodLinkSpeciesSet_Slice() returns a reference to an
		# array of GenomicAlignBlock objects (source_org is the reference species)
		my $all_genomic_align_blocks = $genomic_align_block_adaptor->
		    fetch_all_by_MethodLinkSpeciesSet_Slice(
		        $methodLinkSpeciesSet, $source_org_slice, undef, undef, "restrict");

		# Set up an AlignIO to format SimpleAlign output
		my $outputAl="alignment." . rand(10000) . ".txt";
		open(my $out, ">", $outputAl) || die "Cannot open $outputAl: $!\n";
		my $alignIO = Bio::AlignIO->newFh(-interleaved => 0,
		                                  -fh => $out,
		                                  -format => 'pfam',
		                                  -idlength => 20);

		# Print the restricted alignments, or log the region if nothing aligned
		if (scalar(@{$all_genomic_align_blocks})==0){
			close $out;
			my $nmrFile = "chr${source_org_chr}_No_mapping_regions.txt";
			open(my $nmr, ">>", $nmrFile) || die "Cannot open $nmrFile: $!\n";
			print $nmr "$source_org_chr\t$source_org_start\t$source_org_end\n";
			close $nmr;
			return();
		}
		else{
			foreach my $genomic_align_block ( @{ $all_genomic_align_blocks } ) {
				print $alignIO $genomic_align_block->get_SimpleAlign;
			}
			close $out;
		}
	}
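
To compare the API result with the web view, it can help to list the coordinates of every aligned fragment instead of only writing the alignment text. The following is a minimal sketch, not a confirmed fix for the discrepancy: the sub name block_coordinates is hypothetical, and it assumes each block exposes reference_genomic_align and get_all_non_reference_genomic_aligns, and that each GenomicAlign exposes dnafrag, dnafrag_start, dnafrag_end and dnafrag_strand, as in the Ensembl Compara Perl API.

```perl
use strict;
use warnings;

# Hypothetical helper: given the array reference returned by
# fetch_all_by_MethodLinkSpeciesSet_Slice(), return a reference to an array
# with one tab-separated line (species, region, start, end, strand) per
# aligned fragment.
sub block_coordinates {
    my ($blocks) = @_;
    my @lines;
    for my $block (@{$blocks}) {
        # Reference fragment first (mouse, when a mouse slice was queried),
        # followed by the non-reference fragments of the same block.
        my @aligns = (
            $block->reference_genomic_align,
            @{ $block->get_all_non_reference_genomic_aligns },
        );
        for my $align (@aligns) {
            my $dnafrag = $align->dnafrag;
            push @lines, join("\t",
                $dnafrag->genome_db->name,
                $dnafrag->name,
                $align->dnafrag_start,
                $align->dnafrag_end,
                $align->dnafrag_strand);
        }
    }
    return \@lines;
}
```

Inside getAlignMent this could be called as `print "$_\n" for @{ block_coordinates($all_genomic_align_blocks) };` to see which human regions each of the returned blocks covers.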

The same happens with other segments, but not all of them.

So, can anybody tell me how to extract the same records that the web shows?

Regards and thanks in advance !




-------------- next part --------------
A non-text attachment was scrubbed...
Name: alignment.3012.17691580863.txt.zip
Type: application/zip
Size: 3778 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20130128/aa93ffb0/attachment.zip>