[ensembl-dev] Another API question about Alignments between mouse / human
Eduardo Andrés León
eandres at cnio.es
Mon Jan 28 16:09:45 GMT 2013
Exactly!
Thank you very much for your help!
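
In case it is useful to anyone finding this thread, the colon-separated lines quoted below can be split back into fields for further processing. This is only a minimal sketch: parse_coord_line is a made-up helper name (not part of the Ensembl API), and it assumes the genome:seq_region:start:end:strand layout shown below, with no colons inside the seq_region name.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): split one
# "genome:seq_region:start:end:strand" line into a hash reference.
# Assumes that seq_region names never contain a colon.
sub parse_coord_line {
    my ($line) = @_;
    chomp $line;
    my ($genome, $region, $start, $end, $strand) = split /:/, $line;
    return {
        genome => $genome,
        region => $region,
        start  => $start,
        end    => $end,
        strand => $strand,
        # Ensembl coordinates are 1-based and inclusive
        length => $end - $start + 1,
    };
}

# The four lines from the output quoted in this thread
my @lines = (
    "mus_musculus:2:28403186:28403879:1",
    "homo_sapiens:9:100149814:100149843:-1",
    "homo_sapiens:GL000220.1:11378:11383:-1",
    "homo_sapiens:9:135978315:135979072:-1",
);

# Print one tab-separated row per fragment, with its length appended
foreach my $line (@lines) {
    my $coord = parse_coord_line($line);
    print join("\t", @{$coord}{qw(genome region start end strand length)}), "\n";
}
```

Keeping the fields in a hash makes it easy to filter by genome or strand later on, but a plain split into a list would do just as well for a one-off script.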
Regards
On 28 Jan 2013, at 17:03, "Kathryn Beal" <kbeal at ebi.ac.uk> wrote:
> Hi,
> How about:
>
> foreach my $aln_slice (@{$align_slice->get_all_Slices()}) {
>     my $slices = $aln_slice->get_all_underlying_Slices;
>     foreach my $this_slice (@$slices) {
>         print $aln_slice->genome_db->name . ":" . $this_slice->seq_region_name . ":" . $this_slice->start . ":" . $this_slice->end . ":" . $this_slice->strand . "\n";
>     }
> }
>
> This gives the following output:
>
> mus_musculus:2:28403186:28403879:1
> homo_sapiens:9:100149814:100149843:-1
> homo_sapiens:GL000220.1:11378:11383:-1
> homo_sapiens:9:135978315:135979072:-1
>
> Cheers
> Kathryn
>
> Kathryn Beal, PhD
> European Bioinformatics Institute (EMBL-EBI)
> Wellcome Trust Genome Campus, Hinxton
> Cambridge CB10 1SD, UK
> Tel. +44 (0)1223 494458
> www.ensembl.org
>
> On 28 Jan 2013, at 15:40, Eduardo Andrés León wrote:
>
>> Umm, but now I've lost the coordinates (which I really need).
>>
>>
>> On 28 Jan 2013, at 14:35, "Kathryn Beal" <kbeal at ebi.ac.uk> wrote:
>>
>>> Hi,
>>> You can use the AlignSlice to get the alignment, i.e. add the following lines:
>>>
>>> my $align_slice_adaptor =
>>> Bio::EnsEMBL::Registry->get_adaptor("Multi", "compara", "AlignSlice");
>>>
>>> my $align_slice = $align_slice_adaptor->fetch_by_Slice_MethodLinkSpeciesSet($source_org_slice, $methodLinkSpeciesSet, 'expanded', 'restrict');
>>>
>>> print $alignIO $align_slice->get_SimpleAlign;
>>>
>>> I also used "clustalw" as the format for AlignIO.
>>>
>>> I hope that helps,
>>> Cheers
>>> Kathryn
>>>
>>> On 28 Jan 2013, at 10:27, Eduardo Andrés León wrote:
>>>
>>>> Dear all,
>>>> I'm trying to map the mouse region 2:28403186-28403879 onto the human genome using Ensembl 67.
>>>>
>>>> Using the web interface, I get the following:
>>>>
>>>> http://may2012.archive.ensembl.org/Mus_musculus/Location/Compara_Alignments?align=410&db=core&r=2%3A28403186-28403879
>>>>
>>>> mus_musculus:2 > chromosome:NCBIM37:2:28403186:28403879:1
>>>> homo_sapiens:9 > chromosome:GRCh37:9:100149814:100149843:-1
>>>> supercontig:GRCh37:GL000220.1:11378:11383:-1
>>>> chromosome:GRCh37:9:135978315:135979072:-1
>>>>
>>>>
>>>>
>>>> But when I use the API, I get more than 55 fragments (attached as a zip file):
>>>>
>>>> <alignment.3012.17691580863.txt.zip>
>>>>
>>>> The code I use to extract the data is the following:
>>>>
>>>> use strict;
>>>> use warnings;
>>>> use Bio::EnsEMBL::Registry;
>>>> use Bio::AlignIO;
>>>>
>>>> getAlignMent(2, 28403186, 28403879);
>>>>
>>>> sub getAlignMent {
>>>>     my ($source_org_chr, $source_org_start, $source_org_end) = @_;
>>>>
>>>>     # Auto-configure the registry
>>>>     Bio::EnsEMBL::Registry->load_registry_from_db(
>>>>         -host => "ensembldb.cnio.es",
>>>>         -user => "ensembl");
>>>>
>>>>     # Get the Compara adaptor for MethodLinkSpeciesSets
>>>>     my $method_link_species_set_adaptor =
>>>>         Bio::EnsEMBL::Registry->get_adaptor(
>>>>             "Multi", "compara", "MethodLinkSpeciesSet");
>>>>
>>>>     # Get the MethodLinkSpeciesSet for the mouse-human BLASTZ_NET alignments
>>>>     my $methodLinkSpeciesSet = $method_link_species_set_adaptor->
>>>>         fetch_by_method_link_type_registry_aliases("BLASTZ_NET", ["mouse", "human"]);
>>>>
>>>>     # Get the source_org *core* adaptor for Slices
>>>>     my $source_org_slice_adaptor =
>>>>         Bio::EnsEMBL::Registry->get_adaptor(
>>>>             "mouse", "core", "Slice");
>>>>
>>>>     # Get the slice corresponding to the region of interest
>>>>     my $source_org_slice = $source_org_slice_adaptor->fetch_by_region(
>>>>         "chromosome", $source_org_chr, $source_org_start, $source_org_end);
>>>>
>>>>     # Get the Compara adaptor for GenomicAlignBlocks
>>>>     my $genomic_align_block_adaptor =
>>>>         Bio::EnsEMBL::Registry->get_adaptor(
>>>>             "Multi", "compara", "GenomicAlignBlock");
>>>>
>>>>     # fetch_all_by_MethodLinkSpeciesSet_Slice() returns a reference to an
>>>>     # array of GenomicAlignBlock objects (source_org is the reference species)
>>>>     my $all_genomic_align_blocks = $genomic_align_block_adaptor->
>>>>         fetch_all_by_MethodLinkSpeciesSet_Slice(
>>>>             $methodLinkSpeciesSet, $source_org_slice, undef, undef, "restrict");
>>>>
>>>>     # Set up an AlignIO to format SimpleAlign output
>>>>     my $outputAl = "alignment." . rand(10000) . ".txt";
>>>>     open(my $out_fh, '>', $outputAl) or die "Cannot open $outputAl: $!\n";
>>>>     my $alignIO = Bio::AlignIO->newFh(-interleaved => 0,
>>>>                                       -fh          => $out_fh,
>>>>                                       -format      => 'pfam',
>>>>                                       -idlength    => 20);
>>>>
>>>>     if (scalar(@{$all_genomic_align_blocks}) == 0) {
>>>>         # No alignment found: log the region and stop
>>>>         my $nmr_file = "chr${source_org_chr}_No_mapping_regions.txt";
>>>>         open(my $nmr_fh, '>>', $nmr_file) or die "Cannot open $nmr_file: $!\n";
>>>>         print $nmr_fh "$source_org_chr\t$source_org_start\t$source_org_end\n";
>>>>         close $nmr_fh;
>>>>         close $out_fh;
>>>>         return;
>>>>     }
>>>>     else {
>>>>         # Print the restricted alignments
>>>>         foreach my $genomic_align_block (@{$all_genomic_align_blocks}) {
>>>>             print $alignIO $genomic_align_block->get_SimpleAlign;
>>>>         }
>>>>         close $out_fh;
>>>>     }
>>>> }
>>>>
>>>> The same happens with some other segments, but not all of them.
>>>>
>>>> So, can anybody tell me how to extract the same records that the web interface shows?
>>>>
>>>> Regards and thanks in advance!
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Dev mailing list Dev at ensembl.org
>>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
===================================================
Eduardo Andrés León
Tlfn: (+34) 91 732 80 00 / 91 224 69 00 (ext 5054/3063)
e-mail: eandres at cnio.es Fax: (+34) 91 224 69 76
Unidad de Bioinformática Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas
C.P.: 28029 Zip Code: 28029
C/. Melchor Fernández Almagro, 3 Madrid (Spain)
http://bioinfo.cnio.es http://bioinfo.cnio.es/people/eandres
===================================================