[ensembl-dev] newbie with Ensembl API- alignments
Stephen Fitzgerald
stephenf at ebi.ac.uk
Mon Jan 13 15:01:55 GMT 2014
Hi Manuel, you can also try our REST service, which will give you the
option of accessing the data (output in JSON format) in a number of ways:
http://beta.rest.ensembl.org/documentation/info/genomic_alignment_block_region
for example (using human coords 19:7184302-7184344):
bash> curl 'http://beta.rest.ensembl.org/alignment/block/region/homo_sapiens/19:7184302-7184344:1?species_set_group=primates' -H 'Content-type:application/json'
Also, if you wish to use the Perl API, here is a script for retrieving the
alignments in phylip format:
bash> perl script.pl > out.phylip
################ script.pl
use strict;
use warnings;
use Data::Dumper;
use Bio::AlignIO;
use Bio::EnsEMBL::Registry;
# Auto-configure the registry
Bio::EnsEMBL::Registry->load_registry_from_db(
    -host       => "ensembldb.ensembl.org",
    -user       => "anonymous",
    -port       => 5306,
    -db_version => 74);

# Get the Compara Adaptor for MethodLinkSpeciesSets
my $method_link_species_set_adaptor =
    Bio::EnsEMBL::Registry->get_adaptor(
        "Multi", "compara", "MethodLinkSpeciesSet");
my $methodLinkSpeciesSet = $method_link_species_set_adaptor->
    fetch_by_method_link_type_species_set_name("EPO", "primates");

# Define the start and end positions for the alignment
my ($ref_start, $ref_end) = (7184302, 7184344);

# Get the reference species *core* Adaptor for Slices
my $ref_slice_adaptor =
    Bio::EnsEMBL::Registry->get_adaptor(
        "homo_sapiens", "core", "Slice");

# Get the slice corresponding to the region of interest
my $ref_slice = $ref_slice_adaptor->fetch_by_region(
    "chromosome", 19, $ref_start, $ref_end);

# Get the Compara Adaptor for GenomicAlignBlocks
my $genomic_align_block_adaptor =
    Bio::EnsEMBL::Registry->get_adaptor(
        "Multi", "compara", "GenomicAlignBlock");

# fetch_all_by_MethodLinkSpeciesSet_Slice() returns a reference to an
# array of GenomicAlignBlock objects (human is the reference species)
my $all_genomic_align_blocks = $genomic_align_block_adaptor->
    fetch_all_by_MethodLinkSpeciesSet_Slice(
        $methodLinkSpeciesSet, $ref_slice);

# Set up an AlignIO to format SimpleAlign output
my $alignIO = Bio::AlignIO->newFh(-interleaved => 0,
                                  -fh          => \*STDOUT,
                                  -format      => 'phylip',
                                  -idlength    => 30);

# Print the alignments, restricted to the region of interest
foreach my $genomic_align_block (@{ $all_genomic_align_blocks }) {
    my $restricted_gab =
        $genomic_align_block->restrict_between_reference_positions(
            $ref_start, $ref_end);
    print $alignIO $restricted_gab->get_SimpleAlign;
}
################ out.phylip
6 44
homo_sapiens/19 cccc-agacccacatccagaactcacttgctggaattcatcgtg
pongo_abelii/19 cccc-agacctacatccagaactcacttgctggaattcatcgtg
pan_troglodytes/19 cccc-agacccacaaccagaactcacttgctggaattcatcgtg
gorilla_gorilla/19 cccc-agacccacatctagaactcacttgctggaattcatcgtg
callithrix_jacchus/22 ccccaagacccacatgcaggactcacttgctggaattcatcgtg
macaca_mulatta/19 cccc-ggacccacatacagaactcacttgctggaattcatcgtg
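As a quick sanity check on output like the above, a minimal sketch in core Perl could read the alignment back and confirm that the number of sequences and the number of columns match the header line. This is not part of the script above: parse_phylip is a hypothetical helper written for illustration (it is not an Ensembl or BioPerl function), and it assumes the simple sequential layout shown here, with each record on a single line and the name separated from the sequence by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API or BioPerl): parse a
# sequential PHYLIP alignment in which each record sits on a single line.
sub parse_phylip {
    my ($text) = @_;
    my @lines = grep { /\S/ } split /\n/, $text;

    # Header line: number of sequences, then number of alignment columns
    my ($n_seqs, $n_cols) = split ' ', shift @lines;

    my %seq_for;
    foreach my $line (@lines) {
        # First token is the name; the rest is sequence, possibly
        # written as several whitespace-separated blocks
        my ($name, @chunks) = split ' ', $line;
        $seq_for{$name} = join '', @chunks;
    }
    return ($n_seqs, $n_cols, \%seq_for);
}

# The alignment from out.phylip above, embedded so the sketch is self-contained
my $phylip = <<'END';
6 44
homo_sapiens/19 cccc-agacccacatccagaactcacttgctggaattcatcgtg
pongo_abelii/19 cccc-agacctacatccagaactcacttgctggaattcatcgtg
pan_troglodytes/19 cccc-agacccacaaccagaactcacttgctggaattcatcgtg
gorilla_gorilla/19 cccc-agacccacatctagaactcacttgctggaattcatcgtg
callithrix_jacchus/22 ccccaagacccacatgcaggactcacttgctggaattcatcgtg
macaca_mulatta/19 cccc-ggacccacatacagaactcacttgctggaattcatcgtg
END

my ($n_seqs, $n_cols, $seq_for) = parse_phylip($phylip);

# Every sequence must be present and span the declared number of columns
die "expected $n_seqs sequences\n" unless keys(%$seq_for) == $n_seqs;
foreach my $name (sort keys %$seq_for) {
    my $len = length $seq_for->{$name};
    die "$name has $len columns, expected $n_cols\n" unless $len == $n_cols;
}
print "OK: $n_seqs sequences of $n_cols columns\n";
```

For a real run you would read out.phylip from disk instead of the embedded text. Sequences that wrap over several lines would need extra handling that this sketch leaves out.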
Cheers,
Stephen.
On Mon, 13 Jan 2014, Emily Pritchard wrote:
> Hi Manuel
>
> Have you seen our API course on EBI Train Online:
> http://www.ebi.ac.uk/training/online/course/ensembl-filmed-api-workshop
>
> The Core section will introduce you to the API itself, and the Compara module covers alignments.
>
> Hope this helps
>
> Emily
>
> On 13/01/2014 13:32, Manuel Rodríguez Pascual wrote:
> I am just starting to work with the Ensembl API. I am not very experienced with either Ensembl or bioinformatics itself, so I am stuck
> on a problem that seems easy to solve.
>
>
> As a test and proof of concept, I would like to retrieve the information from a comparison in phylip format. In particular, I want the provided
> example
> http://www.ensembl.org/Homo_sapiens/Gene/Compara_Alignments?align=548&db=core&g=ENSG00000171105&r=19%3A7112266-7294045
>
> as exported in phylip format:
>
> http://www.ensembl.org/Homo_sapiens/Export/Output/Location/Alignment?align=548;db=core;g=ENSG00000171105;output=alignment;r=19:7112266-7294045;format=phylip;_format=Text
>
> However, I have seen that using wget is discouraged and that the Ensembl API should be used instead.
>
> Reading the available documentation, I feel that I should use the Compara API Tutorial,
> http://www.ensembl.org/info/docs/api/compara/compara_tutorial.html
>
> but I don't really understand it or how to apply it to my objective.
>
> The question is: can anyone advise me on how to proceed?
>
>
>
> Thanks for your attention,
>
>
>
> Manuel
>
>
> --
> Dr. Manuel Rodríguez-Pascual
> skype: manuel.rodriguez.pascual
> phone: (+34) 913466173 // (+34) 679925108
>
> CIEMAT-Moncloa
> Edificio 22, desp. 1.25
> Avenida Complutense, 40
> 28040- MADRID
> SPAIN
>
>
>
>
> --
> Dr Emily Pritchard
> Ensembl Outreach Officer
>
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge
> CB10 1SD
> UK
>
>