[ensembl-dev] Obtaining 5'UTR sequences for a list of ensembl ids programmatically?
Benjamin Moore
bmoore at ebi.ac.uk
Wed Apr 24 10:13:43 BST 2024
Hi Allan,
No problem at all, very happy to help. With the GET sequence/region
endpoint, the region in the URL has to start with the chromosome
(sequence region) name rather than the assembly name. In the output of
the GET lookup endpoint, this is the value of the "seq_region_name" key.
In the example you provided, this is "1", so the URL should look like
this instead:
https://rest.ensembl.org/sequence/region/mus_musculus/1:59521583..59522118:1?content-type=text/x-fasta
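For a whole list of IDs, the two steps could be scripted roughly as
below. This is only a minimal sketch using the Python standard library,
not an official client: the function names (five_prime_utr_urls,
lookup_gene, fetch_fasta) are made up for illustration, and it assumes
the lookup output has the same "Transcript" / "UTR" layout as the JSON
you pasted.

```python
import json
import urllib.request

REST = "https://rest.ensembl.org"


def five_prime_utr_urls(gene):
    """Build one sequence/region URL per 5'UTR entry in an expanded lookup record.

    `gene` is the dict for a single gene, as returned by the lookup
    endpoint with expand=1;utr=1.
    """
    urls = []
    for transcript in gene.get("Transcript", []):
        for utr in transcript.get("UTR", []):
            if utr.get("type") != "five_prime_utr":
                continue  # skip 3'UTR entries
            # The region is <seq_region_name>:<start>..<end>:<strand>
            region = "{seq_region_name}:{start}..{end}:{strand}".format(**utr)
            urls.append(
                f"{REST}/sequence/region/{utr['species']}/{region}"
                "?content-type=text/x-fasta"
            )
    return urls


def lookup_gene(gene_id):
    """Fetch the expanded lookup record (with UTR coordinates) for one gene ID."""
    url = (
        f"{REST}/lookup/id/{gene_id}"
        "?content-type=application/json;expand=1;utr=1"
    )
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    # Assumption: some clients return the record nested under the ID,
    # as in the pasted output, so unwrap it if that key is present.
    return data.get(gene_id, data)


def fetch_fasta(url):
    """Download the FASTA text for one sequence/region URL."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode()
```

Looping over your IDs and calling fetch_fasta() on each URL returned by
five_prime_utr_urls(lookup_gene(gene_id)) should then give one FASTA
record per 5'UTR entry. A 5'UTR that is split across several exons may
show up as several entries per transcript, so those pieces would still
need to be joined in transcript order.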
Best wishes
Ben
On 23/04/2024 21:48, Allan Kamau wrote:
>
>
> On Tue, Apr 23, 2024 at 10:43 PM Allan Kamau <kamauallan at gmail.com> wrote:
>
>
>
> On Tue, Apr 23, 2024 at 6:37 PM Benjamin Moore <bmoore at ebi.ac.uk>
> wrote:
>
> Hi Allan,
>
> I think the most straightforward way to retrieve the 5'UTR
> sequences for a list of Ensembl features (I assume you have a
> list of gene IDs, ENSG...) using the REST API is to use the
> Lookup endpoints with the expand and utr optional parameters to
> retrieve the genomic coordinates of the 5'UTRs of each
> transcript for your list of genes:
>
> https://rest.ensembl.org/documentation/info/lookup
>
> Then, you can use the coordinates from the first step as the
> input for the Sequence/region endpoints to retrieve the genomic
> sequence of the 5' UTRs:
>
> https://rest.ensembl.org/documentation/info/sequence_region
>
> I hope this helps.
>
> Best wishes
>
> Ben
>
> On 23/04/2024 15:30, Allan Kamau wrote:
> > Is there a way to obtain 5'UTR sequences given a list of
> > Ensembl IDs programmatically?
> >
> > I have a list of Ensembl IDs for which I would like to obtain
> > the 5'UTR region for each one programmatically, ideally via
> > Ensembl REST using wget or Python (ensembl-rest).
> >
> > Kindly assist.
> >
> > Thanks.
> >
> > -Allan.
> >
> >
> >
> >
> > _______________________________________________
> > Dev mailing list Dev at ensembl.org
> > Posting guidelines and subscribe/unsubscribe info:
> https://lists.ensembl.org/mailman/listinfo/dev_ensembl.org
> > Ensembl Blog: http://www.ensembl.info/
>
> --
> Dr. Ben Moore (he/him)
> Ensembl Outreach Manager
>
> European Bioinformatics Institute (EMBL-EBI)
> European Molecular Biology Laboratory
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge
> CB10 1SD
> UK
>
> bmoore at ebi.ac.uk
> +44 (0)1223 494265
>
>
> Thank you Ben for your response. I am now stuck on constructing the
> URL for the https://rest.ensembl.org/sequence/region resource.
>
> I am using the Ensembl ID "ENSMUSG00000041075" in this example.
> The URL below returns the features of the gene with that ID,
> including its transcripts, exons and UTRs.
>
> https://rest.ensembl.org/lookup/id/ENSMUSG00000041075?content-type=application/json;expand=1;utr=1
>
> This returns the response below
>
> {
> "ENSMUSG00000041075": {
> "seq_region_name": "1",
> "logic_name": "ensembl_havana_gene_mus_musculus",
> "end": 59526114,
> "biotype": "protein_coding",
> "version": 9,
> "db_type": "core",
> "object_type": "Gene",
> "strand": 1,
> "start": 59521583,
> "canonical_transcript": "ENSMUST00000114246.4",
> "Transcript": [
> {
> "biotype": "protein_coding",
> "version": 4,
> "db_type": "core",
> "object_type": "Transcript",
> "seq_region_name": "1",
> "logic_name": "ensembl_havana_transcript_mus_musculus",
> "end": 59526114,
> "Translation": {
> "id": "ENSMUSP00000109884",
> "length": 572,
> "start": 59522119,
> "end": 59523837,
> "version": 3,
> "object_type": "Translation",
> "db_type": "core",
> "Parent": "ENSMUST00000114246",
> "species": "mus_musculus"
> },
> "assembly_name": "GRCm39",
> "Parent": "ENSMUSG00000041075",
> "is_canonical": 1,
> "display_name": "Fzd7-201",
> "Exon": [
> {
> "species": "mus_musculus",
> "version": 4,
> "db_type": "core",
> "object_type": "Exon",
> "assembly_name": "GRCm39",
> "id": "ENSMUSE00000698652",
> "start": 59521583,
> "end": 59526114,
> "strand": 1,
> "seq_region_name": "1"
> }
> ],
> "UTR": [
> {
> "object_type": "five_prime_UTR",
> "db_type": "core",
> "assembly_name": "GRCm39",
> "Parent": "ENSMUST00000114246",
> "type": "five_prime_utr",
> "species": "mus_musculus",
> "seq_region_name": "1",
> "strand": 1,
> "id": "ENSMUST00000114246",
> "source": "ensembl_havana",
> "start": 59521583,
> "end": 59522118
> },
> {
> "type": "three_prime_utr",
> "species": "mus_musculus",
> "db_type": "core",
> "object_type": "three_prime_UTR",
> "assembly_name": "GRCm39",
> "Parent": "ENSMUST00000114246",
> "id": "ENSMUST00000114246",
> "source": "ensembl_havana",
> "end": 59526114,
> "start": 59523838,
> "seq_region_name": "1",
> "strand": 1
> }
> ],
> "species": "mus_musculus",
> "strand": 1,
> "start": 59521583,
> "id": "ENSMUST00000114246",
> "source": "ensembl_havana",
> "length": 4532
> }
> ],
> "id": "ENSMUSG00000041075",
> "description": "frizzled class receptor 7 [Source:MGI Symbol;Acc:MGI:108570]",
> "source": "ensembl_havana",
> "assembly_name": "GRCm39",
> "display_name": "Fzd7",
> "species": "mus_musculus"
> }
> }
>
> What would be the form of the
> "https://rest.ensembl.org/sequence/region/" URL for the 5'UTR
> region given above?
>
> Below is the step where I am stuck.
> https://rest.ensembl.org/sequence/region/mus_musculus/<what_goes_here>:59522119..59523837:1?coord_system=seqlevel;content-type=text/x-fasta
>
> -Allan.
>
>
> What would be the url to obtain the five_prime_UTR region?
>
> I have tried the URL below but it finds no slice.
> https://rest.ensembl.org/sequence/region/mus_musculus/GRCm39:59521583..59522118:1?coord_system=seqlevel;content-type=text/x-fasta
>
>
> -Allan.
>
--
Dr. Ben Moore (he/him)
Ensembl Outreach Manager
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge
CB10 1SD
UK
bmoore at ebi.ac.uk
+44 (0)1223 494265