[ensembl-dev] (no subject)
Healy, Matthew
Matthew.Healy at bms.com
Mon Apr 2 20:19:29 BST 2012
Hmm, this is puzzling.
Here is the BIOMART URL:
http://useast.ensembl.org/biomart/martview/925acce10b0aceca635143e46ece03a0?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_gene_ensembl.default.feature_page.ensembl_gene_id|hsapiens_gene_ensembl.default.feature_page.ensembl_transcript_id|hsapiens_gene_ensembl.default.feature_page.ensembl_peptide_id|hsapiens_gene_ensembl.default.feature_page.chromosome_name|hsapiens_gene_ensembl.default.feature_page.start_position|hsapiens_gene_ensembl.default.feature_page.end_position|hsapiens_gene_ensembl.default.feature_page.transcript_start|hsapiens_gene_ensembl.default.feature_page.transcript_end&FILTERS=hsapiens_gene_ensembl.default.filters.ensembl_gene_id."ENSG00000003249"&VISIBLEPANEL=resultspanel
The following is copied and pasted from the results of that BioMart query; my only edit was adding newlines above and below the line in question:
Ensembl Gene ID  Ensembl Transcript ID  Ensembl Protein ID  Chromosome Name  Gene Start (bp)  Gene End (bp)  Transcript Start (bp)  Transcript End (bp)
ENSG00000003249  ENST00000304733        ENSP00000306407     16               90071273         90086536       90071273               90076529
ENSG00000003249  ENST00000002501        ENSP00000002501     16               90071273         90086536       90071273               90085881
ENSG00000003249  ENST00000566725                            16               90071273         90086536       90071281               90074169

ENSG00000003249  ENST00000392973        ENSP00000376699     16               90071273         90086536       90071281               90076619

ENSG00000003249  ENST00000568838        ENSP00000457625     16               90071273         90086536       90072597               90086536
ENSG00000003249  ENST00000568662                            16               90071273         90086536       90072627               90075480
ENSG00000003249  ENST00000568330        ENSP00000456573     16               90071273         90086536       90072899               90086288
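To check which attributes and filters the BioMart URL above actually requests, it can be decoded with the Python standard library. This is only a sketch: the helper name `decode_martview_url` is made up for illustration, and the assumption that several filters would be separated by "|" (as the attributes are) is a guess from the shape of this one URL, not documented martview behaviour.

```python
from urllib.parse import urlsplit, parse_qs

# The martview URL quoted above, split over several lines for readability.
MARTVIEW_URL = (
    'http://useast.ensembl.org/biomart/martview/925acce10b0aceca635143e46ece03a0'
    '?VIRTUALSCHEMANAME=default'
    '&ATTRIBUTES=hsapiens_gene_ensembl.default.feature_page.ensembl_gene_id'
    '|hsapiens_gene_ensembl.default.feature_page.ensembl_transcript_id'
    '|hsapiens_gene_ensembl.default.feature_page.ensembl_peptide_id'
    '|hsapiens_gene_ensembl.default.feature_page.chromosome_name'
    '|hsapiens_gene_ensembl.default.feature_page.start_position'
    '|hsapiens_gene_ensembl.default.feature_page.end_position'
    '|hsapiens_gene_ensembl.default.feature_page.transcript_start'
    '|hsapiens_gene_ensembl.default.feature_page.transcript_end'
    '&FILTERS=hsapiens_gene_ensembl.default.filters.ensembl_gene_id."ENSG00000003249"'
    '&VISIBLEPANEL=resultspanel'
)


def decode_martview_url(url):
    """Return (attribute names, {filter name: value}) for a martview URL.

    Hypothetical helper: it relies only on the layout visible in the URL
    above (dataset.interface.section.name, optionally followed by .value).
    """
    params = parse_qs(urlsplit(url).query)
    # Each attribute looks like dataset.interface.page.attribute_name.
    attributes = [item.rsplit(".", 1)[-1]
                  for item in params["ATTRIBUTES"][0].split("|")]
    filters = {}
    # Assumption: multiple filters, if present, are "|"-separated too.
    for item in params.get("FILTERS", [""])[0].split("|"):
        if not item:
            continue
        _dataset, _interface, _section, name, value = item.split(".", 4)
        filters[name] = value.strip('"')
    return attributes, filters


attributes, filters = decode_martview_url(MARTVIEW_URL)
print(attributes)
print(filters)
```

For this URL the decoded attribute list has eight entries, matching the eight columns of the table above, and the only filter is on ensembl_gene_id.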
Here is a URL from the ENSEMBL web site:
http://useast.ensembl.org/Homo_sapiens/Search/Details?species=Homo_sapiens;idx=Transcript;end=1;q=ENST00000392973
Following is copied and pasted from that URL:
1 Transcript matches your query ('ENST00000392973') in Human
DBNDD1-201 [ Ensembl: ENST00000392973 ]
Description: dysbindin (dystrobrevin binding protein 1) domain containing 1 [Source:HGNC Symbol;Acc:28455] [Type: protein coding Ensembl]
Location: 16:90071281-90086526:-1
Source: e66
________________________________________
From: dev-bounces at ensembl.org [dev-bounces at ensembl.org] On Behalf Of Andy Yates [ayates at ebi.ac.uk]
Sent: Monday, April 02, 2012 3:09 PM
To: Ensembl developers list
Cc: dev at ensembl.org
Subject: Re: [ensembl-dev] (no subject)
Hi Matt,
I've just run a query for this data and got the following row back
ENSG00000003249 ENST00000392973 90071273 -1 90071281 90076619
This seems to agree with the website, which gives ENST00000392973's coordinates as Chromosome 16: 90,071,281-90,076,619 reverse strand (http://www.ensembl.org/Homo_sapiens/Location/View?db=core;g=ENSG00000003249;r=16:90071281-90076619;t=ENST00000392973). There should be an option in BioMart to export your query as a URL; can you send this so we can see the query you are performing?
Best regards,
Andy
On 2 Apr 2012, at 18:39, "Healy, Matthew" <Matthew.Healy at bms.com> wrote:
I am new to BIOMART and the Ensembl Perl API, so probably I am just confused. I would be grateful for some enlightenment from those with more experience.
I am trying to map protein features into chromosomal nucleotide coordinates.
First I use the fetch_by_stable_id() method of the transcript adaptor to get a transcript object given its ENSTxxx identifier.
Then I use the get_all_ProteinFeatures() method of that transcript object to get all its protein features.
Then I use the pep2genomic method of Bio::EnsEMBL::TranscriptMapper to map these coordinates into nucleotide space.
Usually this works as I would expect it to work: if a protein domain feature spans multiple exons, then I get back multiple pairs of genomic coordinates.
When the domain overlaps the start or the end of the translation, I also get a gap object in transcript nucleotide coordinates (start and end both zero or both minus one or both length of transcript plus one), indicating some of that domain is missing from this translation.
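The peptide-to-genome mapping described above can be illustrated with a small self-contained sketch. This is plain Python, not the Ensembl Perl API and not how Bio::EnsEMBL::TranscriptMapper is implemented: the function `pep_to_genomic`, its arguments, and the exon coordinates are all hypothetical. It reports the part of a domain that runs past the end of the coding sequence as a count of unmapped bases rather than as a gap object, and it assumes the domain starts inside the coding sequence.

```python
def pep_to_genomic(exons, strand, cds_start, cds_end, pep_start, pep_end):
    """Map peptide residues pep_start..pep_end to genomic segments.

    exons     : list of (start, end) genomic pairs, 1-based inclusive,
                listed in transcript order (5' to 3')
    strand    : 1 or -1
    cds_start : first coding base in transcript (cDNA) coordinates
    cds_end   : last coding base in transcript (cDNA) coordinates
    Returns (segments, gap_bases).
    """
    # Residue n occupies three cDNA bases starting at cds_start + 3*(n-1).
    cdna_start = cds_start + 3 * (pep_start - 1)
    cdna_end = cds_start + 3 * pep_end - 1

    gap_bases = 0
    if cdna_end > cds_end:
        # Part of the feature lies beyond the end of the translation.
        gap_bases = cdna_end - cds_end
        cdna_end = cds_end

    segments = []
    offset = 0  # transcript bases consumed by the exons seen so far
    for ex_start, ex_end in exons:
        length = ex_end - ex_start + 1
        lo = max(cdna_start, offset + 1)
        hi = min(cdna_end, offset + length)
        if lo <= hi:  # the feature overlaps this exon
            if strand == 1:
                segments.append((ex_start + lo - offset - 1,
                                 ex_start + hi - offset - 1))
            else:
                # On the reverse strand cDNA position 1 is the exon's
                # highest genomic coordinate.
                segments.append((ex_end - (hi - offset - 1),
                                 ex_end - (lo - offset - 1)))
        offset += length
    return segments, gap_bases


# Made-up reverse-strand transcript: three exons, 50-codon CDS.
EXONS = [(901, 1000), (701, 760), (501, 600)]

# A domain spanning all three exons gives three genomic pairs.
print(pep_to_genomic(EXONS, -1, 91, 240, 1, 30))
# -> ([(901, 910), (701, 760), (581, 600)], 0)

# A domain running past the last codon gives one pair plus unmapped bases.
print(pep_to_genomic(EXONS, -1, 91, 240, 45, 55))
# -> ([(521, 538)], 15)
```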
However, I have also found an oddity in BioMart. I downloaded a table of transcript coordinates from BIOMART. In most cases,
these coordinates are exactly the same as the coordinates displayed by the genome browser. But I have seen a few cases where
they are different.
For example, in the ENSEMBL genome browser right now the coordinates for ENST00000392973 are given as
Chromosome 16: 90,071,281-90,086,526 reverse strand. But when I downloaded all transcripts for
ENSG00000003249 using http://useast.ensembl.org/biomart/martview/ the relevant row of output is:
Ensembl Gene ID  Ensembl Transcript ID  Ensembl Protein ID  Gene Start (bp)  Gene End (bp)  Strand  Transcript Start (bp)  Transcript End (bp)  Chromosome Name
ENSG00000003249  ENST00000392973        ENSP00000376699     90071273         90086536       -1      90071281               90076619             16
Here Transcript End (bp) is 90076619, versus the 90,086,526 displayed in the browser views.
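A comparison like this can be scripted when many transcripts need checking. The following is a minimal Python sketch: the helper names and the `COLUMNS` list are invented for illustration, the row is the BioMart line quoted above, and the location string is in the chromosome:start-end:strand form shown on the Ensembl search-results page for this transcript.

```python
# Column order of the BioMart row quoted above (names are shorthand,
# chosen here for illustration).
COLUMNS = ["gene_id", "transcript_id", "protein_id", "gene_start",
           "gene_end", "strand", "transcript_start", "transcript_end",
           "chrom"]


def parse_biomart_row(row):
    """Extract transcript coordinates from one whitespace-separated row."""
    rec = dict(zip(COLUMNS, row.split()))
    return {"chrom": rec["chrom"],
            "start": int(rec["transcript_start"]),
            "end": int(rec["transcript_end"]),
            "strand": int(rec["strand"])}


def parse_location(location):
    """Parse a 'chrom:start-end:strand' location string."""
    chrom, span, strand = location.split(":")
    start, end = (int(x) for x in span.split("-"))
    return {"chrom": chrom, "start": start, "end": end,
            "strand": int(strand)}


def differences(a, b):
    """Return {field: (value_in_a, value_in_b)} for fields that differ."""
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}


mart = parse_biomart_row(
    "ENSG00000003249 ENST00000392973 ENSP00000376699 "
    "90071273 90086536 -1 90071281 90076619 16")
browser = parse_location("16:90071281-90086526:-1")

diff = differences(mart, browser)
print(diff)                              # {'end': (90076619, 90086526)}
print(diff["end"][1] - diff["end"][0])   # 9907
```

Only the end coordinate differs; chromosome, start and strand agree between the two sources.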
This message (including any attachments) may contain confidential, proprietary, privileged and/or private information. The information is intended to be for the use of the individual or entity designated above. If you are not the intended recipient of this message, please notify the sender immediately, and delete the message and any attachments. Any disclosure, reproduction, distribution or other use of this message or any attachments by an individual or entity other than the intended recipient is prohibited.
_______________________________________________
Dev mailing list Dev at ensembl.org
List admin (including subscribe/unsubscribe): http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/