[ensembl-dev] (no subject)
Andy Yates
ayates at ebi.ac.uk
Mon Apr 2 20:09:19 BST 2012
Hi Matt,
I've just run a query for this data and got the following row back:
ENSG00000003249 ENST00000392973 90071273 -1 90071281 90076619
This seems to agree with the website, which gives ENST00000392973's coordinates as Chromosome 16: 90,071,281-90,076,619 reverse strand. There should be an option in BioMart to export your query as a URL; could you send this so we can see the query you are performing?
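
For reference, a BioMart query can also be written out as an XML document and passed to the martservice endpoint, which makes it easy to share exactly what was asked. The following is only a rough sketch of how such a URL might be assembled; the endpoint host, the dataset name and the attribute names are assumptions and should be checked against the mart being queried. The helper names (build_query_xml, build_query_url) are made up for illustration.

```python
from urllib.parse import quote
from xml.sax.saxutils import quoteattr

# Assumed endpoint; substitute the mirror you actually use.
MARTSERVICE = "http://www.ensembl.org/biomart/martservice"


def build_query_xml(gene_id, attributes, dataset="hsapiens_gene_ensembl"):
    """Build a BioMart XML query filtered on one gene stable ID.

    The dataset, filter and attribute names are assumptions, not values
    taken from this thread.
    """
    attr_xml = "".join(f"<Attribute name={quoteattr(a)}/>" for a in attributes)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="TSV" header="1" uniqueRows="1">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f'<Filter name="ensembl_gene_id" value={quoteattr(gene_id)}/>'
        f"{attr_xml}"
        "</Dataset></Query>"
    )


def build_query_url(gene_id, attributes):
    """Return a shareable URL with the XML query percent-encoded."""
    return MARTSERVICE + "?query=" + quote(build_query_xml(gene_id, attributes), safe="")


if __name__ == "__main__":
    # Assumed attribute names for the columns discussed in this thread.
    print(build_query_url(
        "ENSG00000003249",
        ["ensembl_gene_id", "ensembl_transcript_id", "strand",
         "transcript_start", "transcript_end"],
    ))
```

Sending the resulting URL (or just the XML) along with the unexpected row lets someone else rerun the identical query.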
Best regards,
Andy
On 2 Apr 2012, at 18:39, "Healy, Matthew" <Matthew.Healy at bms.com> wrote:
> I am new to BioMart and the Ensembl Perl API, so probably I am just confused. I would be grateful for some enlightenment from those with more experience.
>
> I am trying to map protein features into chromosomal nucleotide coordinates.
>
> First I use the fetch_by_stable_id() method of the transcript adaptor to get a transcript object given its ENSTxxx identifier.
> Then I use the get_all_ProteinFeatures() method of that transcript object to get all its protein features.
> Then I use the pep2genomic method of Bio::EnsEMBL::TranscriptMapper to map these coordinates into nucleotide space.
>
> Usually this works as I would expect it to work: if a protein domain feature spans multiple exons, then I get back multiple pairs of genomic coordinates.
>
> When the domain overlaps the start or the end of the translation, I also get a gap object in transcript nucleotide coordinates (start and end both zero or both minus one or both length of transcript plus one), indicating some of that domain is missing from this translation.
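
To make the behaviour described above concrete, here is a toy model of peptide-to-genomic mapping. It is not the Bio::EnsEMBL::TranscriptMapper implementation and does not call the Ensembl API; the function name, the tuple format and the way gaps are reported are all hypothetical. It assumes exons are listed in transcript (5' to 3') order with 1-based inclusive genomic coordinates, and that the coding region is given in cDNA coordinates.

```python
def pep_to_genomic(exons, strand, cdna_coding_start, cdna_coding_end,
                   pep_start, pep_end):
    """Map a peptide range onto genomic segments (toy model).

    Returns ("coord", genomic_start, genomic_end) tuples, one per exon
    touched, plus a ("gap", cdna_start, cdna_end) tuple if the range
    runs past the end of the coding region.
    """
    # Each residue covers three cDNA bases, counted from the coding start.
    cdna_start = cdna_coding_start + 3 * (pep_start - 1)
    cdna_end = cdna_coding_start + 3 * pep_end - 1

    gap = None
    if cdna_end > cdna_coding_end:
        # The part beyond the translation has no coding sequence behind it.
        gap = ("gap", max(cdna_start, cdna_coding_end + 1), cdna_end)
        cdna_end = cdna_coding_end

    pieces = []
    offset = 0  # cDNA bases consumed by earlier exons
    for g_start, g_end in exons:
        length = g_end - g_start + 1
        exon_cdna_start = offset + 1
        exon_cdna_end = offset + length
        lo = max(cdna_start, exon_cdna_start)
        hi = min(cdna_end, exon_cdna_end)
        if lo <= hi:
            if strand == 1:
                pieces.append(("coord",
                               g_start + (lo - exon_cdna_start),
                               g_start + (hi - exon_cdna_start)))
            else:
                # Reverse strand: cDNA position 1 of the exon sits at g_end.
                pieces.append(("coord",
                               g_end - (hi - exon_cdna_start),
                               g_end - (lo - exon_cdna_start)))
        offset += length

    if gap is not None:
        pieces.append(gap)
    return pieces
```

With two 30-base exons on the reverse strand, a feature spanning the exon boundary comes back as two genomic pairs, and a feature that runs off the end of the translation comes back as one genomic pair plus a gap, which is the pattern described above.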
>
> However, I have also found an oddity in BioMart. I downloaded a table of transcript coordinates from BioMart. In most cases, these coordinates are exactly the same as the coordinates displayed by the genome browser. But I have seen a few cases where they are different.
>
> For example, in the Ensembl genome browser right now the coordinates for ENST00000392973 are given as Chromosome 16: 90,071,281-90,086,526 reverse strand. But when I downloaded all transcripts for ENSG00000003249 using http://useast.ensembl.org/biomart/martview/ the relevant row of output is:
>
> Ensembl Gene ID:        ENSG00000003249
> Ensembl Transcript ID:  ENST00000392973
> Ensembl Protein ID:     ENSP00000376699
> Gene Start (bp):        90071273
> Gene End (bp):          90086536
> Strand:                 -1
> Transcript Start (bp):  90071281
> Transcript End (bp):    90076619
> Chromosome Name:        16
>
> Here Transcript End (bp) is 90076619, versus the 90,086,526 displayed in the browser views.