[ensembl-dev] Different UTRs for the same transcript?
Rhoda Kinsella
rhoda at ebi.ac.uk
Mon Mar 14 16:10:02 GMT 2011
Hi Holger
When you query the UTR information from mart, what you are actually
getting is one result row per exon. Therefore if you add the exon_id
to your attributes you will see that the results make more sense as
you get a unique row per exon in each transcript. I hope that makes
sense, but please get back to me if you have further questions.
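
To make the per-exon rows visible from R, the original query can be repeated with the exon ID added. This is only a sketch: the attribute name 'ensembl_exon_id' is an assumption about what the mart calls the exon ID (listAttributes(mart) shows the exact names), and utr_attributes and fetch_utr_rows are hypothetical names used for illustration.

```r
# Attributes from the original query, plus the exon ID so that each
# result row can be tied to a specific exon.
# ASSUMPTION: 'ensembl_exon_id' is the mart's name for the exon ID;
# confirm with listAttributes(mart) before relying on it.
utr_attributes <- c("ensembl_gene_id",
                    "ensembl_transcript_id",
                    "ensembl_exon_id",
                    "5_utr_start", "5_utr_end",
                    "3_utr_start", "3_utr_end")

# Hypothetical helper: fetch the UTR rows (one per exon) for one gene.
# Requires the biomaRt package and network access when called.
fetch_utr_rows <- function(gene_id) {
  mart <- biomaRt::useDataset("mmusculus_gene_ensembl",
                              mart = biomaRt::useMart("ensembl"))
  biomaRt::getBM(attributes = utr_attributes,
                 filters    = "ensembl_gene_id",
                 values     = gene_id,
                 mart       = mart)
}

# Usage (not run here, needs a live connection):
# utr_rows <- fetch_utr_rows("ENSMUSG00000018733")
# utr_rows[order(utr_rows$ensembl_transcript_id), ]
```

Sorting by transcript ID groups the exon rows of each transcript together, which should make it easier to see which exon each UTR boundary belongs to.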
Regards
Rhoda
On 14 Mar 2011, at 14:42, Holger Brandl wrote:
> Hello,
>
> I'm using BioMart to access Ensembl. I'm interested in UTR regions,
> so I'm using the following query:
> library(biomaRt)
> mart = useDataset("mmusculus_gene_ensembl", mart = useMart("ensembl"))
> utrInfos <- getBM(attributes=c('ensembl_gene_id',
>                                'ensembl_transcript_id',
>                                '5_utr_start','5_utr_end',
>                                '3_utr_end','3_utr_start',
>                                'start_position','end_position',
>                                'transcript_start','transcript_end'),
>                   filters=c('ensembl_gene_id'),
>                   values=c('ENSMUSG00000018733'),mart=mart);
>
> However, the result of this query seems to have a weird structure, as
> it contains 4 rows for each transcript, two of which contain
> different 5' UTR boundaries.
> In contrast, what I would expect is a single row for each transcript
> with 5' AND 3' UTR information (if available).
> I've tried the same query for other genes, but the results
> always have a similar structure.
>
> The same happens if I run my query through the web interface. Here's
> the URL for the above-mentioned example:
> http://www.ensembl.org/biomart/martview/f20886e0142055da2a2b0a9de30d5ca8/f20886e0142055da2a2b0a9de30d5ca8/f20886e0142055da2a2b0a9de30d5ca8?VIRTUALSCHEMANAME=default&ATTRIBUTES=mmusculus_gene_ensembl.default.structure.ensembl_gene_id|mmusculus_gene_ensembl.default.structure.ensembl_transcript_id|mmusculus_gene_ensembl.default.structure.5_utr_start|mmusculus_gene_ensembl.default.structure.5_utr_end|mmusculus_gene_ensembl.default.structure.3_utr_end|mmusculus_gene_ensembl.default.structure.3_utr_start|mmusculus_gene_ensembl.default.structure.start_position|mmusculus_gene_ensembl.default.structure.end_position|mmusculus_gene_ensembl.default.structure.transcript_start|mmusculus_gene_ensembl.default.structure.transcript_end&FILTERS=mmusculus_gene_ensembl.default.filters.ensembl_gene_id."ENSMUSG00000018733"&VISIBLEPANEL=resultspanel
> Do you have any ideas what the problem with my query could be?
>
> Best,
> Holger Brandl
> --
> Dr. Holger Brandl
> Bioinformatics Service
> Max Planck Institute of Molecular Cell Biology and Genetics
> Pfotenhauerstrasse 108
> 01307 Dresden, Germany
>
> Tel.: +49/351/210-2738
> Fax: +49 351 210 2000
> www: http://www.mpi-cbg.de
> _______________________________________________
> Dev mailing list
> Dev at ensembl.org
> http://lists.ensembl.org/mailman/listinfo/dev
Rhoda Kinsella Ph.D.
Ensembl Bioinformatician,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.