[ensembl-dev] exon table does not map to dna table

mag mr6 at ebi.ac.uk
Thu Sep 25 09:07:21 BST 2014


Hi Joshua,

Exons and other features are generally stored on toplevel sequences, 
which are usually chromosomes.
DNA sequence, however, is stored at the contig level, so the 
seq_region_id values in the exon table will not match any 
seq_region_id in the dna table.
The assembly table contains the information needed to map a contig 
sequence onto a chromosome.
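
As a rough illustration of that mapping, here is a minimal Python sketch 
that runs against an in-memory SQLite database with toy data. The table and 
column names follow the core schema (exon, assembly, dna), but the IDs and 
sequences are invented and the helper names (build_toy_db, exon_sequence) 
are hypothetical. It assumes a direct chromosome-to-contig mapping and 
ignores assembly gaps and intermediate coordinate systems, so treat it as a 
sketch of the idea rather than a replacement for the API.

```python
import sqlite3

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_toy_db():
    """Toy data (invented): chromosome 1 is built from contigs 101 and 102."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dna (seq_region_id INTEGER, sequence TEXT);
        CREATE TABLE assembly (asm_seq_region_id INTEGER, cmp_seq_region_id INTEGER,
                               asm_start INTEGER, asm_end INTEGER,
                               cmp_start INTEGER, cmp_end INTEGER, ori INTEGER);
        CREATE TABLE exon (exon_id INTEGER, seq_region_id INTEGER,
                           seq_region_start INTEGER, seq_region_end INTEGER,
                           seq_region_strand INTEGER);
        INSERT INTO dna VALUES (101, 'AAAACCCCGG'), (102, 'TTTTGGGGCC');
        -- contig 102 is placed on the chromosome in reverse orientation (ori = -1)
        INSERT INTO assembly VALUES (1, 101, 1, 10, 1, 10, 1),
                                    (1, 102, 11, 20, 1, 10, -1);
        INSERT INTO exon VALUES (1, 1, 8, 13, 1), (2, 1, 3, 6, -1);
    """)
    return conn


def exon_sequence(conn, exon_id):
    """Stitch an exon's sequence together from the contigs it overlaps."""
    region, start, end, strand = conn.execute(
        "SELECT seq_region_id, seq_region_start, seq_region_end, seq_region_strand "
        "FROM exon WHERE exon_id = ?", (exon_id,)).fetchone()
    # Go through assembly: chromosome (asm) coordinates -> contig (cmp) coordinates.
    pieces = conn.execute(
        "SELECT a.asm_start, a.asm_end, a.cmp_start, a.cmp_end, a.ori, d.sequence "
        "FROM assembly a JOIN dna d ON d.seq_region_id = a.cmp_seq_region_id "
        "WHERE a.asm_seq_region_id = ? AND a.asm_start <= ? AND a.asm_end >= ? "
        "ORDER BY a.asm_start", (region, end, start)).fetchall()
    out = []
    for asm_start, asm_end, cmp_start, cmp_end, ori, contig in pieces:
        lo, hi = max(start, asm_start), min(end, asm_end)  # overlap on the chromosome
        if ori == 1:
            c_lo = cmp_start + (lo - asm_start)
            c_hi = cmp_start + (hi - asm_start)
            out.append(contig[c_lo - 1:c_hi])  # coordinates are 1-based, inclusive
        else:
            c_lo = cmp_end - (hi - asm_start)
            c_hi = cmp_end - (lo - asm_start)
            out.append(reverse_complement(contig[c_lo - 1:c_hi]))
    forward = "".join(out)
    return forward if strand == 1 else reverse_complement(forward)


if __name__ == "__main__":
    conn = build_toy_db()
    direct = conn.execute(
        "SELECT COUNT(*) FROM exon JOIN dna ON exon.seq_region_id = dna.seq_region_id"
    ).fetchone()[0]
    print(direct)                  # 0: exons sit on the chromosome, dna on contigs
    print(exon_sequence(conn, 1))  # CGGGGC (spans both contigs)
    print(exon_sequence(conn, 2))  # GGTT (reverse-strand exon)
```

The direct exon-to-dna join returns 0 rows here for the same reason it does 
in the real database: the exon rows carry the chromosome's seq_region_id, 
while the dna rows carry the contigs' seq_region_ids.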

Retrieving DNA sequence directly from the MySQL schema is tricky at the 
best of times.
This is why we recommend using BioMart, the Perl API 
(http://www.ensembl.org/info/docs/api/index.html) or REST queries 
(http://rest.ensembl.org) for this type of use.
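
For an automated route, a REST query could look something like the Python 
sketch below. It assumes the sequence/id endpoint of the REST server accepts 
exon stable IDs and returns plain text when asked for text/plain. The 
function names (sequence_url, fetch_sequence) are hypothetical, and the 
stable ID in the usage line is a placeholder, not a real exon.

```python
import urllib.request
from urllib.parse import quote

REST_SERVER = "https://rest.ensembl.org"


def sequence_url(stable_id):
    """Build the URL for a sequence lookup by stable ID."""
    return f"{REST_SERVER}/sequence/id/{quote(stable_id, safe='')}"


def fetch_sequence(stable_id, timeout=30):
    """Fetch the sequence for a stable ID as plain text (needs network access)."""
    request = urllib.request.Request(
        sequence_url(stable_id),
        headers={"Content-Type": "text/plain"},  # ask the server for plain text
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Placeholder ID: substitute a real exon stable ID from your own query.
    print(fetch_sequence("ENSE00000000000"))
```

Looping fetch_sequence over a list of exon stable IDs gives the same 
exon-to-sequence mapping as the BioMart export, without needing the 
assembly mapping on your side.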


Regards,
Magali

On 24/09/2014 22:53, Joshua Bradley wrote:
> I am using the latest release of the human genome, 
> homo_sapiens_core_76_38.
>
> I am trying to map exons to their corresponding dna sequence. I'm 
> running a local instance of the sql tables to make things faster. My 
> approach was to take the seq_region_id in the exon table and use that 
> to find the sequence in the dna table. The schema 
> <http://www.ensembl.org/info/docs/api/core/core_schema.html> online 
> seems to suggest that I can use the seq_region_id to do this but I am 
> not getting any results back from my sql queries.
>
> I've verified what the schema says, that the seq_region table has a 
> 1-1 mapping to the dna table by comparing counts between the two with 
> the following queries.
>
> SELECT COUNT(*) FROM seq_region INNER JOIN dna ON 
> seq_region.seq_region_id=dna.seq_region_id;
>
> SELECT COUNT(*) FROM dna;
>
>
> This was to be expected. However, when I try to do a join between the 
> exon table and dna table, I get a count of 0:
>
> SELECT COUNT(*) FROM exon INNER JOIN dna ON 
> exon.seq_region_id=dna.seq_region_id;
>
> which tells me the seq_region_id is not a 1-1 mapping. I am able to 
> use biomart 
> <http://www.ensembl.org/biomart/martview/96b01e4e2043c838fc28f0969c5914a6?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_gene_ensembl.default.sequences.ensembl_gene_id%7Chsapiens_gene_ensembl.default.sequences.ensembl_transcript_id%7Chsapiens_gene_ensembl.default.sequences.gene_exon%7Chsapiens_gene_ensembl.default.sequences.ensembl_exon_id&FILTERS=hsapiens_gene_ensembl.default.filters.source.%22ensembl%22&VISIBLEPANEL=resultspanel> (this 
> is a sample query) to get the exon-to-sequence mapping but I would 
> like a more automated approach. Can someone explain how biomart is 
> able to do it? It's possible I am interpreting the schema wrong. How 
> does seq_region_id in the exon table get mapped to seq_region_id in 
> the dna table? I appreciate any help.
>
>
> Josh Bradley
> Graduate Student
> University of Maryland - College Park