[ensembl-dev] SQL query to retrieve gene sequence...
rob.sargent at utah.edu
Tue Dec 16 16:40:36 GMT 2014
A function/stored-procedure using a recursive CTE might be the way for
Steve to go.
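
A minimal sketch of that idea, not a tested recipe for a live Ensembl
database. It uses Python's sqlite3 module and an invented toy assembly
(chromosome 1 -> supercontig 20 -> contigs 11 and 12, plus contig 10 mapped
directly), so every id and sequence below is an assumption. The column
names follow the core schema's assembly and dna tables. The sketch assumes
no gaps, a single mapping path per region, and a server that supports
WITH RECURSIVE. Real data may carry several mapping paths for the same
region and would need filtering by coordinate system. Note that
substr/SUBSTRING takes (start, length), not (start, end).

```python
import sqlite3

# Toy stand-ins for the core schema's `assembly` and `dna` tables.
SCHEMA = """
CREATE TABLE assembly (asm_seq_region_id INTEGER, cmp_seq_region_id INTEGER,
    asm_start INTEGER, asm_end INTEGER,
    cmp_start INTEGER, cmp_end INTEGER, ori INTEGER);
CREATE TABLE dna (seq_region_id INTEGER PRIMARY KEY, sequence TEXT);
"""

# Each `seg` row says: region[r_start..r_end] covers the top-level interval
# starting at top_start, on the same strand (ori = 1) or reversed (ori = -1).
WALK = """
WITH RECURSIVE seg(region, r_start, r_end, top_start, ori) AS (
    SELECT :region, :start, :end, :start, 1
    UNION ALL
    SELECT a.cmp_seq_region_id,
           CASE a.ori WHEN 1
                THEN a.cmp_start + (max(s.r_start, a.asm_start) - a.asm_start)
                ELSE a.cmp_end - (min(s.r_end, a.asm_end) - a.asm_start) END,
           CASE a.ori WHEN 1
                THEN a.cmp_start + (min(s.r_end, a.asm_end) - a.asm_start)
                ELSE a.cmp_end - (max(s.r_start, a.asm_start) - a.asm_start) END,
           CASE s.ori WHEN 1
                THEN s.top_start + (max(s.r_start, a.asm_start) - s.r_start)
                ELSE s.top_start + (s.r_end - min(s.r_end, a.asm_end)) END,
           s.ori * a.ori
    FROM seg s
    JOIN assembly a ON a.asm_seq_region_id = s.region
                   AND a.asm_start <= s.r_end AND a.asm_end >= s.r_start
)
-- Only regions that actually carry sequence survive this join.
SELECT substr(d.sequence, seg.r_start, seg.r_end - seg.r_start + 1), seg.ori
FROM seg JOIN dna d ON d.seq_region_id = seg.region
ORDER BY seg.top_start
"""

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def fetch_sequence(conn, region, start, end):
    """Stitch together top-level region[start..end] from contig-level dna."""
    rows = conn.execute(WALK, {"region": region, "start": start, "end": end})
    return "".join(revcomp(s) if ori == -1 else s for s, ori in rows)


def toy_db():
    """Build the invented three-level assembly in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO assembly VALUES (?,?,?,?,?,?,?)", [
        (1, 10, 1, 4, 1, 4, 1),     # chromosome 1..4   -> contig 10, forward
        (1, 20, 5, 12, 1, 8, -1),   # chromosome 5..12  -> supercontig 20, reversed
        (20, 11, 1, 4, 1, 4, 1),    # supercontig 1..4  -> contig 11, forward
        (20, 12, 5, 8, 1, 4, -1),   # supercontig 5..8  -> contig 12, reversed
    ])
    conn.executemany("INSERT INTO dna VALUES (?,?)",
                     [(10, "ACGT"), (11, "AACC"), (12, "GGTA")])
    return conn


if __name__ == "__main__":
    print(fetch_sequence(toy_db(), 1, 3, 10))  # GTGGTAGG
```

The orientation bookkeeping (multiplying ori at each level and
reverse-complementing at the end) is the part that is easy to get wrong.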
On 12/16/2014 09:28 AM, Andrew Yates wrote:
> Hey Steve,
> The problem with using the database is that sequence is not stored
> against the top-level sequence regions that annotation is held
> against. Instead, sequence is held against the contig sequence regions,
> which requires descending through the assembly table an unspecified
> number of times (once for each mapping, e.g. chromosome -> supercontig
> -> contig).
> I would seriously *not* recommend doing this. Not only do you have to
> deal with descending through the assembly, you also have to think about
> concatenating the sequence and paying attention to the orientation of
> each assembly mapping. Instead you could use the Perl API (probably not
> an option considering you're a Python guy), BioMart (you can access
> unspliced gene sequence quite easily), the REST API, or download the
> full genome sequence from FTP and take subslices. The faidx indexing
> tool from htslib/samtools is pretty good at extracting arbitrary
> sequence from very large FASTA files.
> Andrew Yates - Ensembl Support Coordinator
> European Molecular Biology Laboratory
> European Bioinformatics Institute
> Wellcome Trust Genome Campus
> Hinxton, Cambridge
> CB10 1SD, United Kingdom
> Tel: +44-(0)1223-492538
> Fax: +44-(0)1223-494468
> Skype: andrewyatz
>> On 16 Dec 2014, at 16:15, Steve Moss <gawbul at gmail.com> wrote:
>> Dear EnsEMBL Dev,
>> I'm trying to write a raw SQL query to retrieve the sequence for the
>> human BRCA2 gene to compare different methods of accessing EnsEMBL
>> data. I'm currently doing the following, but getting an empty set.
>> SELECT SUBSTRING(sequence, g.seq_region_start, g.seq_region_end)
>> FROM dna d
>> JOIN gene g
>> ON d.seq_region_id = g.seq_region_id
>> WHERE g.stable_id="ENSG00000139618"
>> What am I missing? I think I'm falling short on working out the
>> coord. system mapping stuff. Any pointers to help in fixing please?
>> Steve Moss
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/