[ensembl-dev] Download human body map 2.0 transcript coordinates
Thibaut Hourlier
th3 at sanger.ac.uk
Mon Aug 22 11:09:17 BST 2011
Hi Jyoti
The rnaseq database uses the core schema
(http://www.ensembl.org/info/docs/api/index.html). Here are some SQL
queries:
To get all the transcripts (the first column is the name of the
chromosome):
SELECT sr.name, t.*
FROM transcript t
LEFT JOIN seq_region sr ON sr.seq_region_id = t.seq_region_id;
To get all the transcripts built from the skeletal muscle sample on
chromosome 2 (seq_region.name is a string column, so the value is quoted):
SELECT t.*
FROM analysis a, transcript t
LEFT JOIN seq_region sr ON sr.seq_region_id = t.seq_region_id
WHERE t.analysis_id = a.analysis_id
AND sr.name = '2'
AND a.logic_name = 'skeletal_rnaseq';
To get the exons for the transcript ROUGHT00000000004:
SELECT sr.name, e.seq_region_start, e.seq_region_end, e.seq_region_strand
FROM seq_region sr, transcript_stable_id tsi, exon e
LEFT JOIN exon_transcript et ON et.exon_id = e.exon_id
WHERE et.transcript_id = tsi.transcript_id
AND sr.seq_region_id = e.seq_region_id
AND tsi.stable_id = 'ROUGHT00000000004';
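If you want the coordinates in a file, a minimal sketch of running this kind of query from a Perl script could look like the following. It assumes the DBI and DBD::mysql modules are installed and that every transcript has a row in transcript_stable_id; the tissue and chromosome values are only examples, and the joins are the same ones used in the queries above, written as inner joins.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use DBI;

# Connection details are the ones given in this thread.
my $dsn = 'DBI:mysql:database=homo_sapiens_rnaseq_63_37;'
        . 'host=ensembldb.ensembl.org;port=5306';
my $dbh = DBI->connect( $dsn, 'anonymous', '', { RaiseError => 1 } );

# One row per transcript: stable ID, chromosome, start, end, strand.
my $sth = $dbh->prepare(q{
    SELECT tsi.stable_id, sr.name, t.seq_region_start,
           t.seq_region_end, t.seq_region_strand
    FROM transcript t
    JOIN seq_region sr ON sr.seq_region_id = t.seq_region_id
    JOIN analysis a ON a.analysis_id = t.analysis_id
    JOIN transcript_stable_id tsi ON tsi.transcript_id = t.transcript_id
    WHERE a.logic_name = ? AND sr.name = ?
});

# Example values only: skeletal muscle transcripts on chromosome 2.
$sth->execute( 'skeletal_rnaseq', '2' );

# Print the rows as tab-separated text.
while ( my @row = $sth->fetchrow_array ) {
    print join( "\t", @row ), "\n";
}

$dbh->disconnect;
```

Redirecting the output to a file gives a simple tab-separated list of transcript coordinates.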
However, I recommend using the Perl API instead
(http://www.ensembl.org/info/docs/Doxygen/core-api/main.html):
use Bio::EnsEMBL::DBSQL::DBAdaptor;

my $db = Bio::EnsEMBL::DBSQL::DBAdaptor->new(
    -host   => 'ensembldb.ensembl.org',
    -port   => 5306,
    -user   => 'anonymous',
    -dbname => 'homo_sapiens_rnaseq_63_37',
);
my $slice_adaptor = $db->get_SliceAdaptor();
my $chr_slice = $slice_adaptor->fetch_by_region( 'chromosome', '2' );
foreach my $transcript (@{ $chr_slice->get_all_Transcripts() }) {
    ...
    foreach my $exon (@{ $transcript->get_all_Exons() }) {
        ...
    }
    ...
}
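or, to get only the transcripts from one tissue (the logic name is the
second argument, after the load_exons flag):
foreach my $transcript
    (@{ $chr_slice->get_all_Transcripts( 1, 'skeletal_rnaseq' ) }) {
    ...
}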
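As a rough sketch of how the loops above could be filled in to print genomic coordinates, the script below writes one tab-separated line per exon. It assumes the Ensembl core API for release 63 is installed and on PERL5LIB; the choice of chromosome 2, the skeletal_rnaseq logic name, and the output columns are only illustrative.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::EnsEMBL::DBSQL::DBAdaptor;

# Connection details are the ones given in this thread.
my $db = Bio::EnsEMBL::DBSQL::DBAdaptor->new(
    -host   => 'ensembldb.ensembl.org',
    -port   => 5306,
    -user   => 'anonymous',
    -dbname => 'homo_sapiens_rnaseq_63_37',
);

my $chr_slice = $db->get_SliceAdaptor()->fetch_by_region( 'chromosome', '2' );

# First argument preloads the exons, second restricts to one logic name.
foreach my $transcript
    (@{ $chr_slice->get_all_Transcripts( 1, 'skeletal_rnaseq' ) }) {
    foreach my $exon (@{ $transcript->get_all_Exons() }) {
        # seq_region_* methods return chromosome coordinates.
        print join( "\t",
            $transcript->stable_id,
            $exon->seq_region_name,
            $exon->seq_region_start,
            $exon->seq_region_end,
            $exon->seq_region_strand ), "\n";
    }
}
```

Looping over the logic names for the other tissues in the same way would give one coordinate list per tissue.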
Regards
Thibaut
On Fri, 2011-08-19 at 14:23 -0400, Jyoti Shah wrote:
> Hi Thibaut,
>
>
> Thanks for your response. The schema seems complicated and I could not
> find any proper documentation on this database instance. Can you point
> me to the SQL queries that can help me fetch transcripts for Human
> body map 2.0 from this MYSQL instance? I could see a file
> transcript.txt with start and end position but there is no chromosome
> number attached to it. Where can I download these built transcripts
> from?
>
>
> Thanks,
> Jyoti
>
> On Fri, Aug 19, 2011 at 12:12 PM, Thibaut Hourlier <th3 at sanger.ac.uk>
> wrote:
> Hi
> You can either use the FTP site to create a local instance of the
> database:
> ftp://ftp.ensembl.org/pub/current_mysql/homo_sapiens_rnaseq_63_37/
>
> or you can use the public MySQL Server:
>
> $db = new Bio::EnsEMBL::DBSQL::DBAdaptor(
>     -host => 'ensembldb.ensembl.org',
>     -port => 5306,
>     -user => 'anonymous',
>     -dbname => 'homo_sapiens_rnaseq_63_37');
> Then if you want only some tissues you can filter with the
> logic_name:
> +-----------------------+
> | logic_name            |
> +-----------------------+
> | adipose_rnaseq        |
> | adrenal_rnaseq        |
> | blood_rnaseq          |
> | brain_rnaseq          |
> | breast_rnaseq         |
> | colon_rnaseq          |
> | heart_rnaseq          |
> | kidney_rnaseq         |
> | liver_rnaseq          |
> | lung_rnaseq           |
> | lymph_rnaseq          |
> | ovary_rnaseq          |
> | prostate_rnaseq       |
> | skeletal_rnaseq       |
> | testes_rnaseq         |
> | thyroid_rnaseq        |
> +-----------------------+
>
> Regards
> Thibaut
>
>
> On Fri, 2011-08-19 at 09:18 -0400, Jyoti Shah wrote:
> > Hi,
> >
> >
> > What is the best way to download the transcripts that were built using
> > Illumina data (Human body map 2.0) explained here:
> >
> >
> >
> > http://www.ensembl.info/blog/2011/05/24/human-bodymap-2-0-data-from-illumina/
> >
> >
> > I need to download the genomic coordinates.
> >
> >
> > Thanks!
>