[ensembl-dev] FW: Download human body map 2.0 transcript coordinates
Thibaut Hourlier
th3 at sanger.ac.uk
Mon Jan 30 14:43:39 GMT 2012
Dear Ying,
As I said in the post you link to, we recommend using the
Perl API: http://www.ensembl.org/info/docs/Doxygen/core-api/index.html
The only way to map the reads to a gene is through the
seq_region(_id/_start/_end/_strand) information.
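To illustrate what mapping by coordinates amounts to, here is a minimal sketch in plain Perl with no Ensembl API involved. The gene interval is hypothetical and only there for illustration; the three transcript models are taken from the rows in the question quoted below. The sketch compares seq_region names rather than ids, on the assumption that ids are not guaranteed to match between two databases.

```perl
use strict;
use warnings;

# Hypothetical gene coordinates, for illustration only; in practice
# they would come from the core database.
my %gene = (seq_region => '8', start => 74_300_000, end => 74_700_000, strand => -1);

# Transcript models copied from the query output in the question
# (the NAME column is the seq_region name).
my @transcripts = (
    { stable_id => 'ROUGHT00000241811', seq_region => '8', start => 74_332_013, end => 74_659_111, strand => -1 },
    { stable_id => 'ROUGHT00000241812', seq_region => '8', start => 74_702_071, end => 74_742_711, strand => -1 },
    { stable_id => 'ROUGHT00000241816', seq_region => '8', start => 74_887_723, end => 74_895_618, strand => 1 },
);

# Two features overlap when they lie on the same seq_region and strand
# and neither one ends before the other starts.
sub overlaps {
    my ($x, $y) = @_;
    return $x->{seq_region} eq $y->{seq_region}
        && $x->{strand} == $y->{strand}
        && $x->{start} <= $y->{end}
        && $y->{start} <= $x->{end};
}

my @hits = map { $_->{stable_id} } grep { overlaps(\%gene, $_) } @transcripts;
print "$_\n" for @hits;
```

With the hypothetical interval above, only ROUGHT00000241811 is reported: the second model starts after the gene ends and the third is on the opposite strand.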
If you know the stable id (ENSG000....) of the gene you are interested
in, it's quite simple with the API:
use strict;
use warnings;
use Bio::EnsEMBL::DBSQL::DBAdaptor;

# Core database: look up the gene by its stable id
my $db = Bio::EnsEMBL::DBSQL::DBAdaptor->new(
    -host   => 'ensembldb.ensembl.org',
    -port   => 5306,
    -user   => 'anonymous',
    -dbname => 'homo_sapiens_core_65_37');
my $ga   = $db->get_GeneAdaptor();
my $gene = $ga->fetch_by_stable_id("ENSG000XXXX");
# feature_Slice covers only the gene; $gene->slice is the whole seq_region
my $slice = $gene->feature_Slice;

# RNASeq database: fetch the same region by name
my $rnaseqdb = Bio::EnsEMBL::DBSQL::DBAdaptor->new(
    -host   => 'ensembldb.ensembl.org',
    -port   => 5306,
    -user   => 'anonymous',
    -dbname => 'homo_sapiens_rnaseq_65_37');
my $rnaseqsa    = $rnaseqdb->get_SliceAdaptor();
my $rnaseqslice = $rnaseqsa->fetch_by_name($slice->name);
# First argument is load_exons, the logic name comes second
my @transcripts = @{$rnaseqslice->get_all_Transcripts(1, 'skeletal_rnaseq')};
foreach my $transcript (@transcripts) {
    foreach my $sf (@{$transcript->get_all_supporting_features()}) {
        # Print the number of reads that spanned the intron
        print STDOUT $sf->hseqname, ' :', $sf->score, "\n";
    }
}
The number of reads that span the introns is the score you can find in
the dna_align_feature table of the rnaseq database.
Regards,
Thibaut
On 27/01/12 19:09, Li, Ying L wrote:
Dear Thibaut,
I am trying to get the Ensembl Human BodyMap data by gene or transcript, and I followed your instructions in this post: http://lists.ensembl.org/pipermail/dev/2011-August/001593.html
I was able to set up an Oracle schema and run the following query:
SELECT t.*, sr.name
  FROM rnaseq37_analysis a, rnaseq37_transcript t
  LEFT JOIN rnaseq37_seq_region sr
    ON sr.seq_region_id = t.seq_region_id
 WHERE t.analysis_id = a.analysis_id
   AND a.logic_name = 'skeletal_rnaseq'
;
TRANSCRIPT_ID  GENE_ID  ANALYSIS_ID  SEQ_REGION_ID  SEQ_REGION_START  SEQ_REGION_END  SEQ_REGION_STRAND  DISPLAY_XREF_ID  BIOTYPE         STATUS     DESCRIPTION  IS_CURRENT  CANONICAL_TRANSLATION_ID  STABLE_ID          VERSION  CREATED_DATE         MODIFIED_DATE        NAME
840249         585754   8244         27517          184298181         184300196       1                  \N               protein_coding  PREDICTED  \N           1           593641                    ROUGHT00000241809  1        2011-01-12 10:33:07  2011-01-12 10:33:07  3
840251         585756   8244         27523          74332013          74659111        -1                 \N               protein_coding  PREDICTED  \N           1           593643                    ROUGHT00000241811  1        2011-01-12 10:33:07  2011-01-12 10:33:07  8
840252         585758   8244         27523          74702071          74742711        -1                 \N               protein_coding  PREDICTED  \N           1           593644                    ROUGHT00000241812  1        2011-01-12 10:33:07  2011-01-12 10:33:07  8
840254         585759   8244         27523          74857620          74885297        -1                 \N               protein_coding  PREDICTED  \N           1           593646                    ROUGHT00000241814  1        2011-01-12 10:33:07  2011-01-12 10:33:07  8
840256         585761   8244         27523          74887723          74895618        1                  \N               protein_coding  PREDICTED  \N           1           593648                    ROUGHT00000241816  1        2011-01-12 10:33:07  2011-01-12 10:33:07  8
840257         585762   8244         27523          74903474          74917165        1                  \N               protein_coding  PREDICTED  \N           1           593649                    ROUGHT00000241817  1        2011-01-12 10:33:07  2011-01-12 10:33:07  8
840259         585764   8244         27523          74921628          74941367        1                  \N               protein_coding  PREDICTED  \N           1           593651                    ROUGHT00000241819  1        2011-01-12 10:33:07  2011-01-12 10:33:07  8
840261         585766   8244         27523          75015772          75019126        1                  \N               protein_coding  PREDICTED  \N           1           593653                    ROUGHT00000241820  1        2011-01-12 10:33:07  2011-01-12 10:33:07  8
840264         585768   8244         27519          15260701          15375468        -1                 \N               protein_coding  PREDICTED  \N           1           593656                    ROUGHT00000241821  1        2011-01-12 10:33:07  2011-01-12 10:33:07  12
840266         585770   8244         27519          15742384          15751506        1                  \N               protein_coding  PREDICTED  \N           1           593658                    ROUGHT00000241822  1        2011-01-12 10:33:07  2011-01-12 10:33:07  12
Now I need to map the gene_id or transcript_id to some kind of standard id (e.g. ENSG00000*****) so that I can tell which gene each gene_id refers to. Do you know the best way to do this? In addition, do you know if there is a "number of reads" value for the RNASeq data?
Thanks a lot for your help,
Best regards,
Ying
> Hi there,
>
> I am on your mailing list, so resubmitting this question -- see attached file.
>
> Thanks a lot for your help.
>
> Best,
> Ying
> -----Original Message-----
> From: dev-bounces at ensembl.org [mailto:dev-bounces at ensembl.org] On Behalf Of dev-owner at ensembl.org
> Sent: Wednesday, January 25, 2012 4:52 PM
> To: Li, Ying L {PXTP~Nutley}
> Subject: Re: [ensembl-dev] Download human body map 2.0 transcript coordinates
>
> The Ensembl dev mailing list only accepts postings from people who are subscribed. You can subscribe or unsubscribe at http://lists.ensembl.org/mailman/listinfo/dev