[ensembl-dev] Conceptual Confusion about Strands for Data mining

Andy Yates ayates at ebi.ac.uk
Fri Feb 10 16:17:36 GMT 2012


Hi Sumir,

So firstly, all genomic features are plotted to the forward strand. The issue comes in when we start to apply the concepts of 5' and 3', which are dependent on the strand.

On 9 Feb 2012, at 09:21, Sumir Panji wrote:

>  Hello,
>  I am using Ensembl API version 65 to obtain a 1000bp slice for each gene for promoter content analysis. Basically my script uses the start position of each gene and expands by a 1000bp from the 5' end of a gene using co-ordinates obtained from the "fetch_by_gene_stable_id()" method. On the plus strand this would be a 1000bp upstream from the 5' end on a gene (the first UTR / exon). My conceptual difficulties are :
> 
> 1) Does this orientation hold / differ for genes on the negative strand?  

The seq_region_start of a transcript will be the start of the 5' UTR on the forward strand. On the reverse strand the seq_region_end will be the start of the 5' UTR.

> 
> 2) Do I need to reverse this when obtaining a slice from a gene located on the negative strand i.e instead of obtaining a 1000bp using the start co-ordinates I should use the end co-ordinates obtained by the "fetch_by_gene_stable_id()" method? 
> 

Yes. On the forward strand the 1Kbp upstream of the gene runs from seq_region_start - 1000 to seq_region_start - 1. On the reverse strand you work from the other end: the upstream 1Kbp runs from seq_region_end + 1 to seq_region_end + 1000.
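
As a rough illustration of that arithmetic, taking "upstream" to mean the 1Kbp immediately 5' of the gene, here is a minimal sketch in plain Python. It does not use the Ensembl API; the function name upstream_window and its arguments are made up for this example, and the strand is assumed to be encoded as 1 or -1. Coordinates are 1-based, inclusive, and always given on the forward strand.

```python
def upstream_window(seq_region_start, seq_region_end, strand, length=1000):
    """Return (start, end) of the `length` bp immediately 5' of a feature.

    Hypothetical helper, not part of the Ensembl API. Coordinates are
    1-based, inclusive, and expressed on the forward strand.
    """
    if strand == 1:
        # Forward strand: the 5' end is seq_region_start, so upstream
        # means smaller coordinates.
        start = seq_region_start - length
        end = seq_region_start - 1
    elif strand == -1:
        # Reverse strand: the 5' end is seq_region_end, so upstream
        # means larger coordinates.
        start = seq_region_end + 1
        end = seq_region_end + length
    else:
        raise ValueError("strand must be 1 or -1")
    # Clamp at the start of the sequence region. The far end of the
    # region is not checked here because its length is not known.
    return max(start, 1), end


print(upstream_window(5000, 8000, 1))
print(upstream_window(5000, 8000, -1))
```

For a gene at 5000-8000 this gives 4000-4999 on the forward strand and 8001-9000 on the reverse strand.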

> The reason I am confused is that the API documentation states :
> "Note that for historical reasons the fetch_by_gene_stable_id() method always returns a slice on the forward strand even if the gene is on the reverse strand."
> 
> Does this mean that all 5' slices obtained for genes using co-ordinates from this method would ideally capture the transcription start site regardless of strand orientation?

No. The feature maintains its co-ordinates. You still have to do the strand manipulation to get into the 5' and 3' space.
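
One part of that strand manipulation, if you end up holding forward-strand sequence for a reverse-strand gene, is reverse-complementing it so that it reads 5' to 3' relative to the gene. A minimal sketch in plain Python follows. The helper name to_gene_orientation is hypothetical, the strand is assumed to be 1 or -1, and only A, C, G, T and N are handled.

```python
# Translation table for complementing bases; any other characters
# (e.g. IUPAC ambiguity codes other than N) are left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def to_gene_orientation(forward_seq, strand):
    """Return forward-strand sequence re-expressed 5'->3' for the gene.

    Hypothetical helper, not part of the Ensembl API.
    """
    if strand == 1:
        # Already reads 5'->3' relative to a forward-strand gene.
        return forward_seq
    if strand == -1:
        # Complement each base, then reverse the string.
        return forward_seq.translate(_COMPLEMENT)[::-1]
    raise ValueError("strand must be 1 or -1")


print(to_gene_orientation("AACGT", -1))
```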

Best regards,

Andy

Andrew Yates                   Ensembl Core Software Project Leader
EMBL-EBI                       Tel: +44-(0)1223-492538
Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
Cambridge CB10 1SD, UK         http://www.ensembl.org/




