[ensembl-dev] Get genomic location for translatable cdna seq
Magali
mr6 at ebi.ac.uk
Fri Feb 8 09:59:03 GMT 2013
Hi Abhishek,
As Kieron mentioned, exon objects in Ensembl have a
coding_region_start() method.
If, for the exons, you replace the feature->start() and feature->end()
calls with feature->coding_region_start($transcript) and
feature->coding_region_end($transcript), passing the transcript the exon
belongs to, you will get only the coding part of each exon.
If the entire exon is non-coding, both methods return undef.
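
A minimal sketch of that change, assuming the exon and transcript objects
come from the Ensembl Perl API (a transcript adaptor's fetch_by_stable_id()
and the transcript's get_all_Exons()). The names coding_region2string() and
position_is_coding() are hypothetical helpers, not part of the API. The
returned coordinates are relative to the exon's slice, which should be the
whole chromosome for a transcript fetched by stable ID.

```perl
use strict;
use warnings;

# Variant of feature2string() that reports only the coding part of an exon.
# The transcript is passed to coding_region_start()/coding_region_end()
# because the coding part of an exon depends on the transcript it is used in.
sub coding_region2string {
    my ( $exon, $transcript ) = @_;

    my $start = $exon->coding_region_start($transcript);
    my $end   = $exon->coding_region_end($transcript);

    # Both values are undefined when the exon is entirely non-coding.
    return unless defined $start && defined $end;

    return sprintf( "%s: %s:%d-%d (%+d)",
        $exon->stable_id(), $exon->slice->seq_region_name(),
        $start, $end, $exon->strand() );
}

# Hypothetical helper: returns 1 if $pos lies inside the coding part of
# any exon of $transcript, 0 otherwise.
sub position_is_coding {
    my ( $pos, $transcript ) = @_;

    foreach my $exon ( @{ $transcript->get_all_Exons() } ) {
        my $start = $exon->coding_region_start($transcript);
        my $end   = $exon->coding_region_end($transcript);
        next unless defined $start && defined $end;    # non-coding exon
        return 1 if $pos >= $start && $pos <= $end;
    }
    return 0;
}
```

In the exon loop you would then skip any exon for which
coding_region2string($exon, $transcript) returns nothing. Note that
position_is_coding() only compares positions, so the caller has to make sure
the position is on the same chromosome as the transcript.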
Hope that helps,
Magali
On 07/02/13 17:30, Abhishek Niroula wrote:
> Thanks Kieron.
> I have pasted below the code I used to extract the exon
> information. The file ensembl_gene_transcript_id.txt contains Ensembl
> gene and transcript IDs, one "gene|transcript" pair per line. For each
> transcript, I want to obtain the corresponding genome co-ordinate for
> each amino acid position. I am not sure whether somebody has already
> done this.
>
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
> use Bio::EnsEMBL::Registry;
>
> # Format a feature as "stable_id: seq_region:start-end (strand)".
> sub feature2string
> {
>     my $feature = shift;
>
>     my $stable_id  = $feature->stable_id();
>     my $seq_region = $feature->slice->seq_region_name();
>     my $start      = $feature->start();
>     my $end        = $feature->end();
>     my $strand     = $feature->strand();
>
>     return sprintf( "%s: %s:%d-%d (%+d)",
>         $stable_id, $seq_region, $start, $end, $strand );
> }
>
> my $registry = "Bio::EnsEMBL::Registry";
> ## Load the databases into the registry
> $registry->load_registry_from_db(
>     -host => 'ensembldb.ensembl.org',
>     -user => 'anonymous'
> );
>
> my $transcript_adaptor =
>     $registry->get_adaptor( 'Human', 'Core', 'Transcript' );
>
> ## Each input line has the form "gene_id|transcript_id"
> open( my $in, '<', 'ensembl_gene_transcript_id.txt' ) or die "open: $!";
> my @lines = <$in>;
> close($in);
>
> foreach my $line (@lines) {
>     print $line;
>     chomp $line;
>     my ( $gene, $transcript_id ) = split( /\|/, $line );
>
>     ### Fetch the transcript and write its translatable sequence as FASTA
>     my $transcript = $transcript_adaptor->fetch_by_stable_id($transcript_id);
>     my $cdsseq     = $transcript->translateable_seq();
>     open( my $fasta, '>', $gene . '.fa' ) or die "open: $!";
>     print {$fasta} ">" . $gene . "\n" . $cdsseq . "\n";
>     close($fasta);
>
>     ### Write one line per exon of the transcript to <gene>_exon.txt
>     open( my $cds, '>', $gene . '_exon.txt' ) or die "open: $!";
>     foreach my $exon ( @{ $transcript->get_all_Exons() } ) {
>         print {$cds} feature2string($exon) . "\n";
>     }
>     close($cds);
> }
>
>
>
>
> On Thu, Feb 7, 2013 at 6:13 PM, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:
>
> Hi Abhishek,
>
> We need you to provide more specifics in order to determine what
> the difficulty is.
>
> If you have Ensembl Exon objects, their coding_region_start() method
> returns undef when the Exon is non-coding.
>
> We can be of more assistance if you can tell us more or provide
> code samples. There are several ways to approach the task and we
> wouldn't want to recommend the most difficult for you!
>
> Regards,
>
> --
> Kieron Taylor PhD.
> Ensembl Core team
> EBI
>
>
>
> On 30/01/2013 15:15, Abhishek Niroula wrote:
>
> Hello,
>
> I am trying to get genomic co-ordinates for the translatable
> portion of some human cDNA sequences. I could successfully extract
> the transcript start and end co-ordinates, and also the co-ordinates
> of each exon in a transcript. However, not all exons in a transcript
> are necessarily translatable, and I am stuck at this point.
> My goal is to check whether a given genomic co-ordinate on a
> chromosome lies in a protein-coding (translatable) region.
>
> Thanks in advance for your help.
>
> --
> Best Regards,
> Abhishek Niroula
>
>
> _______________________________________________
> Dev mailing list Dev at ensembl.org
> Posting guidelines and subscribe/unsubscribe info:
> http://lists.ensembl.org/mailman/listinfo/dev
> Ensembl Blog: http://www.ensembl.info/
>
> --
> Best Regards,
> Abhishek Niroula
>
>