[ensembl-dev] Urgent information to extract the refseq_mrna and HGNC symbol for given gene from ENSEMBL

Andy Yates ayates at ebi.ac.uk
Fri Aug 9 17:30:39 BST 2013


Hi Monica,

I'm sorry to tell you that we have had to revert the change Kieron made and this fix will not be released. We have found internal dependencies reliant on the existing functionality. Kieron's original suggestion of iterating through the array is quite safe but you could also try:

my @transcripts = @{$gene->get_all_Transcripts()};
while(my $transcript = shift @transcripts) {
  #do something
}

Taking a copy of the array will prevent you from modifying the object's internal structure.
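
As a rough illustration of why the copy helps, here is a minimal sketch that does not touch the Ensembl API at all. MockGene is a made-up stand-in (an assumption for illustration, not Ensembl code) whose get_all_Transcripts returns a reference to its internal array, which is the behaviour described in this thread. The three helper subs and their names are likewise hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# MockGene is a hypothetical stand-in for an Ensembl Gene object. It is an
# assumption for illustration only: it keeps its transcripts in an array and
# hands out a reference to that same array rather than a copy.
package MockGene;

sub new {
    my ( $class, @names ) = @_;
    return bless { transcripts => [@names] }, $class;
}

# Returns a reference to the internal array, not a copy.
sub get_all_Transcripts {
    my ($self) = @_;
    return $self->{transcripts};
}

package main;

# Pattern 1: shift on the returned reference.
# Every shift removes an element from the gene's own array.
sub count_after_destructive_loop {
    my $gene        = MockGene->new(qw(T1 T2 T3));
    my $transcripts = $gene->get_all_Transcripts();
    while ( my $transcript = shift @{$transcripts} ) {
        # do something with $transcript
    }
    return scalar @{ $gene->get_all_Transcripts() };
}

# Pattern 2: foreach over the returned reference.
# Iterating does not remove anything from the array.
sub count_after_foreach {
    my $gene = MockGene->new(qw(T1 T2 T3));
    foreach my $transcript ( @{ $gene->get_all_Transcripts() } ) {
        # do something with $transcript
    }
    return scalar @{ $gene->get_all_Transcripts() };
}

# Pattern 3: shift on a copy of the array.
# Only the local copy is emptied; the gene's array is left alone.
sub count_after_copy_loop {
    my $gene        = MockGene->new(qw(T1 T2 T3));
    my @transcripts = @{ $gene->get_all_Transcripts() };
    while ( my $transcript = shift @transcripts ) {
        # do something with $transcript
    }
    return scalar @{ $gene->get_all_Transcripts() };
}

printf "shift on reference: %d transcripts left\n", count_after_destructive_loop();
printf "foreach:            %d transcripts left\n", count_after_foreach();
printf "shift on copy:      %d transcripts left\n", count_after_copy_loop();
```

With the mock above, the first pattern leaves the gene with no transcripts, while the other two leave all three in place. That matches Kieron's diagnosis: once the transcripts have been shifted away, later calls on the same gene have nothing left to work with.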

Regards,

Andy

------------
Andrew Yates - Ensembl Core Software Project Leader
European Bioinformatics Institute (EMBL-EBI)
European Molecular Biology Laboratory
Wellcome Trust Genome Campus
Hinxton
Cambridge CB10 1SD
Tel: +44-(0)1223-492538
Fax: +44-(0)1223-494468
http://www.ensembl.org/

On 26 Jun 2013, at 17:47, Kieron Taylor <ktaylor at ebi.ac.uk> wrote:

> Dear Monica,
> 
> You have unwittingly uncovered a weakness in the way $gene->get_all_Transcripts works. Your code was discarding the Transcripts belonging to the Gene, hence it was unable to find the dblinks later on.
> 
> I have committed a small change to the CVS release of ensembl that will surface in e73 and prevent this from happening. If you're not using CVS, then the easiest way to get the right answer is to discard the while loop and replace it with foreach, e.g.
> 
> foreach my $transcript (@{$gene->get_all_Transcripts}) {
>   *Your code here*
> }
> 
> Regards,
> 
> Kieron
> 
> -- 
> Kieron Taylor PhD.
> Ensembl Core team
> EBI
> 
> On 26/06/2013 16:10, Manam, Monica (NIH/NCI) [F] wrote:
>> Hello Magali!
>> 
>> I have tried the code you gave me! It worked for the HGNC symbol.
>> But for RefSeq, I am not getting any value!
>> 
>> The gene is MSGN1 (HGNC symbol).
>> 
>> My code:
>> -----------------------------------------------------------------------------------------------------------------------
>> #!/opt/nasapps/applibs/perl-5.16.2/bin/perl
>> use Bio::EnsEMBL::Registry;
>> 
>> my $registry = 'Bio::EnsEMBL::Registry';
>> 
>> $registry->load_registry_from_db(
>>     -host => 'ensembldb.ensembl.org', # alternatively 'useastdb.ensembl.org'
>>     -user => 'anonymous'
>> );
>> 
>> sub feature2string
>> {
>>     my $feature = shift;
>> 
>>     my $stable_id  = $feature->stable_id();
>>     my $seq_region = $feature->slice->seq_region_name();
>>     my $start      = $feature->start();
>>     my $end        = $feature->end();
>>     my $strand     = $feature->strand();
>> 
>>     return sprintf( "%s: %s:%d-%d (%+d)",
>>         $stable_id, $seq_region, $start, $end, $strand );
>> }
>> 
>> my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );
>> my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '2',17997763,17998368);
>> 
>> my $genes = $slice->get_all_Genes();
>> while ( my $gene = shift @{$genes} ) {
>>     my $gstring = feature2string($gene);
>>     print "$gstring\n";
>> 
>> 
>>     my $transcripts = $gene->get_all_Transcripts();
>>     while ( my $transcript = shift @{$transcripts} ) {
>>         my $tstring = feature2string($transcript);
>>         print "\t$tstring\n";
>> 
>> 
>>         foreach my $exon ( @{ $transcript->get_all_Exons() } ) {
>>             my $estring = feature2string($exon);
>>             print "\t\t$estring\n";
>>         }
>> 
>>         print "printing refseq"."\n";
>>         my $refseq_xrefs = $gene->get_all_DBLinks('Refseq_mRNA');
>>         foreach my $xref (@$refseq_xrefs) {
>>             print $xref->display_id . "\n";
>>         }
>> 
>>         print "printing HGNC"."\n";
>>         my $hgnc_xrefs = $gene->get_all_DBLinks('HGNC');
>>         foreach my $xref (@$hgnc_xrefs) {
>>             print $xref->display_id . "\n";
>>         }
>>     }
>> }
>> -------------------------------------------------------------------------------------------
>> 
>> OUTPUT :
>> 
>> ENSG00000151379: 2:1-606 (+1)
>> ENST00000281047: 2:1-606 (+1)
>> ENSE00000999265: 2:1-606 (+1)
>> printing refseq    : (GOT NULL )
>> printing HGNC
>> MSGN1
>> 
>> -------------------------------------------------------------------------------------------
>> 
>> Kindly let me know if you know why!
>> Thanks a lot,
>> 
>> --
>> Monica Manam
>> CCRIFX BioInformatics CORE
>> NCI-NIH
>> 
>> 
>> From: Magali <mr6 at ebi.ac.uk>
>> Date: Wed, 26 Jun 2013 10:58:17 +0100
>> To: Ensembl developers list <dev at ensembl.org>
>> Cc: "Mac User (S072990)" <monica.manam at nih.gov>, "Edwards, Yvonne (NIH/NCI) [C]" <yvonne.edwards at nih.gov>
>> Subject: Re: [ensembl-dev] Urgent information to extract the refseq_mrna and HGNC symbol for given gene from ENSEMBL
>> 
>> Hi Monica,
>> 
>> In your script, you are trying to call two methods which do not exist, refseq_mrna and hgnc_symbol.
>> Unless these have been defined later in your script, we do not have a direct method in ensembl to do this.
>> 
>> HGNC symbols and RefSeq mRNAs are attached to a gene object as external references.
>> Via the API, these can be accessed using the get_all_DBLinks method.
>> 
>> For example, to get the HGNC symbol of a gene object, you could do the following:
>> my $hgnc_xrefs = $gene->get_all_DBLinks('HGNC');
>> foreach my $xref (@$hgnc_xrefs) {
>>     print $xref->display_id . "\n";
>> }
>> 
>> The same would work for RefSeq mRNAs:
>> my $refseq_xrefs = $gene->get_all_DBLinks('Refseq_mRNA');
>> foreach my $xref (@$refseq_xrefs) {
>>     print $xref->display_id . "\n";
>> }
>> 
>> On a side note, in your script you set a logic_name 'refseq_human_import'.
>> If you want to restrict your study to that specific logic_name, you can specify it when fetching genes:
>> my $genes = $slice->get_all_Genes($logic_name);
>> Do bear in mind though that the gene models from the refseq_human_import analysis are in the otherfeatures database for human, not in the core one.
>> Hence, you would need to change your adaptor call:
>> my $slice_adaptor = $registry->get_adaptor('Human', 'otherfeatures', 'Slice');
>> 
>> Hope that helps.
>> 
>> 
>> Regards,
>> Magali
>> 
>> On 25/06/13 13:12, Manam, Monica (NIH/NCI) [F] wrote:
>> 
>> Hello Dev Team !
>> 
>> The API version used is 71.
>> 
>> I am an intern at the NIH (National Institutes of Health), Bethesda. I am working on a project and need the Ensembl API to extract the RefSeq mRNA and HGNC symbol for a given gene.
>> 
>> I tried the following but got no results; instead I got the exception
>> "refseq_mrna not known":
>> 
>> 
>> 
>> sub feature2string
>> {
>>     my $feature = shift;
>> 
>>     my $stable_id  = $feature->stable_id();
>>     my $seq_region = $feature->slice->seq_region_name();
>>     my $start      = $feature->start();
>>     my $end        = $feature->end();
>>     my $strand     = $feature->strand();
>> 
>>     return sprintf( "%s: %s:%d-%d (%+d)",
>>         $stable_id, $seq_region, $start, $end, $strand );
>> }
>> 
>> my $logic_name= 'refseq_human_import';
>> 
>> my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );
>> my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '2',17997763,17998368);
>> 
>> my $genes = $slice->get_all_Genes();
>> while ( my $gene = shift @{$genes} ) {
>>     my $gstring = feature2string($gene);
>>     print "$gstring\n";
>> 
>>     my $transcripts = $gene->get_all_Transcripts();
>>     while ( my $transcript = shift @{$transcripts} ) {
>>         my $tstring = feature2string($transcript);
>>         print "\t$tstring\n";
>> 
>>         print $logic_name."\tGene_biotype=".$gene->biotype.
>>             "\tTranscript_name=".$transcript->stable_id."\tTranscript_biotype=".$transcript->biotype."\n";
>> 
>>         print $logic_name."\tGene_refseq=".$gene->refseq_mrna."\n";
>>         print $logic_name."\tGene_hgnc=".$gene->hgnc_symbol."\n";
>> 
>> 
>> Could you kindly help me with this as soon as possible, as this is a time-constrained project?
>> 
>> 
>> Thanks,
>> 
>> Monica Manam
>> 
>> NIH- Intern 2013
>> 
>> _______________________________________________
>> Dev mailing list    Dev at ensembl.org
>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>> Ensembl Blog: http://www.ensembl.info/




