[ensembl-dev] Urgent information to extract the refseq_mrna and HGNC symbol for given gene from ENSEMBL

Manam, Monica (NIH/NCI) [F] monica.manam at nih.gov
Wed Jun 26 16:10:13 BST 2013


Hello Magali!

I have tried the code you gave me! It worked for the HGNC symbol.
But for RefSeq, I am not getting any value!

The gene is MSGN1 (HGNC symbol).

My code:
-----------------------------------------------------------------------------------------------------------------------
#!/opt/nasapps/applibs/perl-5.16.2/bin/perl
use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';

$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org', # alternatively 'useastdb.ensembl.org'
    -user => 'anonymous'
);

sub feature2string
{
    my $feature = shift;

    my $stable_id  = $feature->stable_id();
    my $seq_region = $feature->slice->seq_region_name();
    my $start      = $feature->start();
    my $end        = $feature->end();
    my $strand     = $feature->strand();

    return sprintf( "%s: %s:%d-%d (%+d)",
        $stable_id, $seq_region, $start, $end, $strand );
}

my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );
my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '2',17997763,17998368);

my $genes = $slice->get_all_Genes();
while ( my $gene = shift @{$genes} ) {
    my $gstring = feature2string($gene);
    print "$gstring\n";


    my $transcripts = $gene->get_all_Transcripts();
    while ( my $transcript = shift @{$transcripts} ) {
        my $tstring = feature2string($transcript);
        print "\t$tstring\n";


        foreach my $exon ( @{ $transcript->get_all_Exons() } ) {
            my $estring = feature2string($exon);
            print "\t\t$estring\n";
        }

        # Xrefs are looked up on the gene object, once per transcript.
        print "printing refseq\n";
        my $refseq_xrefs = $gene->get_all_DBLinks('Refseq_mRNA');
        foreach my $xref (@$refseq_xrefs) {
            print $xref->display_id . "\n";
        }

        print "printing HGNC\n";
        my $hgnc_xrefs = $gene->get_all_DBLinks('HGNC');
        foreach my $xref (@$hgnc_xrefs) {
            print $xref->display_id . "\n";
        }
    }
}
-------------------------------------------------------------------------------------------

OUTPUT :

ENSG00000151379: 2:1-606 (+1)
ENST00000281047: 2:1-606 (+1)
ENSE00000999265: 2:1-606 (+1)
printing refseq    : (GOT NULL )
printing HGNC
MSGN1

-------------------------------------------------------------------------------------------

Kindly let me know if you know why!
Thanks a lot,

--
Monica Manam
CCRIFX BioInformatics CORE
NCI-NIH


From: Magali <mr6 at ebi.ac.uk>
Date: Wed, 26 Jun 2013 10:58:17 +0100
To: Ensembl developers list <dev at ensembl.org>
Cc: "Mac User (S072990)" <monica.manam at nih.gov>, "Edwards, Yvonne (NIH/NCI) [C]" <yvonne.edwards at nih.gov>
Subject: Re: [ensembl-dev] Urgent information to extract the refseq_mrna and HGNC symbol for given gene from ENSEMBL

Hi Monica,

In your script, you are trying to call two methods which do not exist, refseq_mrna and hgnc_symbol.
Unless these have been defined later in your script, we do not have a direct method in ensembl to do this.

HGNC symbols and RefSeq mRNAs are attached to a gene object as external references.
Via the API, these can be accessed using the get_all_DBLinks method.

For example, to get the HGNC symbol of a gene object, you could do the following:
my $hgnc_xrefs = $gene->get_all_DBLinks('HGNC');
foreach my $xref (@$hgnc_xrefs) {
    print $xref->display_id . "\n";
}

The same would work for RefSeq mRNAs:
my $refseq_xrefs = $gene->get_all_DBLinks('Refseq_mRNA');
foreach my $xref (@$refseq_xrefs) {
    print $xref->display_id . "\n";
}
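
If a lookup by name comes back empty, one way to check which external database names are actually attached to a gene is to group every xref by its dbname. This is only a minimal sketch: the helper name xrefs_by_dbname is hypothetical, and it assumes each xref responds to dbname and display_id, as Bio::EnsEMBL::DBEntry objects do.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Ensembl API): group xref display IDs
# by the name of the external database they come from.
sub xrefs_by_dbname {
    my ($xrefs) = @_;
    my %by_db;
    foreach my $xref ( @{$xrefs} ) {
        push @{ $by_db{ $xref->dbname } }, $xref->display_id;
    }
    return \%by_db;
}

# Intended use inside the gene loop (assumes $gene is a gene object):
#   my $by_db = xrefs_by_dbname( $gene->get_all_DBLinks() );
#   foreach my $db ( sort keys %{$by_db} ) {
#       print "$db: @{ $by_db->{$db} }\n";
#   }
```

Calling get_all_DBLinks without an argument returns all xrefs, so the printed keys show the exact database names available for that gene.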

On a side note, in your script you set the logic_name 'refseq_human_import'.
If you want to restrict your study to that specific logic_name, you can specify it when fetching genes:
my $genes = $slice->get_all_Genes($logic_name);
Do bear in mind, though, that the gene models from the refseq_human_import analysis are in the otherfeatures database for human, not in the core one.
Hence, you would need to change your adaptor call:
my $slice_adaptor = $registry->get_adaptor('Human', 'otherfeatures', 'Slice');
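
The two changes above can be put together as a rough sketch. The helper name genes_for_logic_name and its argument order are hypothetical, and the guard against an undefined slice is a defensive assumption rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch the genes of one analysis (logic_name) in a region.
# $registry is expected to behave like Bio::EnsEMBL::Registry; $group selects
# the database, e.g. 'Core' or 'otherfeatures'.
sub genes_for_logic_name {
    my ( $registry, $group, $logic_name, $chr, $start, $end ) = @_;

    my $slice_adaptor = $registry->get_adaptor( 'Human', $group, 'Slice' );
    my $slice =
      $slice_adaptor->fetch_by_region( 'chromosome', $chr, $start, $end );

    # Defensive guard: return an empty list if no slice came back.
    return [] unless defined $slice;

    # Passing the logic_name restricts the genes to that analysis.
    return $slice->get_all_Genes($logic_name);
}

# Intended use, after load_registry_from_db has been called:
#   my $genes = genes_for_logic_name( 'Bio::EnsEMBL::Registry',
#       'otherfeatures', 'refseq_human_import', '2', 17997763, 17998368 );
```

Keeping the database group as a parameter makes it easy to run the same region against both the core and the otherfeatures databases and compare the results.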

Hope that helps.


Regards,
Magali

On 25/06/13 13:12, Manam, Monica (NIH/NCI) [F] wrote:

Hello Dev Team!

The API version used is 71.

I am an intern at the NIH (National Institutes of Health), Bethesda. I am working on a project and need the Ensembl API to extract the RefSeq mRNA and HGNC symbol for a given gene.

I tried the following but got no results; I was getting the exception
"refseq_mrna not known":



sub feature2string
{
    my $feature = shift;

    my $stable_id  = $feature->stable_id();
    my $seq_region = $feature->slice->seq_region_name();
    my $start      = $feature->start();
    my $end        = $feature->end();
    my $strand     = $feature->strand();

    return sprintf( "%s: %s:%d-%d (%+d)",
        $stable_id, $seq_region, $start, $end, $strand );
}

my $logic_name = 'refseq_human_import';

my $slice_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Slice' );
my $slice = $slice_adaptor->fetch_by_region( 'chromosome', '2',17997763,17998368);

my $genes = $slice->get_all_Genes();
while ( my $gene = shift @{$genes} ) {
    my $gstring = feature2string($gene);
    print "$gstring\n";

    my $transcripts = $gene->get_all_Transcripts();
    while ( my $transcript = shift @{$transcripts} ) {
        my $tstring = feature2string($transcript);
        print "\t$tstring\n";

        print $logic_name."\tGene_biotype=".$gene->biotype.
            "\tTranscript_name=".$transcript->stable_id."\tTranscript_biotype=".$transcript->biotype."\n";

        print $logic_name."\tGene_refseq=".$gene->refseq_mrna."\n";

        print $logic_name."\tGene_hgnc=".$gene->hgnc_symbol."\n";


Could you kindly help me with this as soon as possible? This is a time-constrained project.


Thanks,

Monica Manam

NIH- Intern 2013

_______________________________________________
Dev mailing list    Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/