[ensembl-dev] Gene symbols mapping to multiple Ensembl gene IDs

mag mr6 at ebi.ac.uk
Mon Apr 27 16:26:58 BST 2015


Hi Michael,

The 'is_reference' flag for genes is not always set, as it requires 
manual annotation that is not available for all genes.
Furthermore, the reference gene could be on a haplotype if the sequence 
in the primary assembly is not the most representative one.

Once you have all the genes matching a given HGNC symbol though, you can 
check whether the slice they belong to is part of the primary assembly.
The following should allow you to do this:
if ($gene->slice->is_reference) {
    print "This gene belongs to the reference assembly\n";
}
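
For context, here is a rough, untested sketch of how that check might fit into a complete script. It assumes the public Ensembl MySQL server (ensembldb.ensembl.org, anonymous user) and an installed Ensembl core API. The helper names genes_on_reference and report_symbol are made up for illustration and are not part of the API.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: keep only the genes whose slice is flagged as
# reference, i.e. genes whose sequence region is part of the primary assembly.
sub genes_on_reference {
    my @genes = @_;
    return grep { $_->slice->is_reference } @genes;
}

# Hypothetical helper: fetch every gene for an HGNC symbol and print the
# ones that sit on the primary assembly.
sub report_symbol {
    my ($symbol) = @_;

    # Loaded at run time so the filter above works without the Ensembl API.
    require Bio::EnsEMBL::Registry;
    my $registry = 'Bio::EnsEMBL::Registry';

    # Assumption: public Ensembl database server with anonymous access.
    $registry->load_registry_from_db(
        -host => 'ensembldb.ensembl.org',
        -user => 'anonymous',
    );

    my $gene_adaptor = $registry->get_adaptor('Human', 'Core', 'Gene');

    # Returns an array reference of genes linked to the symbol in HGNC.
    my $genes = $gene_adaptor->fetch_all_by_external_name($symbol, 'HGNC');

    for my $gene (genes_on_reference(@$genes)) {
        printf "%s (%s) is on the primary assembly, seq region %s\n",
            $gene->stable_id, $symbol, $gene->seq_region_name;
    }
}

# Run only when a symbol is given on the command line, e.g. APOM.
report_symbol($ARGV[0]) if @ARGV;
```

Saved under any name you like, it would be run with the gene symbol as the only argument, for example "perl check_reference.pl APOM". The filter is kept separate from the database lookup so that it can be reused on any list of gene objects you have already fetched.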


Hope that helps,
Magali

On 27/04/2015 15:55, Michael Maguire wrote:
> Hi Ensembl
> A number of HGNC gene symbols map to multiple Ensembl gene IDs. For 
> example, gene symbol "APOM" maps to these Ensembl gene IDs:
> ENSG00000224290 ENSG00000235754 ENSG00000204444 ENSG00000231974 
> ENSG00000206409 ENSG00000227567 ENSG00000226215
>
> I would like to know which of these relates to the reference assembly. 
> I have used the Perl API to look at the APOM IDs listed above. Judging 
> by the "seq_region_name" value in the gene info hash returned by the 
> gene method "summary_as_hash()", the one I need appears to be 
> "ENSG00000204444". I have tried the gene "is_reference()" method but 
> it returns 0 and a warning for all these IDs.
>
> Is there a method to return this information?
>
> Thank you
>




