[ensembl-dev] [Eva-dev] ATXN8 gene missing from Ensembl

Kirill Tsukanov ktsukanov at ebi.ac.uk
Fri May 8 01:11:35 BST 2020


Hi Thomas,

Thank you for the quick reply. I have looked into this case further, and 
it only got more interesting.

A paper <https://www.nature.com/articles/ng1827> in Nature Genetics that led to 
the discovery of this gene states that there are two transcripts 
spanning the (CTG)n repeat in 13q21.33 in opposite directions:

 1. *ATXN8OS* (a.k.a. SCA8 and KLHL1AS), a lncRNA;
 2. *ATXN8*, which encodes a nearly pure polyglutamine expansion protein.

The GenBank record DQ641254 
<https://www.ncbi.nlm.nih.gov/nuccore/DQ641254?report=GenBank> for ATXN8 
has the comment: "The sequence is derived from 3'-RACE analysis of the 
ATXN8 transcript. The 5'-end of ATXN8 mRNA is not yet defined." So the 
gene status seems to reflect that it lacks a /complete/ genomic mapping 
and annotation, not that it is invalid.

In the UCSC genome browser, this partial mRNA sequence is displayed in 
the GENCODE v32 transcript set under the accession AL160391.1:

[Image: ATXN8 region in UCSC genome browser]
(In case mailing lists won't keep the picture, here's a direct URL for 
a copy: https://i.imgur.com/iGVwihX.png)

Now, if we follow up on accession AL160391.1, we will find that it is 
linked to Ensembl gene ENSG00000288330 
<http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000288330;r=13:70137831-70139431;t=ENST00000673087> 
with the same name (AL160391.1) and description "ataxin 8". This appears 
to be the missing ATXN8 gene: it's there, it is just not linked to the 
HGNC ID (HGNC:32925 
<https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/HGNC:32925>) 
and name. Also, the gene type is wrong: it is registered as lncRNA, 
while in reality it is an mRNA. The record is stated to have been 
manually annotated, so this appears to be a human error caused by 
confusion between ATXN8 and ATXN8OS (which really /is/ a lncRNA and is 
correctly annotated as such).
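
For anyone who wants to repeat this check programmatically, here is a 
minimal sketch using the public Ensembl REST API (the lookup/id and 
xrefs/id endpoints). The helper names fetch_json, summarise_gene and 
check_gene are made up for illustration, and what the server returns 
today may differ from what I saw in Ensembl 100, so treat it as a 
starting point rather than a definitive test.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"  # public Ensembl REST server


def fetch_json(path):
    """GET a JSON document from the Ensembl REST API (needs network access)."""
    req = urllib.request.Request(
        ENSEMBL_REST + path, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def summarise_gene(lookup, xrefs):
    """Hypothetical helper: flag the two issues described in this email.

    `lookup` is the dict returned by /lookup/id/<gene>, and `xrefs` is the
    list returned by /xrefs/id/<gene>.
    """
    # Collect any HGNC cross-references attached to the gene.
    hgnc_ids = [x["primary_id"] for x in xrefs if x.get("dbname") == "HGNC"]
    return {
        "id": lookup.get("id"),
        "name": lookup.get("display_name"),
        "biotype": lookup.get("biotype"),
        "description": lookup.get("description"),
        "hgnc_ids": hgnc_ids,
        "missing_hgnc": not hgnc_ids,
        "annotated_as_lncrna": lookup.get("biotype") == "lncRNA",
    }


def check_gene(gene_id):
    """Fetch the gene record and its cross-references, then summarise them."""
    lookup = fetch_json(f"/lookup/id/{gene_id}")
    xrefs = fetch_json(f"/xrefs/id/{gene_id}")
    return summarise_gene(lookup, xrefs)
```

Calling check_gene("ENSG00000288330") should then show whether the gene 
still has the lncRNA biotype and whether an HGNC cross-reference has 
been attached.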

Please let me know what you think about this.

Best,
Kirill

On 06/05/2020 00:09, Thomas Danhorn wrote:
> Hi Kirill,
>
> On the NCBI site for ATXN8 you linked to it says "not in current
> annotation release", so it looks like it may have once been considered a
> valid gene, but not anymore.  I have also looked through a few of the
> older Ensembl releases and none of them have ATXN8 on chromosome 13 (so
> this is not an omission in the new release).  The ones based on the
> GRCh37/hg19 assembly (Ensembl versions 75 and older) have "ATXN8" as a
> synonym of ENSG00000107815, but that is on chromosome 10, so I doubt that
> is what you are looking for.
>
> Hope this helps,
>
> Thomas

On Tue, 5 May 2020, Kirill Tsukanov wrote:
> Hi,
>
> I have a quick question about a data issue. I noticed that Ensembl 100
> includes the ATXN8OS gene
> <http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000230223;r=13:70107213-70149092>  
> (opposite strand lncRNA), but not the ATXN8 gene itself. The latter is
> present in NCBI Gene (https://www.ncbi.nlm.nih.gov/gene/724066), but not in
> Ensembl. This is unfortunate because it means that I can't use it in an Open
> Targets submission as it does not have an Ensembl gene ID associated with it.
>
> Do you know if there's a specific reason why this gene is missing? Can we
> expect it to be added in later Ensembl releases?
>
> -- 
> Best,
> Kirill from the European Variation Archive
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ATXN8.png
Type: image/png
Size: 31290 bytes
Desc: not available
URL: <http://mail.ensembl.org/pipermail/dev_ensembl.org/attachments/20200508/dd8a366b/attachment.png>