Kirill Tsukanov ktsukanov at ebi.ac.uk
Fri May 8 01:11:35 BST 2020

Hi Thomas,

Thank you for the quick reply. I have looked into this case further, and 
it only got more interesting.

A paper <https://www.nature.com/articles/ng1827> in Nature Genetics which 
led to the discovery of this gene states that there are two transcripts 
spanning the (CTG)n repeat in 13q21.33 in opposite directions:

 1. *ATXN8OS* (a.k.a. SCA8 & KLHL1AS), a lncRNA;
 2. *ATXN8*, a coding, nearly pure polyglutamine expansion protein.

The GenBank record DQ641254 
<https://www.ncbi.nlm.nih.gov/nuccore/DQ641254?report=GenBank> for ATXN8 
has the comment: "The sequence is derived from 3'-RACE analysis of the 
ATXN8 transcript. The 5'-end of ATXN8 mRNA is not yet defined." So this 
is what the gene status seems to reflect: that it does not have a 
/complete/ genomic mapping and annotation, not that it is invalid.

In the UCSC genome browser, this partial mRNA sequence is displayed in 
the GENCODE v32 transcript set under the accession AL160391.1:

[Image: ATXN8 region in UCSC genome browser]
(In case the mailing list doesn't keep the picture, here's a direct URL for 
a copy: https://i.imgur.com/iGVwihX.png)

Now, if we follow up on accession AL160391.1, we will find that it is 
linked to Ensembl gene ENSG00000288330 
with the same name (AL160391.1) and description "ataxin 8". This appears 
to be the missing ATXN8 gene: it's there, it is just not linked to the 
HGNC ID (HGNC:32925) and name. Also, the gene type is wrong: it is 
registered as lncRNA, while in reality it is an mRNA. The record is 
stated to have been manually annotated, so this appears to be a human 
error caused by confusion between ATXN8 and ATXN8OS (which really /is/ 
a lncRNA and is correctly annotated as such).
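
For anyone who wants to check the name, biotype and description of these 
two records programmatically, here is a minimal sketch against the public 
Ensembl REST "lookup/id" endpoint. The helper names (lookup_url, summarize, 
fetch_gene) are my own, and what the live server returns may differ from 
what I describe above if the annotation has been updated since.

```python
import json
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"  # public Ensembl REST server


def lookup_url(stable_id):
    """Build the URL of the REST 'lookup/id' endpoint for one stable ID."""
    return f"{ENSEMBL_REST}/lookup/id/{stable_id}?content-type=application/json"


def summarize(record):
    """Pick out the fields relevant to this thread from a lookup record."""
    return {
        "id": record.get("id"),
        "name": record.get("display_name"),
        "biotype": record.get("biotype"),
        "description": record.get("description"),
    }


def fetch_gene(stable_id):
    """Query the live server (needs network access) and return parsed JSON."""
    with urllib.request.urlopen(lookup_url(stable_id), timeout=30) as response:
        return json.load(response)
```

Calling summarize(fetch_gene("ENSG00000288330")) and 
summarize(fetch_gene("ENSG00000230223")) side by side should make it easy 
to compare how the two genes are currently typed and named.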

Please let me know what you think about this.


On 06/05/2020 00:09, Thomas Danhorn wrote:
> Hi Kirill,
> On the NCBI site for ATXN8 you linked to it says "not in current
> annotation release", so it looks like it may have once been considered a
> valid gene, but not anymore.  I have also looked through a few of the
> older Ensembl releases and none of them have ATXN8 on chromosome 13 (so
> this is not an omission in the new release).  The ones based on the
> GRCh37/hg19 assembly (Ensembl versions 75 and older) have "ATXN8" as a
> synonym of ENSG00000107815, but that is on chromosome 10, so I doubt that
> is what you are looking for.
> Hope this helps,
> Thomas

On Tue, 5 May 2020, Kirill Tsukanov wrote:
> Hi,
> I have a quick question about a data issue. I noticed that Ensembl 100
> includes the ATXN8OS gene
> <http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000230223;r=13:70107213-70149092>  
> (opposite strand lncRNA), but not the ATXN8 gene itself. The latter is
> present in NCBI Gene (https://www.ncbi.nlm.nih.gov/gene/724066), but not in
> Ensembl. This is unfortunate because it means that I can't use it in an Open
> Targets submission as it does not have an Ensembl gene ID associated with it.
> Do you know if there's a specific reason why this gene is missing? Can we
> expect it to be added in later Ensembl releases?
> -- 
> Best,
> Kirill from the European Variation Archive
