[ensembl-dev] RFAM annotations inconsistent
Sabine Reißer
sabine.reisser at mdc-berlin.de
Mon Jun 8 16:04:34 BST 2020
Dear Ensembl developers,
I've come across some inconsistencies regarding RFAM families of
bacterial sRNA in bacteria.ensembl.
When I search for the sRNA ArcZ in Ensembl, one of the results is
EBT00001618347 (citrobacter_koseri_atcc_baa_895). This transcript has as
annotation method: "Non-coding RNA gene models based on alignment by of
RFAM families to genomic sequences (alignments provided by RFAM)"
I find this transcript also in the ArcZ family on RFAM:
https://rfam.xfam.org/family/RF00081#tabview=tab1
However, the sequence at the Ensembl coordinates is the reverse complement
of the (correct) sequence in RFAM. In this case, the correct sequence is
on the forward strand, while the Ensembl coordinates give the reverse
strand.
I encountered several such cases, and the correct sequence can be on
either strand. EBT00001534313 (citrobacter_rodentium_icc168) is an example
where the correct strand is the reverse one but Ensembl says forward.
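
A strand check of this kind can be sketched in Python as follows. This is
only a minimal sketch under stated assumptions: the function names are
hypothetical, the sequences in the usage lines are toy strings rather than
real ArcZ sequence, and fetching the genomic and RFAM sequences is left out.

```python
# Minimal sketch: decide which strand of a genomic region matches a
# reference sequence. All names here are hypothetical helpers.

# Complement table for DNA/RNA letters; U is treated like T.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence (as DNA)."""
    return seq.translate(_COMPLEMENT)[::-1]


def _normalize(seq: str) -> str:
    """Upper-case the sequence and write RNA 'U' as DNA 'T'."""
    return seq.upper().replace("U", "T")


def matching_strand(genomic_forward: str, reference: str) -> str:
    """Compare the forward-strand genomic sequence with a reference.

    Returns "forward" if they are identical, "reverse" if the reference
    equals the reverse complement of the genomic sequence, and "neither"
    otherwise. Only exact matches are considered (no alignment).
    """
    fwd = _normalize(genomic_forward)
    ref = _normalize(reference)
    if fwd == ref:
        return "forward"
    if reverse_complement(fwd) == ref:
        return "reverse"
    return "neither"


# Toy usage: the reference is the reverse complement of the region.
print(matching_strand("AACG", "CGUU"))
```

An annotated strand that disagrees with the result of such a comparison
would correspond to the inconsistency described above.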
Is it possible that there's simply a sign error on the annotation import
from RFAM? Or am I missing something?
It would be great if you could check this.
With best regards
Sabine
--
Dr. Sabine Reißer
Postdoctoral researcher
Bioinformatics of RNA Structure and Transcriptome Regulation
Berlin Institute for Medical Systems Biology
Max Delbrück Center for Molecular Medicine in the Helmholtz Association
Hannoversche Str. 28, 10115 Berlin, Germany
Tel.: +49 30 9406-3294
sabine.reisser at mdc-berlin.de
www.mdc-berlin.de