[ensembl-dev] Mapping between Ensembl transcripts and Refseq, specifically for ENSG00000167131
Magali
mr6 at ebi.ac.uk
Mon Mar 4 13:50:40 GMT 2013
Hi Tony,
As you rightly noticed, despite mapping to the same CCDS, i.e. sharing the
same coding sequence, these two transcripts do not have the same overall
sequence.
When mapping Ensembl transcripts to RefSeq entries, the two sequences are
aligned against each other.
If the alignment scores above a certain threshold, the RefSeq entry is added
as an external reference for that transcript.
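
To illustrate why two transcripts with the same coding sequence can end up
on different sides of such a threshold, here is a toy sketch. It is not the
Ensembl pipeline: the cutoff value, the function names and the sequences are
all made up for illustration, and Python's difflib similarity ratio stands in
for a real alignment score.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff, chosen only for illustration; not Ensembl's actual value.
MIN_IDENTITY = 0.90


def identity(seq_a: str, seq_b: str) -> float:
    """Rough similarity in [0, 1]; a stand-in for a real alignment score."""
    return SequenceMatcher(None, seq_a, seq_b, autojunk=False).ratio()


def passes_threshold(transcript_seq: str, refseq_seq: str,
                     cutoff: float = MIN_IDENTITY) -> bool:
    """True if the two full-length sequences are similar enough to link."""
    return identity(transcript_seq, refseq_seq) >= cutoff


# Made-up sequences: same coding sequence, but one has a long extra UTR.
cds = "ATGGCCAAGTAA" * 5
long_transcript = "C" * 40 + cds   # coding sequence plus extra non-coding bases
short_transcript = cds             # coding sequence only

print(passes_threshold(short_transcript, cds))   # identical sequences
print(passes_threshold(long_transcript, cds))    # same CDS, different overall sequence
```

The point of the sketch is only that the comparison is made on the whole
transcript, so differences outside the coding region can pull the score below
the cutoff.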
As of release 71, we are also mapping RefSeq entries based on
coordinate overlap.
This means that for this particular case, NM_001258395.1 is also mapped
to ENST00000410006.
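
As a rough idea of what a coordinate-based comparison involves, the sketch
below sums the bases shared between two sets of exons. This is an assumption
about the general approach, not the actual release 71 code: the function
names and the coordinates are hypothetical, and it assumes both models are on
the same chromosome and strand with no overlapping exons within a model.

```python
from typing import List, Tuple

# 1-based inclusive (start, end) genomic coordinates of one exon.
Exon = Tuple[int, int]


def exon_length(exons: List[Exon]) -> int:
    """Total number of exonic bases in a transcript model."""
    return sum(end - start + 1 for start, end in exons)


def overlap_bp(a: List[Exon], b: List[Exon]) -> int:
    """Number of exonic bases shared between two transcript models."""
    total = 0
    for a_start, a_end in a:
        for b_start, b_end in b:
            # Length of the intersection of the two intervals, or 0 if disjoint.
            total += max(0, min(a_end, b_end) - max(a_start, b_start) + 1)
    return total


def overlap_fraction(a: List[Exon], b: List[Exon]) -> float:
    """Shared bases as a fraction of the shorter of the two models."""
    shorter = min(exon_length(a), exon_length(b))
    return overlap_bp(a, b) / shorter if shorter else 0.0


# Made-up coordinates, not the real exons of ENST00000410006 or NM_001258395.1.
ensembl_exons = [(100, 200), (300, 400)]
refseq_exons = [(150, 200), (300, 350)]

print(overlap_bp(ensembl_exons, refseq_exons))
print(overlap_fraction(ensembl_exons, refseq_exons))
```

A mapping based on coordinates like this does not depend on how similar the
full-length sequences are, which is why it can link entries that the
alignment step misses.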
Hope that helps,
Magali
On 04/03/13 13:23, Tony Håndstad wrote:
> Dear Ensembl developers.
>
> I am wondering if someone could explain how the mapping between Ensembl
> transcripts and Refseqs is done in general, and specifically, for a
> particular gene:
>
> Ensembl Gene ENSG00000167131 has five transcripts, two of these are
> ENST00000417826 and ENST00000410006 which both map to the same CCDS
> (CCDS11490).
> When selecting the longest ENST00000417826 and looking at General
> Identifiers under External References, I observe that this maps to five
> Refseqs (NM_001258395.1, NM_001258396.1, NM_001258398.1, NM_001258399.1,
> NM_213607.2). All of these are very similar, but with slightly different
> exons.
> However, for the other ENST00000410006, there are NO Refseqs that come up.
>
> Is the reason that the shorter transcript does not overlap sufficiently
> with the Refseq sequences in non-coding regions? Or could this be a bug,
> or is my understanding perhaps too incomplete?
>
>
>
> Sincerely,
> Tony Håndstad
> Oslo University Hospital
>