[ensembl-dev] missing transcripts

Caffrey, Daniel Daniel.Caffrey at umassmed.edu
Fri Feb 28 15:38:09 GMT 2014


Magali, Daniel,

Thank you for your e-mails.  Quarterly updated cDNA alignments would be sufficient for my purposes if the mouse gene builds are likely to be old (~1.5 years in this case).

However, I still cannot find an ENA track that displays JX682706, JX682707, and JX682708 (from Aug '13).  I only see AK144630 under the mouse cDNAs (Refseq/ENA) track.

http://www.ensembl.org/Mus_musculus/Location/View?r=1%3A150159043-150164948
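
The region in the link above can be written with a literal colon or percent-encoded as %3A; both point at the same location. A minimal Python sketch of building and decoding such a link follows. The helper names are made up for illustration, and the Homo_sapiens path for the human region mentioned in this thread is an assumption based on the mouse link.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://www.ensembl.org"


def location_url(species, chromosome, start, end):
    """Build a Location/View link for a region (hypothetical helper)."""
    region = f"{chromosome}:{start}-{end}"
    # urlencode percent-encodes the colon as %3A
    return f"{BASE}/{species}/Location/View?" + urlencode({"r": region})


def region_from_url(url):
    """Recover the region string from either the encoded or the plain form."""
    return parse_qs(urlsplit(url).query)["r"][0]


mouse_url = location_url("Mus_musculus", "1", 150159043, 150164948)
# Assumption: the human species path follows the same pattern as mouse.
human_url = location_url("Homo_sapiens", "12", 125509989, 125511939)

print(mouse_url)
print(region_from_url(mouse_url))
```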

The Ensembl 'What's New in 75' page suggests that ENA sequences are more recent than Aug '13:

http://www.ensembl.org/Mus_musculus/Info/WhatsNew?db=core#change_1343
Relevant text from What's New:
Mouse: updated cDNA alignments (Mouse)
A new cdna database was created for e75: The latest set of cDNAs for mouse (as of January 2013) from the European Nucleotide Archive and NCBI RefSeq (release 62) were aligned to the current genome using Exonerate.

Daniel


On Feb 28, 2014, at 9:51 AM, mag <mr6 at ebi.ac.uk> wrote:

Hi Daniel,

There are two separate RefSeq tracks, representing two separate sets of data.

The 'RefSeq/ENA' track represents the set of cDNA sequences which were used to build the latest gene set for a given species.
For mouse, the last full build was for release 68.
This means this track would only have sequences which were available at the time, around February 2012.

As a full genebuild is very time consuming, we cannot re-annotate every single genome in its entirety every release.
To provide a more recent set, we import the full RefSeq set dumped by NCBI every release, currently only for human and mouse.
That data is available with the 'RefSeq import' track.


Hope that makes sense,
Magali
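
The reasoning in this thread (a sequence can only appear in a track if it comes from a source that track uses and was public before that track's data cutoff) can be sketched in a few lines of Python. This is an illustration only, not how Ensembl decides track content. The cutoff dates are the approximate ones quoted in the thread, day-of-month values are placeholders where only a month was given, and the early February 2014 update date of the NR_ entries is assumed to stand in for their first public date.

```python
from datetime import date
from typing import NamedTuple


class Track(NamedTuple):
    sources: frozenset  # databases the track draws from
    cutoff: date        # approximate data cutoff quoted in the thread


class Accession(NamedTuple):
    source: str
    public: date        # approximate public date (placeholder day)


TRACKS = {
    # Built for release 68; sequences available around February 2012.
    "cDNAs (RefSeq/ENA)": Track(frozenset({"RefSeq", "ENA"}), date(2012, 2, 1)),
    # e!75 uses an NCBI dump from November 14th 2013.
    "RefSeq import": Track(frozenset({"RefSeq"}), date(2013, 11, 14)),
}

ACCESSIONS = {
    "JX682706": Accession("ENA", date(2013, 8, 1)),
    "JX682707": Accession("ENA", date(2013, 8, 1)),
    "JX682708": Accession("ENA", date(2013, 8, 1)),
    # Assumption: update date used in place of an unknown creation date.
    "NR_110420": Accession("RefSeq", date(2014, 2, 1)),
    "NR_110375": Accession("RefSeq", date(2014, 2, 1)),
}


def could_be_in_track(accession_id, track_name):
    """True if the source matches and the sequence predates the cutoff."""
    acc = ACCESSIONS[accession_id]
    track = TRACKS[track_name]
    return acc.source in track.sources and acc.public <= track.cutoff


results = {
    (acc_id, name): could_be_in_track(acc_id, name)
    for acc_id in ACCESSIONS
    for name in TRACKS
}
for (acc_id, name), possible in results.items():
    print(f"{acc_id:10s} {name:20s} {'possible' if possible else 'not expected'}")
```

Under these assumed dates none of the five accessions would be expected in either track, which matches what is seen in the browser.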

On 28/02/2014 14:48, Daniel Barrell wrote:
Hi Daniel,

e!75 currently contains RefSeq data from a dump that the NCBI did for us on November 14th 2013. I've just looked at the human GFF files and they do not include NR_110375.

All the best

Daniel

On 28/02/14 14:39, Caffrey, Daniel wrote:
Hi Magali,

If Ensembl 75 is using RefSeq sequences from December, this probably explains the missing RefSeqs. However, the GenBank/ENA accessions (JX682706, JX682707, and JX682708) were released in Aug 2013. As the track is named RefSeq/ENA, I would expect them to be included (assuming version 75 is using ENA from December '13).

Thanks,

Daniel




On Feb 28, 2014, at 6:44 AM, mag <mr6 at ebi.ac.uk> wrote:

Hi Daniel,

The two RefSeq entries you are referring to were last updated in early February 2014 and mention a paper from late January 2014.
Could it be that these entries have only recently been created, and hence would not have made it into our database at the time of the update?

The RefSeq track displayed in Ensembl 75 was updated at the end of last year, so any more recent RefSeq entries would not be included here.


Regards,
Magali

On 27/02/2014 19:15, Caffrey, Daniel wrote:
Hi,

Does anyone know why the following RefSeq sequences/alignments do not appear in version 75 (or other versions) of the cDNAs (RefSeq/ENA) track on the location tabs?

REFSEQ: NR_110420
mouse 1:150159043-150164948



REFSEQ: NR_110375
human: 12:125509989-125511939

There appear to be similar issues with other transcripts. They are all lincRNAs; perhaps this is the reason?

Related to this, I know there are at least 3 splice variants/transcripts for the mouse gene in GenBank (JX682706, JX682707, and JX682708), but Ensembl/GENCODE only lists a single transcript, ENSMUST00000181308. Presumably cDNA data is sufficient evidence for inclusion within GENCODE?

Thanks

Daniel





_______________________________________________
Dev mailing list    Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/

