[ensembl-dev] missing transcripts

mag mr6 at ebi.ac.uk
Fri Feb 28 14:51:46 GMT 2014


Hi Daniel,

There are two separate RefSeq tracks, representing two separate sets of 
data.

The 'RefSeq/ENA' track represents the set of cDNA sequences which were 
used to build the latest gene set for a given species.
For mouse, the last full build was for release 68.
This means the track only contains sequences which were available at 
the time of that build, around February 2012.

As a full genebuild is very time-consuming, we cannot re-annotate every 
single genome in its entirety every release.
To provide a more recent set, with every release we import the full 
RefSeq set dumped by NCBI, currently only for human and mouse.
That data is available in the 'RefSeq import' track.
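
To check whether a particular accession made it into a given dump, one 
option is to scan the GFF file for it. The following is only a minimal 
sketch, not how the import itself works: it assumes a tab-separated GFF 
with the accession somewhere in the ninth (attributes) column, and the 
function name, the sample line and the NR_000001 accession are made up 
for illustration.

```python
import re


def accessions_in_gff(lines, wanted):
    """Return the accessions from `wanted` found in the GFF attributes column.

    `lines` is any iterable of GFF lines (an open file works).
    Hypothetical helper for illustration only.
    """
    # Word boundaries stop NR_110375 from matching inside NR_1103751,
    # while still matching versioned forms such as NR_110375.1.
    patterns = {acc: re.compile(r"\b" + re.escape(acc) + r"\b") for acc in wanted}
    found = set()
    for line in lines:
        if line.startswith("#"):
            continue  # header or comment line
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete feature line
        for acc, pattern in patterns.items():
            if pattern.search(cols[8]):
                found.add(acc)
    return found


# Synthetic example data, not taken from a real dump.
sample = [
    "##gff-version 3\n",
    "chr_demo\tRefSeq\ttranscript\t100\t900\t.\t+\t.\tID=rna0;Name=NR_000001.1\n",
]
print(sorted(accessions_in_gff(sample, {"NR_000001", "NR_110375"})))
# prints ['NR_000001']
```

For a real file you would pass an open file handle instead of the 
sample list; an accession missing from the result is simply absent 
from that dump.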


Hope that makes sense,
Magali

On 28/02/2014 14:48, Daniel Barrell wrote:
> Hi Daniel,
>
> e!75 currently contains RefSeq data from a dump that the NCBI did for 
> us on November 14th 2013. I've just looked at the human GFF files and 
> they do not include NR_110375.
>
> All the best
>
> Daniel
>
> On 28/02/14 14:39, Caffrey, Daniel wrote:
>> Hi Magali,
>>
>> If Ensembl 75 is using refseq sequences from December this probably 
>> explains the missing RefSeqs. However, the GenBank/ENA 
>> accessions (JX682706, JX682707, and JX682708) were released in Aug 
>> 2013. As the track is named RefSeq/ENA I would expect them to be 
>> included (assuming version 75 is using ENA from December '13).
>>
>> Thanks,
>>
>> Daniel
>>
>>
>>
>>
>> On Feb 28, 2014, at 6:44 AM, mag <mr6 at ebi.ac.uk 
>> <mailto:mr6 at ebi.ac.uk>> wrote:
>>
>>> Hi Daniel,
>>>
>>> The two RefSeq entries you are referring to were last updated in 
>>> early February 2014 and mention a paper from late January 2014.
>>> Could it be that these entries have only recently been created, 
>>> hence would not have made it into our database at the time of update?
>>>
>>> The RefSeq track displayed in Ensembl 75 was updated at the end of 
>>> last year, so any more recent RefSeq entries would not be included here.
>>>
>>>
>>> Regards,
>>> Magali
>>>
>>> On 27/02/2014 19:15, Caffrey, Daniel wrote:
>>>> Hi,
>>>>
>>>> Does anyone know why the following RefSeq sequences/alignments do 
>>>> not appear in version 75 (or other versions) of the cDNAs 
>>>> (RefSeq/ENA) track on the location tabs?
>>>>
>>>> REFSEQ: NR_110420
>>>> mouse 1:150159043-150164948
>>>>
>>>>
>>>> REFSEQ: NR_110375
>>>> human: 12:125509989-125511939
>>>>
>>>> There appear to be similar issues with other transcripts - They are 
>>>> all lincRNAs and perhaps this is the reason?
>>>>
>>>> Related to this, I know there are at least 3 splice 
>>>> variants/transcripts for the mouse gene in GenBank (JX682706, 
>>>> JX682707, and JX682708) but Ensembl/GENCODE only lists a 
>>>> single transcript ENSMUST00000181308. Presumably cDNA data is 
>>>> sufficient evidence for inclusion within GENCODE?
>>>>
>>>> Thanks
>>>>
>>>> Daniel
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Dev mailing list    Dev at ensembl.org
>>>> Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
>>>> Ensembl Blog: http://www.ensembl.info/
