[ensembl-dev] Homo_sapiens.GRCh38.92.chr.gtf contents compared to fasta files (cdna + ncrna)
Vivek Iyer
vvi at sanger.ac.uk
Wed Sep 19 10:50:44 BST 2018
Hi all,
From the downloadable data at ftp://ftp.ensembl.org/pub/release-92/gtf/ I can see one GTF file for download (I’m using v92 at the moment): Homo_sapiens.GRCh38.92.chr.gtf
Are the transcripts in this file a superset of, a subset of, or identical to the combined transcripts in these two FASTA files under ftp://ftp.ensembl.org/pub/release-92/fasta/homo_sapiens/:
Homo_sapiens.GRCh38.cdna.all.fa
Homo_sapiens.GRCh38.ncrna.fa
Of course, I could resolve the IDs and do a simple comparison :-) I was hoping someone could point me at docs (along with a nudge to RTFM) or supply some motivation for the split. Both types of files are needed at different points of an RNAseq pipeline.
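For reference, the "simple comparison" of IDs could look roughly like the Python sketch below. It is only a sketch under assumptions: that "transcript" feature lines in the GTF carry an unversioned transcript_id attribute, and that each FASTA header starts with a versioned ID (for example ">ENST... .2 cdna ..."), so the version suffix is stripped before comparing. The function names are made up for illustration and are not part of any Ensembl tool.

```python
import gzip
import re

# Assumption: GTF attributes look like: transcript_id "ENST..."; (no version suffix).
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path)


def gtf_transcript_ids(lines):
    """Collect transcript IDs from the 'transcript' feature lines of a GTF."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue  # header / comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def fasta_transcript_ids(lines):
    """Collect IDs from FASTA headers, dropping any '.N' version suffix."""
    ids = set()
    for line in lines:
        if not line.startswith(">"):
            continue
        parts = line[1:].split()
        if parts:
            ids.add(parts[0].split(".")[0])
    return ids


def compare(gtf_ids, fasta_ids):
    """Split the two ID sets into GTF-only, FASTA-only and shared."""
    return {
        "gtf_only": gtf_ids - fasta_ids,
        "fasta_only": fasta_ids - gtf_ids,
        "shared": gtf_ids & fasta_ids,
    }
```

To use it, one would pass open_text("Homo_sapiens.GRCh38.92.chr.gtf") to gtf_transcript_ids, take the union of fasta_transcript_ids over the cdna and ncrna files, and look at the sizes of the three sets returned by compare. Empty "gtf_only" and "fasta_only" sets would mean the two sources are identical at the ID level.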
Thanks,
Vivek