[ensembl-dev] Why are ncRNA separate from cDNA?

Charles Joseph Murphy chm2059 at med.cornell.edu
Mon May 8 13:44:34 BST 2017


Thanks, Daniel! Just curious.

On May 8, 2017, at 04:33, Daniel Murphy <dmurphy at ebi.ac.uk> wrote:


Hi Charlie,

The cDNA file contains sequences from all transcripts resulting from known, novel and pseudo gene predictions, whereas the ncRNA file contains the transcript sequences corresponding to non-coding RNA genes. Although the two sets of transcripts are produced by different pipelines, providing them as separate files also makes sense because some users are interested in only one set, for example the non-coding transcripts and not the coding ones.

Daniel
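
For a tool that wants a single transcript set, the two files can be merged after download. The following is only a minimal sketch, not how pyensembl or Ensembl does it: the function names `parse_fasta` and `load_transcripts` are hypothetical, the record ID is assumed to be the first whitespace-delimited token of each FASTA header, and the two files are assumed not to share IDs (the sketch raises an error if they do, since that assumption is unverified here).

```python
import gzip
from typing import Dict, Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from FASTA-formatted lines."""
    record_id = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if record_id is not None:
                yield record_id, "".join(chunks)
            # Assumption: the ID is the first token after ">".
            fields = line[1:].split()
            record_id = fields[0] if fields else ""
            chunks = []
        elif record_id is not None:
            chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


def load_transcripts(paths: Iterable[str]) -> Dict[str, str]:
    """Merge several gzipped FASTA files into one {id: sequence} dict."""
    merged: Dict[str, str] = {}
    for path in paths:
        with gzip.open(path, "rt") as handle:
            for record_id, sequence in parse_fasta(handle):
                # Fail loudly rather than silently overwrite a sequence.
                if record_id in merged:
                    raise ValueError(f"duplicate record ID: {record_id}")
                merged[record_id] = sequence
    return merged
```

With local copies of the two release-88 files, a call such as `load_transcripts(["Homo_sapiens.GRCh38.cdna.all.fa.gz", "Homo_sapiens.GRCh38.ncrna.fa.gz"])` would return one dictionary covering both sets, while keeping the files separate on disk still lets a user load only one of them.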

On 05/05/2017 18:11, Charles Joseph Murphy wrote:
To be more specific, why are these two files separate? Why not just have one FASTA file? I am asking this question because I am working with another individual on a Python package for downloading/managing Ensembl data (https://github.com/hammerlab/pyensembl).

ftp://ftp.ensembl.org/pub/release-88/fasta/homo_sapiens/cdna//Homo_sapiens.GRCh38.cdna.all.fa.gz

ftp://ftp.ensembl.org/pub/release-88/fasta/homo_sapiens/ncrna//Homo_sapiens.GRCh38.ncrna.fa.gz




On May 5, 2017, at 13:06, Charles Joseph Murphy <chm2059 at med.cornell.edu> wrote:

Hi,

Just out of curiosity, why are the cDNA and ncRNA sequences in separate FASTA files? Is this due to each set of transcripts being produced via different computational pipelines?

Charlie
_______________________________________________
Dev mailing list    Dev at ensembl.org
Posting guidelines and subscribe/unsubscribe info: http://lists.ensembl.org/mailman/listinfo/dev
Ensembl Blog: http://www.ensembl.info/




