[ensembl-dev] How transcriptome fasta files are created
    Julien Wollbrett 
    julien.wollbrett at unil.ch
       
    Thu Sep 27 14:53:45 BST 2018
    
    
  
Hello,
I am trying to understand how Ensembl transcriptome FASTA files are created.
I ran some tests using these two files from release 84:
- ftp://ftp.ensembl.org/pub/release-84/fasta/homo_sapiens/cdna/Homo_sapiens.GRCh38.cdna.all.fa.gz
- ftp://ftp.ensembl.org/pub/release-84/gtf/homo_sapiens/Homo_sapiens.GRCh38.84.gtf.gz
I understand that you filter out some biotypes from the GTF in
order to create the transcriptome, so it is expected that some
transcripts annotated in the GTF file are not present in the
transcriptome FASTA file.
However, I do not understand why some transcripts (15091 distinct
transcript IDs) are present in the transcriptome FASTA file but not in
the GTF file.
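
For reference, the kind of comparison described above can be sketched in Python as follows. This is only a minimal sketch, not necessarily the exact procedure used for the tests. The function names are hypothetical, and it assumes that the transcript ID is the first token of each FASTA header, that the GTF carries it in a transcript_id "..." attribute, and that any ".N" version suffix should be ignored when comparing.

```python
import gzip
import re

# Assumption: GTF attributes contain transcript_id "ENST..." on transcript/exon lines.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def open_text(path):
    """Open a plain or gzip-compressed file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path)


def strip_version(identifier):
    """Drop a trailing '.N' version suffix, if any (assumed ID convention)."""
    return identifier.split(".")[0]


def fasta_transcript_ids(path):
    """Collect transcript IDs from FASTA header lines (first token after '>')."""
    ids = set()
    with open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                ids.add(strip_version(line[1:].split()[0]))
    return ids


def gtf_transcript_ids(path):
    """Collect transcript IDs from the attribute column of a GTF file."""
    ids = set()
    with open_text(path) as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip header/comment lines
            match = TRANSCRIPT_ID.search(line)
            if match:
                ids.add(strip_version(match.group(1)))
    return ids


def fasta_only(fasta_path, gtf_path):
    """Return the transcript IDs found in the FASTA file but not in the GTF."""
    return fasta_transcript_ids(fasta_path) - gtf_transcript_ids(gtf_path)
```

Calling fasta_only() with the local paths of the two downloaded files returns the set of IDs that appear only in the FASTA file; its size is the number of unmatched transcripts.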
Could you please give me some information on how this transcriptome
fasta file is created?
Best regards,
Julien Wollbrett