[ensembl-dev] VEP on 37, but Gencode 25?

Will McLaren wm2 at ebi.ac.uk
Tue Sep 27 15:58:53 BST 2016


You can try running it with --verbose; it will give you some error logging.

Will

On 27 September 2016 at 15:56, Konrad Karczewski <konradk at broadinstitute.org> wrote:

> Ok, good to know - I actually tried it, but I think something odd is going on.
> It gets through the whole thing (going back and forth between chromosomes
> like you said, so I can try to fix that), but then appears to finish:
>
> 2016-09-26 16:12:30 - Processing chromosome Y
> WARNING: Could not find chromosome named M in FASTA file
> 2016-09-26 16:12:52 - All done!
>
> But the output directory (either ~/.vep or the directory I pointed to with
> --dir) is empty. Is this a related issue? Thought you might want to know
> to add a bit of error logging if so.
>
> -Konrad
>
> On September 27, 2016 at 8:30:15 AM, Will McLaren (wm2 at ebi.ac.uk) wrote:
>
> In theory this should work, but the gtf2vep.pl script doesn't seem to
> work too well with this particular GFF (it was designed really to work with
> GFF/GTFs as produced by Ensembl or NCBI). Probably with some tweaks it
> could be made to work - I believe the major issues are caused by features
> being out of the order that the script expects.
>
> The new code uses a much more robust system for constructing transcripts
> and has been tested with GFFs from Ensembl, NCBI and GENCODE.
>
> Will
>
> On 27 September 2016 at 13:22, Konrad Karczewski <konradk at broadinstitute.org> wrote:
>
>> I just also realized - would creating a cache from this gff file (using
>> gtf2vep.pl) not be recommended?
>>
>> -Konrad
>>
>> On September 27, 2016 at 5:16:42 AM, Will McLaren (wm2 at ebi.ac.uk) wrote:
>>
>> Hi Konrad,
>>
>> The beta ensembl-vep code [1] supports annotation directly from a GFF
>> file, such as the one available from the GENCODE website [2].
>>
>> $ curl ftp://ftp.sanger.ac.uk/pub/gencode/Gencode_human/release_25/GRCh37_mapping/gencode.v25lift37.annotation.gff3.gz |
>>     gzip -dc | grep -v "#" | sort -k1,1 -k4,4n -k5,5n |
>>     bgzip -c > gencode.v25lift37.annotation.gff3.gz
>> $ tabix -p gff gencode.v25lift37.annotation.gff3.gz
>> $ perl vep.pl -i variants.vcf -gff gencode.v25lift37.annotation.gff3.gz -fasta homo_sapiens.fa
>>
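As a rough illustration of what the sort step in the commands above is for (tabix needs the file ordered by sequence name and then position), here is a minimal sketch on a toy file. The file names `toy.gff3` and `toy.sorted.gff3` and the three feature lines are made up for the example; only the `grep`/`sort` options are taken from the commands above.

```shell
# Hypothetical toy GFF3 (tab-separated) with features deliberately out of order.
printf '##gff-version 3\n' > toy.gff3
printf 'chr2\tHAVANA\tgene\t500\t900\t.\t+\t.\tID=g3\n' >> toy.gff3
printf 'chr1\tHAVANA\tgene\t2000\t2500\t.\t-\t.\tID=g2\n' >> toy.gff3
printf 'chr1\tHAVANA\tgene\t100\t400\t.\t+\t.\tID=g1\n' >> toy.gff3

# Same filter and sort keys as in the pipeline above: drop lines containing "#",
# then order by sequence name (column 1), start (column 4) and end (column 5).
grep -v "#" toy.gff3 | sort -k1,1 -k4,4n -k5,5n > toy.sorted.gff3

# sort -c exits non-zero if the input is not already in the requested order,
# so it can serve as a quick check before bgzip/tabix.
if sort -c -k1,1 -k4,4n -k5,5n toy.sorted.gff3; then
    echo "sorted"
else
    echo "not sorted"
fi
```

Note that `grep -v "#"` drops every line containing a `#`, not only header lines; if feature attributes in a given file could contain that character, anchoring the pattern as `"^#"` would be the more conservative choice.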
>> This comes with limitations as the GFF file contains only the transcript
>> structure and not any of the additional annotations. However I do know of
>> someone successfully using LOFTEE with this exact setup.
>>
>> Of course usual beta caveats apply, so if you do use it and find bugs
>> please report on the GitHub page.
>>
>> Regards
>>
>> Will McLaren
>> Ensembl Variation
>>
>> [1] : https://github.com/willmclaren/ensembl-vep
>> [2] : http://www.gencodegenes.org/releases/25lift37.html
>>
>> On 26 September 2016 at 20:40, Konrad Karczewski <konradk at broadinstitute.org> wrote:
>>
>>> Hi all,
>>>
>>> When running VEP 85 on GRCh37, I believe the process has been to
>>> annotate against Gencode 19 (the info.txt seems to confirm this). Realizing
>>> the ridiculousness of my request, is there any chance there is a cache
>>> floating around for Gencode 25lift37? Would go a long way for ExAC
>>> releases.
>>>
>>> Thanks!
>>> -Konrad
>>>
>>> _______________________________________________
>>> Dev mailing list    Dev at ensembl.org
>>> Posting guidelines and subscribe/unsubscribe info:
>>> http://lists.ensembl.org/mailman/listinfo/dev
>>> Ensembl Blog: http://www.ensembl.info/
>>>
>>>