[ensembl-dev] VEP annotating same variant differently?

Konrad Karczewski konradk at broadinstitute.org
Tue May 26 04:58:32 BST 2015


Hi Will, dev team,

I've found what appears to be a strange issue in VEP (I always seem to find these corner cases). I ran the same VEP command on two files: one with ~1K variants (spanning all 22 chromosomes), and one with a single variant that is also included in the first file. The two runs give different regulatory_region_variant annotations for that shared variant.

The two annotated files are available at http://www.broadinstitute.org/~konradk/vep/

Command line call in both cases (minus input filename):

perl /humgen/atgu1/fs03/konradk/vep/ensembl-tools-release-79/scripts/variant_effect_predictor/variant_effect_predictor.pl --everything --vcf --allele_number --no_stats --cache --offline --dir /humgen/atgu1/fs03/konradk/vep/gold/ --force_overwrite --cache_version 79 --fasta /tmp/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa --assembly GRCh37 --tabix --plugin LoF,human_ancestor_fa:/humgen/atgu1/fs03/konradk/loftee_data//human_ancestor.fa.gz,filter_position:0.05,min_intron_size:15,conservation_file:mysql -i /humgen/atgu1/fs03/konradk/lof/exac_subset.vcf.gz -o /humgen/atgu1/fs03/konradk/lof/exac_subset.vep.vcf.gz

In both cases, it appears to have loaded the regulatory features, but then returns different results.

Variant when run in a larger set:

2015-05-25 22:40:17 - Retrieved 369524 regulatory features (0 mem, 395468 cached, 0 DB, 25944 duplicates)

1	78340517	.	T	C	652.97	PASS	CSQ=C|intron_variant|MODIFIER|FAM73A|ENSG00000180488|Transcript|ENST00000443751|protein_coding||14/14|ENST00000443751.2:c.1570-14T>C|||||||rs540912776|1||1|HGNC|24741||||ENSP00000393675||F8W7S1_HUMAN|UPI000206500B|||||||||||||||||||||||,C|intron_variant|MODIFIER|FAM73A|ENSG00000180488|Transcript|ENST00000370791|protein_coding||15/15|ENST00000370791.3:c.1681-14T>C|||||||rs540912776|1||1|HGNC|24741|YES||CCDS681.1|ENSP00000359827|FA73A_HUMAN|R4GMP2_HUMAN&B7ZLZ8_HUMAN|UPI00000722C6|||||||||||||||||||||||

Variant when run on its own:

2015-05-25 22:49:19 - Retrieved 372 regulatory features (0 mem, 372 cached, 0 DB, 0 duplicates)

1       78340517        .       T       C       652.97  PASS    CSQ=C|intron_variant|MODIFIER|FAM73A|ENSG00000180488|Transcript|ENST00000443751|protein_coding||14/14|ENST00000443751.2:c.1570-14T>C|||||||rs540912776|1||1|HGNC|24741||||ENSP00000393675||F8W7S1_HUMAN|UPI000206500B|||||||||||||||||||||||,C|intron_variant|MODIFIER|FAM73A|ENSG00000180488|Transcript|ENST00000370791|protein_coding||15/15|ENST00000370791.3:c.1681-14T>C|||||||rs540912776|1||1|HGNC|24741|YES||CCDS681.1|ENSP00000359827|FA73A_HUMAN|R4GMP2_HUMAN&B7ZLZ8_HUMAN|UPI00000722C6|||||||||||||||||||||||,C|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000539328|promoter_flanking_region||||||||||rs540912776|1||||||||||||||||||||||||||||||||||
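To make the difference easier to spot, here is a minimal sketch in Python that compares the CSQ entries of two records. It assumes the CSQ column order seen in the records above (Consequence in column 1, Feature_type in column 5, Feature in column 6, counting from 0), which depends on the VEP options used. The function name csq_entries is just illustrative, and the two example records are abbreviated copies of the ones above.

```python
def csq_entries(vcf_line):
    """Return the set of (feature_type, feature_id, consequence) tuples
    found in the CSQ field of a single VCF record."""
    # Split on any whitespace so tab- and space-separated records both work.
    info = vcf_line.split()[7]
    entries = set()
    for item in info.split(";"):
        if not item.startswith("CSQ="):
            continue
        for annotation in item[len("CSQ="):].split(","):
            cols = annotation.split("|")
            # Column positions are an assumption based on the records above.
            entries.add((cols[5], cols[6], cols[1]))
    return entries


# Abbreviated versions of the two records (trailing CSQ columns dropped).
batch_record = (
    "1\t78340517\t.\tT\tC\t652.97\tPASS\t"
    "CSQ=C|intron_variant|MODIFIER|FAM73A|ENSG00000180488|Transcript|ENST00000443751"
)
single_record = (
    "1 78340517 . T C 652.97 PASS "
    "CSQ=C|intron_variant|MODIFIER|FAM73A|ENSG00000180488|Transcript|ENST00000443751,"
    "C|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000539328"
)

# Entries present only in the single-variant run.
only_in_single = csq_entries(single_record) - csq_entries(batch_record)
print(only_in_single)
```

On the full records, the same set difference should leave only the RegulatoryFeature entry for ENSR00000539328.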

-Konrad