[ensembl-dev] VEP dbNSFP plugin

Michael Yourshaw myourshaw at ucla.edu
Thu Nov 14 21:35:17 GMT 2013


I’m not sure this is the right list for this, but the dbNSFP plugin can cause a problem on non-Windows machines.

The symptom is that if you select the ESP6500_EA_AF column option for this plugin, the Extra column in the VEP output will contain a carriage return (CR) immediately after the ESP6500_EA_AF value, which may be interpreted as a line break.

This is because the dbNSFP data file has Windows-style (CRLF) newlines and ESP6500_EA_AF is the last column in each record, so the CR ends up attached to its value.

An easy fix is to add a cleanup line in the dbNSFP plugin:
    while(<TABIX>) {
        chomp;
        # remove the trailing CR that chomp leaves behind on non-Windows platforms
        s/\r$//;
        my @split = split /\t/;
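
To illustrate the fix outside the plugin, here is a minimal standalone sketch. The sample record and the helper name split_record are made up for the example and are not taken from the dbNSFP file or the plugin; it assumes the default input record separator, under which chomp removes only the LF.

```perl
use strict;
use warnings;

# Hypothetical tab-separated record ending in CRLF, as in a Windows-style file.
my $line = "1\t12345\tA\tG\t0.0123\r\n";

# Hypothetical helper: split one record into fields, tolerating CRLF endings.
sub split_record {
    my ($record) = @_;
    chomp $record;         # removes the trailing LF only (default $/)
    $record =~ s/\r$//;    # removes the CR left over from CRLF
    return split /\t/, $record;
}

my @fields = split_record($line);
print "last field: [$fields[-1]]\n";
```

Without the substitution, the last field would still end in a CR, which is what shows up after the ESP6500_EA_AF value.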

You might also consider having VEP recode any forbidden characters (semicolon, tab, CR, LF, or others?) in values that a plugin adds to the Extra column.
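
One possible shape for such recoding, as a rough sketch only: the helper name escape_extra_value is hypothetical and not part of VEP, and percent-encoding is my assumption about a suitable scheme. The percent sign itself is encoded as well so the recoding stays reversible.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of VEP: percent-encode characters that would
# break the semicolon-delimited layout of the Extra column.
sub escape_extra_value {
    my ($value) = @_;
    # Encode '%', ';', tab, CR and LF as %XX (uppercase hex of the byte value).
    $value =~ s/([%;\t\r\n])/sprintf('%%%02X', ord $1)/ge;
    return $value;
}

print escape_extra_value("0.0123\r"), "\n";    # prints 0.0123%0D
print escape_extra_value("a;b\tc"), "\n";      # prints a%3Bb%09c
```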

ॐ

Michael Yourshaw
UCLA Geffen School of Medicine
Department of Human Genetics, Nelson Lab
695 Charles E Young Drive S
Gonda 5554
Los Angeles CA 90095-8348 USA
myourshaw at ucla.edu
970.691.8299
