I have created a new script, DrugClean, which prepares a drug target output file for NMPDR. The script is invoked as follows:
DrugClean -macFile fileName1 fileName2 ... fileNameN
The -macFile switch is needed only if all of the input files are in Macintosh format.
The script removes duplicate entries and entries for PEGs that are not in the current version of the Sprout database. It also converts the file to Unix format. I have changed targets.cgi to expect Unix files, so whenever we get new drug target files, this script must be run on them before targets.cgi can read them.
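For anyone curious about what the cleanup amounts to, here is a rough sketch of the three steps in Python. It is not the DrugClean source. The function name, the tab-delimited layout, the PEG ID sitting in the first column, and the known_pegs set (standing in for a lookup against Sprout) are all assumptions made for illustration.

```python
def clean_drug_targets(text, known_pegs, mac_format=False, peg_column=0):
    """Return Unix-format text with duplicate lines and unknown PEGs removed.

    Hypothetical sketch: assumes tab-delimited lines with the PEG ID in
    column `peg_column`, and `known_pegs` as a set of IDs currently in Sprout.
    """
    # Convert DOS line endings; treat a bare CR as a line break only
    # when the caller says the file is in Macintosh format.
    text = text.replace("\r\n", "\n")
    if mac_format:
        text = text.replace("\r", "\n")

    seen = set()
    kept = []
    for line in text.split("\n"):
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) <= peg_column:
            continue  # malformed line with no PEG column
        if fields[peg_column] not in known_pegs:
            continue  # PEG is not in the current database
        if line in seen:
            continue  # exact duplicate of an earlier entry
        seen.add(line)
        kept.append(line)

    # Every kept line ends with a Unix newline.
    return "".join(entry + "\n" for entry in kept)
```

Called with mac_format=True on a CR-delimited file's contents, this keeps the first copy of each entry whose PEG is in known_pegs and writes the result with LF line endings. Whether the real script compares whole lines or only PEG IDs when looking for duplicates is not stated here, so whole-line comparison is a guess.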
I also fixed a performance problem with the organism files, and they now load in around 10 seconds instead of 50.