Last Updated: 2 October 2014
WFN provides a mechanism for creating a visual predictive check (VPC). The nmvpc.bat command is a Windows command file that has to be edited to indicate the NM-TRAN control stream which is used to simulate values for use in the VPC. Several other options are also set in the nmvpc.bat file to specify graph types and select observation types and covariates.
The VPC run directory contains the NM-TRAN simulation control stream for simulating the model.
The nmvpc.bat, nmvpc2r.awk, vpc.R and nmvpc_functions.R files must be located in the run directory. These files should be copied from the %WFNHOME%\bin\vpc directory.
WFN must be installed with NONMEM and a Fortran compiler.
The R system must be installed with the path to the R.exe executable file in the file search path. This can be verified by typing ‘rgui’ at the command prompt in a WFN window. The R path may be set in wfn.bat e.g.
The NM-TRAN simulation control stream and the nmvpc.bat file will need to be customized before running nmvpc.bat.
A VPC is constructed from the observed values used for estimation and simulated values predicted from the final estimates and model used for estimation.
1. Use nmctl to put the final estimates from a model into the control stream used to run that model e.g.
2. Copy the control stream containing the final estimates to another directory e.g. a VPC directory inside the estimation model run directory. This allows the same model name to be used for estimation and VPC. Alternatively, you may keep a copy of the control stream in the estimation model run directory and use a different name for the VPC control stream.
copy mymodel.ctl VPC\mymodel.ctl
3. Change to the VPC directory and edit the control stream.
4. The following changes should be made to convert the estimation control stream into one suitable for VPC simulation:
a. Copy the data file to the VPC directory or edit the $DATA record to locate the data file used for estimation e.g. the following change to the datafile path could be used when the data file and VPC directory are in the estimation model run directory. Change $DATA from:
Note the additional ..\ required because WFN creates a run directory below the VPC directory.
b. Remove all $ESTIMATION, $COVARIANCE and $TABLE records
c. Insert the following line immediately after $PK (or $PRED). This creates a copy of all DV data items so that they can be saved in the table file containing the simulated values. Because NONMEM uses the variable name DV for the simulated values, the observed values must be copied under a different name i.e. OBS.
OBS=DV
d. Add the following lines to the end of $ERROR (or $PRED)
REP=IREP
$TABLE REP ID TIME DV PRED OBS
NOAPPEND ONEHEADER NOPRINT FILE=vpc.fit
$SIM (20120402) ONLYSIM NSUB=100
e. The NONMEM internal variable IREP is used to number each simulation replication. It is saved in the table file with the name REP.
f. The $TABLE record defines the NONMEM simulation output variables required to perform a VPC
g. The $SIM record has a random number generator seed which may be changed as required. The NSUB option indicates the number of sub-problems (i.e. replications) to be performed to create the VPC. NSUB=5 is useful for exploring VPC shapes. NSUB=100 is suggested as a minimum for reliable confidence intervals.
5. If there is an MDV data item in the data file then it should be added to the list of variables in the $TABLE record
$TABLE REP ID TIME DV PRED OBS MDV ; add MDV from data file
6. A DVID variable is required in the list of variables in the table file. If there is only one type of observation, the DVID value should be set to 1 in the NM-TRAN code e.g. add this to the end of your $ERROR (or $PRED) NM-TRAN code.
DVID=1
7. If there is more than one type of observation e.g. a PKPD model with both concentration and effect observations then a DVID variable is used to distinguish the type of observation. A numerical value for each observation type should be defined in the data file and the data item name DVID used in $INPUT to distinguish different observation types.
8. If the data file does not contain a DVID data item, the user should create a DVID variable based on another data item e.g. the CMT data item might be used to distinguish parent and metabolite observations:
DVID=CMT ; use the CMT data item to distinguish observation types if DVID does not exist
$TABLE REP ID TIME DV PRED OBS MDV DVID ; add a DVID variable to the table file
9. VPCs may be created using categorical covariates e.g. SEX or RACE. Categories may be created during the simulation for continuous covariates e.g.
IF (WT.LT.70) THEN
$TABLE REP ID TIME DV PRED OBS MDV DVID SEX SIZE ; add SEX and SIZE variables to the table file
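The mechanical parts of step 4 (b to d) can be scripted. The following Python sketch is not part of WFN: `split_records` and `to_vpc_stream` are hypothetical helper names. It assumes every record starts on a line beginning with `$`, that `$PK`, `$ERROR` or `$PRED` stand on a line of their own, and that the inserted lines are `OBS=DV` and `REP=IREP` as implied by steps c and e. It does not touch `$DATA`, MDV, DVID or covariates, which still need the manual edits described above.

```python
DROP = ("$EST", "$COV", "$TAB")  # record-name prefixes removed for simulation


def split_records(text):
    """Split a control stream into records (lists of lines).

    Assumption: a new record starts on every line whose first
    non-blank character is '$'.
    """
    records, current = [], []
    for line in text.splitlines():
        if line.lstrip().startswith("$") and current:
            records.append(current)
            current = []
        current.append(line)
    if current:
        records.append(current)
    return records


def to_vpc_stream(text, seed=20120402, nsub=100):
    """Return a simulation control stream derived from an estimation one."""
    out = []
    for rec in split_records(text):
        name = rec[0].lstrip().upper()
        if name.startswith(DROP):
            continue  # step b: drop $ESTIMATION, $COVARIANCE and $TABLE
        rec = list(rec)
        if name.startswith(("$PK", "$PRED")):
            rec.insert(1, "OBS=DV")    # step c: keep the observed value
        if name.startswith(("$ERR", "$PRED")):
            rec.append("REP=IREP")     # step e: replicate number
        out.extend(rec)
    # step d: table and simulation records, placed at the end of the stream
    out.append("$TABLE REP ID TIME DV PRED OBS")
    out.append("NOAPPEND ONEHEADER NOPRINT FILE=vpc.fit")
    out.append(f"$SIM ({seed}) ONLYSIM NSUB={nsub}")
    return "\n".join(out) + "\n"
```

The result should still be reviewed by hand before running nmvpc.bat, because real control streams can contain layouts this sketch does not anticipate.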
The nmvpc.bat file must be edited before running it to create a VPC. The changes involve setting environment variables with the set command. Take care not to include unnecessary blank characters after the ‘=’ character. Blank characters should only be used when defining lists of models, observation types or covariates.
The essential changes are:
1. Specify if NONMEM should be run to create the simulation table file. This only has to be done once. Different VPC options may be explored with the same simulation table file. Set the runNONMEM variable to ‘y’ the first time nmvpc.bat is run, then set it to ‘n’ to skip the time-consuming simulation step once a simulation has completed successfully.
rem set runNONMEM=y to execute nmgo for each model
2. List the names of simulation model control streams (without the file extension). Usually only one simulation model is used but several models can be used to create VPCs if a list of models is provided. Model names must be separated by a blank character.
rem Set list of models to be simulated e.g. set models=mdl1 mdl2 mdl3
3. The xnames variable identifies the independent variable (usually TIME), but any other suitable variable in the simulation table file may be used e.g. time after dose (TAD).
rem Names of variable in NONMEM table file to be used for the VPC x-axis
rem This can be used for evaluating continuous covariates e.g.
rem set xnames=TIME TAD WEIGHT for total time, time after dose and weight
4. List the names of observation types to be used for the VPC. Observation names must be separated by a blank character. At least one observation name must be specified.
rem Set list of names according to observation type e.g. set obsnames=CP PCA
set obsnames=CP PCA
5. Set the options for each observation type that determine the way the VPC plot is created.
a. The bintimes variable is a list of times to be used as the centre of intervals for binning observations and predictions. The list of times should be separated by commas (no blanks). An R expression may be used to generate the list of times (see example below).
b. Either x or y axis or both axes of the VPC may be set to a logarithmic scale (base 10) by specifying ‘x’ or ‘y’ or ‘xy’ for the logaxis variable.
c. Each axis should have a label, a minimum, a maximum and a tick value. The tick value should divide the minimum to maximum range evenly. Because blanks may not be used in variable values the “#” character should be used in x and y axis labels to indicate where a blank should appear in the label.
d. A lower limit of quantitation value may be supplied. All predictions less than this lloq value will be ignored when creating the VPC.
e. Each observation type has its own section identified by a label created from “:” prefixed to the observation name specified in the obsnames list. This allows specific options to be set according to the observation type. The user must create these labels and set the options for each observation type.
f. Some options may be the same for all observation types e.g. options related to the independent variable such as bintimes and x-axis options. These options may be set in the COMMON section and do not have to be set in the OBSERVATION SPECIFIC section.
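As an illustration of binning, the following Python sketch assigns a time to a bin. It is not the vpc.R code. Item a describes bintimes as interval centres, whereas the `cut()` call quoted in the error message near the end of this page passes `c(0, binTimes)` as break points with `right = F, include.lowest = T`. The sketch follows that `cut()` call; `bin_index` is a hypothetical name.

```python
from bisect import bisect_right


def bin_index(x, bintimes):
    """Return the 0-based bin of x, or None if x lies outside the breaks.

    Assumption: breaks are 0 followed by the bintimes values, intervals
    are closed on the left [lo, hi), and the last interval also includes
    its upper edge (as with cut(..., right = F, include.lowest = T) in R).
    """
    breaks = [0.0] + list(bintimes)
    if x < breaks[0] or x > breaks[-1]:
        return None
    if x == breaks[-1]:
        return len(breaks) - 2  # top edge belongs to the last bin
    return bisect_right(breaks, x) - 1


# Example: four bins [0,1) [1,2) [2,4) [4,8]
bintimes = [1, 2, 4, 8]
bins = [bin_index(t, bintimes) for t in (0, 0.5, 1, 3, 8, 9)]
```

Values outside the breaks are not assigned to any bin, so the last bintimes value should be at least as large as the latest observation time that is meant to appear in the VPC.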
rem OBSERVATION TYPES SECTION
rem Each observation type may have its own properties
rem The dvid variable is required to distinguish types
rem Define R script variables for each observation type
rem No spaces are allowed in variable values.
rem Use '#' which will be replaced by a blank in xlabel and ylabel values
rem **** COMMON ******
rem Variables common to all observation types
rem Any of these variables may be observation (obsname) specific
rem obsname labels must correspond to names in the obsnames list
rem A obsname label must have a ":" before the obsname e.g. :CP for obsname=CP
rem **** OBSERVATION SPECIFIC ******
rem user defined obsname labels identify variables for each observation type
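The axis conventions can be checked before running nmvpc.bat. This Python sketch is only an illustration and is not how vpc.R processes the values: `axis_label` and `tick_positions` are hypothetical names, and the tick check assumes the intent is that the tick value divides the axis range evenly.

```python
def axis_label(value):
    """Replace the '#' placeholder with a blank, as described for labels."""
    return value.replace("#", " ")


def tick_positions(lo, hi, tick):
    """Return tick positions from lo to hi.

    Raises ValueError when the range is not a whole number of ticks.
    """
    if tick <= 0:
        raise ValueError("tick must be positive")
    n = round((hi - lo) / tick)
    if abs(lo + n * tick - hi) > 1e-9 * max(1.0, abs(hi)):
        raise ValueError(f"range {lo}..{hi} is not a multiple of tick {tick}")
    return [lo + i * tick for i in range(n + 1)]


label = axis_label("Time#after#dose#(h)")
ticks = tick_positions(0, 24, 4)
```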
6. VPCs may be created by selecting observations and predictions according to covariate values. Use of covariate selection is optional. If covariate selection is not required then the covariates variable should be set to a null value (no blanks after the ‘=’):
set covariates=
7. If covariate selection is chosen then the covariates variable should be set to a list of one or more covariates, using the names specified in the simulation control stream, which determine the names in the simulation table file. For each covariate name there must be a covariate value list showing each of the covariate values to be used for VPC selection.
rem COVARIATES SECTION
rem Covariate selection is optional. For VPC without covariates: set covariates=
rem Set list of covariates (up to 3) e.g. set covariates=SEX SIZE
rem Names in the covariates list must match exactly the names in the simulation table file
set covariates=SEX SIZE
rem Each covariate name must be matched with a list of numeric values for the covariate
rem which will be used to create VPCs for each value
rem select on covariate 1 e.g. sex values 0 1
set covlist1=0 1
rem select on covariate 2 e.g. size values 1 2
set covlist2=1 2
rem select on covariate 3 e.g. race values 1 2 3
8. The MODELS SECTION contains some options that are not often changed. They apply to all the models and VPCs created by nmvpc.
a. The PIpercentile value of 0.9 will create a 90% interval (i.e. 0.05 and 0.95 percentiles) for observations and predictions. The CIpercentile value of 0.95 will create a 95% confidence interval around each of the prediction percentiles.
b. The isstd option creates a standard VPC without modifying observations or prediction values. The ispc option creates pred-corrected VPCs. Pred-correction modifies both the observations and predictions (Bergstrand et al. 2011). The iscsv option writes comma separated value format files containing the numerical values used for the VPCs. These may be used by other programs to create VPC plots.
c. The timescale option may be useful for rescaling the independent variable e.g. a timescale variable set to 1/168 could be used to rescale the time variable from hours to weeks.
d. The MDVPNAME variable identifies the name of a variable in the simulation table file that specifies the MDV status of each predicted value when it differs from the original MDV status. Some simulations create simulated values at times when the original observation was missing (MDV=1), and sometimes simulated values should be ignored at times when the original observation was present (MDV=0). E.g. a variable MDVP in the simulation control stream could be set to 1 if a predicted value is less than the lower limit of quantitation and to 0 otherwise. The MDVP variable should then be listed in the $TABLE record and the MDVPNAME variable in nmvpc.bat set to MDVP.
e. The user may wish to modify the vpc.R script. The name of the modified R script (without the .R extension) should be specified using the vpcR variable.
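To show how PIpercentile and CIpercentile translate into numbers, the following Python sketch summarizes one bin. It shows one common way of computing VPC summaries and is not taken from vpc.R, which may differ in detail. The function names are hypothetical and the percentile rule is linear interpolation (the default of R's `quantile`).

```python
def percentile(values, p):
    """Percentile by linear interpolation between order statistics."""
    s = sorted(values)
    if not s:
        raise ValueError("no values")
    h = (len(s) - 1) * p
    lo = int(h)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (h - lo) * (s[hi] - s[lo])


def interval_bounds(coverage):
    """0.9 -> (0.05, 0.95); 0.95 -> (0.025, 0.975)."""
    tail = (1.0 - coverage) / 2.0
    return tail, 1.0 - tail


def vpc_bin_summary(obs, sims, pi=0.9, ci=0.95):
    """Summarize one bin.

    obs  : observed values in the bin
    sims : one list of simulated values per replicate (same bin)
    """
    lower, upper = interval_bounds(pi)
    ci_lower, ci_upper = interval_bounds(ci)
    summary = {}
    for name, p in (("lower", lower), ("median", 0.5), ("upper", upper)):
        per_rep = [percentile(rep, p) for rep in sims]  # one value per replicate
        summary[name] = {
            "observed": percentile(obs, p),
            "predicted": percentile(per_rep, 0.5),
            "ci": (percentile(per_rep, ci_lower), percentile(per_rep, ci_upper)),
        }
    return summary
```

Because the confidence interval is taken across replicates, its stability depends on the number of replicates, which is why a small NSUB is only suitable for exploring VPC shapes.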
rem MODELS SECTION
rem Some miscellaneous variables that are rarely changed
rem Percentile range for prediction and confidence intervals
rem if isstd=y then create standard VPCs
rem if ispc=y then create pred-corrected VPCs
rem if iscsv=y then write csv files with numerical values used for plots
rem if isbig=y then re-read simulation file each time to use less memory
rem use this to scale TIME variable (e.g. timescale=52 to scale years to weeks)
rem if hasmdv=y then use MDV data item to select valid observations otherwise all records are valid observations
rem Name for MDV item for predictions. If blank then item name will be the same as for observations (MDV).
rem name of R script for VPC (without extension)
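The pred-correction selected with ispc can be illustrated with the formula from Bergstrand et al. 2011: within a bin, each value is scaled by the ratio of the bin's median population prediction to the record's own population prediction. The Python sketch below is a minimal illustration under that assumption. `pred_correct` is a hypothetical name and the vpc.R implementation may differ, for example in how it handles a lower bound or log-transformed data.

```python
from statistics import median


def pred_correct(values, preds, lower_bound=0.0):
    """Pred-correct the values of one bin.

    values : observed or simulated values in the bin
    preds  : population predictions (PRED) for the same records
    """
    if len(values) != len(preds):
        raise ValueError("values and preds must have the same length")
    pred_median = median(preds)
    return [
        lower_bound + (y - lower_bound) * (pred_median - lower_bound) / (p - lower_bound)
        for y, p in zip(values, preds)
    ]


corrected = pred_correct([10, 20, 30], [5, 10, 20])
```

The same correction is applied to the observations and to each simulated replicate, using the PRED values of the corresponding records.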
After customizing the NM-TRAN simulation control stream and the nmvpc.bat command file, the nmvpc command is used to create VPCs. nmvpc.bat uses the nmvpc2r awk script to create a temporary R script based on vpc.R. The variables specified in nmvpc.bat are written into the temporary R script, then the R batch command processor is called to execute the temporary R script.
VPCs (and csv files if requested) are created in a vpc sub-directory identified according to the observation name (and covariate name if covariate selection is used).
The VPC process is complex and it is easy to make errors. Errors in the NM-TRAN simulation will be displayed in the usual way when running nmgo.
A log of the R commands is created in a file called tmp_vpc.Rout. If there is an error in processing the R commands then this error will appear at the end of the log file.
The most common error is caused by incorrectly specifying a selection value, which causes the R script to create an inappropriate obsFile and/or simFile. The error shown below occurs if dvid is not correctly specified (e.g. a value of 3 is specified in nmvpc.bat but no records have a value of 3, or all such records are flagged as missing values) or if a covariate name does not match the name in the simulation table file (e.g. SOX is specified in covnames instead of SEX):
Error in tapply(1L:0L, list(`cut(dat[, idvCol], br = c(0, binTimes), right = F, include.lowest = T)` = integer(0)), :
arguments must have same length
Calls: getObsPI ... do.call -> by -> by.data.frame -> eval -> eval -> tapply
When this error message occurs, check the dvid values and covnames to see if they are appropriate. You may also need to check that there are non-missing values for both predictions and observations for the dvid and covnames that have been selected.
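One way to carry out this check is to count the usable records in the simulation table file for each selection value. The Python sketch below is a hypothetical helper, not part of WFN. It assumes the table text consists of `TABLE NO.` lines, a line of column names (possibly repeated for each replication) and whitespace-separated numeric rows, and that MDV=0 marks a usable record.

```python
from collections import Counter


def read_table(text):
    """Parse NONMEM table text into (header, rows as dicts)."""
    header, rows = None, []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] == "TABLE":
            continue                      # blank line or 'TABLE NO. n' line
        try:
            values = [float(t) for t in tokens]
        except ValueError:
            header = tokens               # line of column names
            continue
        rows.append(values)
    if header is None:
        raise ValueError("no header line found")
    return header, [dict(zip(header, r)) for r in rows]


def count_usable(rows, column, mdv="MDV"):
    """Count records with MDV=0 for each value found in `column`."""
    if rows and column not in rows[0]:
        raise KeyError(f"{column!r} is not in the table file: {sorted(rows[0])}")
    counts = Counter()
    for row in rows:
        if row.get(mdv, 0.0) == 0.0:
            counts[row[column]] += 1
    return counts
```

For example, `count_usable(rows, "DVID")[3.0]` returning 0 means a dvid value of 3 selects nothing, and a `KeyError` for `"SOX"` shows that the covariate name does not match a column in the table file.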