Visual Predictive Check

 


 

Last Updated: 2 October 2014

 

Visual Predictive Checks

 

WFN provides a mechanism for creating a visual predictive check (VPC). The nmvpc.bat command is a Windows command file that has to be edited to indicate the NM-TRAN control stream which is used to simulate values for use in the VPC. Several other options are also set in the nmvpc.bat file to specify graph types and select observation types and covariates.

 

Installation

The VPC run directory contains the NM-TRAN simulation control stream for simulating the model.

 

The nmvpc.bat, nmvpc2r.awk, vpc.R and nmvpc_functions.R files must be located in the run directory. These files should be copied from the %WFNHOME%\bin\vpc directory.

 

WFN must be installed with NONMEM and a Fortran compiler.

R must be installed and the directory containing the R.exe executable file must be in the file search path. This can be verified by typing rgui at the command prompt in a WFN window. The R path may be set in wfn.bat e.g.

 

set RPATH=C:\Apps\R-2.14.2\bin\i386
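The same check can be scripted. The short Python sketch below is not part of WFN, and find_r is a hypothetical helper name; it only reports whether an executable called R can be found on the search path.

```python
import shutil


def find_r(search_path=None):
    """Return the full path of the first R executable found, or None.

    search_path is a PATH-style string; None means use the current PATH.
    """
    return shutil.which("R", path=search_path)


location = find_r()
if location:
    print("R found at " + location)
else:
    print("R is not on the search path; check RPATH in wfn.bat")
```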

 

 

The NM-TRAN simulation control stream and the nmvpc.bat file will need to be customized before running nmvpc.bat.

 

Customizing the NM-TRAN Simulation Control Stream

A VPC is constructed from the observed values used for estimation and simulated values predicted from the final estimates and model used for estimation.

1.     Use nmctl to put the final estimates from a model into the control stream used to run that model e.g.

nmctl mymodel.ctl

2.     Copy the control stream containing the final estimates to another directory e.g. a VPC directory inside the estimation model run directory. This allows the same model name to be used for estimation and VPC. Alternatively you may copy the control stream within the estimation model run directory and use a different name for the VPC control stream.

mkdir VPC

copy mymodel.ctl VPC\mymodel.ctl

3.     Change to the VPC directory and edit the control stream

cd VPC

edit mymodel.ctl

4.     The following changes should be made to convert the estimation control stream into one suitable for VPC simulation:

a.     Copy the data file to the VPC directory or edit the $DATA record to locate the data file used for estimation e.g. the following change to the datafile path could be used when the data file and VPC directory are in the estimation model run directory. Change $DATA from:

$DATA ka1_to_emax1_data.csv

to

$DATA ..\..\ka1_to_emax1_data.csv

Note the additional ..\ required because WFN creates a run directory below the VPC directory.

b.     Remove all $ESTIMATION, $COVARIANCE and $TABLE records

c.     Insert the following line immediately after $PK (or $PRED). This creates a copy of all DV data items so that they can be saved in the table file containing the simulated values. Because NONMEM uses the variable name DV for the simulated values the observed values must be copied with a different name i.e. OBS.

OBS=DV

d.     Add the following lines to the end of $ERROR (or $PRED)

REP=IREP

$TABLE REP ID TIME DV PRED OBS

NOAPPEND ONEHEADER NOPRINT FILE=vpc.fit

$SIM (20120402) ONLYSIM NSUB=100

 

e.     The NONMEM internal variable IREP is used to number each simulation replication. It is saved in the table file with the name REP.

f.     The $TABLE record defines the NONMEM simulation output variables required to perform a VPC

g.     The $SIM record has a random number generator seed which may be changed as required. The NSUB option indicates the number of sub-problems (i.e. replications) to be performed to create the VPC. NSUB=5 is useful for exploring VPC shapes. NSUB=100 is suggested as a minimum for reliable confidence intervals.

 

5.     If there is an MDV data item in the data file then this should be added to the list of variables in the $TABLE record

$TABLE REP ID TIME DV PRED OBS MDV ; add MDV from data file

 

6.     A DVID variable is required in the list of variables in the table file. If there is only one type of observation the DVID value should be set to 1 in the NM-TRAN code e.g. add this to the end of your $ERROR (or $PRED) NM-TRAN code.

 

DVID=1

 

7.     If there is more than one type of observation e.g. a PKPD model with both concentration and effect observations then a DVID variable is used to distinguish the type of observation. A numerical value for each observation type should be defined in the data file and the data item name DVID used in $INPUT.

 

8.     If the data file does not have a DVID data item the user should create a DVID variable based on another data item e.g. the CMT data item might be used to distinguish parent and metabolite observations:

 

DVID=CMT ; use the CMT data item to distinguish observation types if DVID does not exist

$TABLE REP ID TIME DV PRED OBS MDV DVID ; add a DVID variable to the table file

 

9.     VPCs may be created using categorical covariates e.g. SEX or RACE. Categories may be created during the simulation for continuous covariates e.g.

IF (WT.LT.70) THEN

SIZE=1

ELSE

SIZE=2

ENDIF

$TABLE REP ID TIME DV PRED OBS MDV DVID SEX SIZE ; add SEX and SIZE variables to the table file
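To illustrate how a table file with these columns can be consumed, the following Python sketch separates observed values (OBS) from simulated values (DV). It is not part of WFN, which uses vpc.R for this step. It assumes that each replication in vpc.fit starts with a "TABLE NO." line followed by a header line. The sample values are invented for the example; real table files use exponent notation, which float() also reads.

```python
import io

# Invented miniature of vpc.fit with two replications of one subject.
SAMPLE = """TABLE NO.  1
 REP ID TIME DV   PRED OBS  MDV DVID
 1   1  0    0    0    0    1   1
 1   1  1    4.8  5.0  5.2  0   1
 1   1  1    61   60   58   0   2
TABLE NO.  1
 REP ID TIME DV   PRED OBS  MDV DVID
 2   1  0    0    0    0    1   1
 2   1  1    5.5  5.0  5.2  0   1
 2   1  1    57   60   58   0   2
"""


def read_table(text):
    """Parse table text into a list of dicts keyed by column name."""
    rows = []
    names = None
    for line in io.StringIO(text):
        fields = line.split()
        if not fields or fields[0] == "TABLE":
            continue  # skip blank lines and "TABLE NO." lines
        try:
            values = [float(f) for f in fields]
        except ValueError:
            names = fields  # a non-numeric line is taken to be the header
            continue
        rows.append(dict(zip(names, values)))
    return rows


rows = read_table(SAMPLE)

# Keep valid records (MDV=0) for one observation type (DVID=1).
valid = [r for r in rows if r["MDV"] == 0 and r["DVID"] == 1]

# OBS is repeated in every replication, so take it from replication 1 only.
observed = [(r["TIME"], r["OBS"]) for r in valid if r["REP"] == 1]

# DV holds the simulated value and differs between replications.
simulated = [(r["REP"], r["TIME"], r["DV"]) for r in valid]

print(observed)
print(simulated)
```

The point of the OBS=DV copy is visible here: OBS is identical in every replication while DV changes, so both can be recovered from the one table file.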

 

Customizing the nmvpc.bat Command File

The nmvpc.bat file must be edited before running it to create a VPC. The changes involve setting environment variables with the set command. Take care not to include unnecessary blank characters after the = character. Blank characters should only be used when defining lists of models, observation types or covariates.

 

The essential changes are:

1.     Specify if NONMEM should be run to create the simulation table file. This only has to be done once. Different VPC options may be explored with the same simulation table file. The runNONMEM variable should be set to y the first time nmvpc.bat is run then it should be set to n to skip the time consuming simulation step once a successful simulation has occurred.

 

:nonmem

rem set runNONMEM=y to execute nmgo for each model

set runNONMEM=y

 

2.     List the names of simulation model control streams (without the file extension). Usually only one simulation model is used but several models can be used to create VPCs if a list of models is provided. Model names must be separated by a blank character.

 

rem Set list of models to be simulated e.g. set models=mdl1 mdl2 mdl3

set models=ka1_to_emax1_simln

 

3.     The xnames variable identifies the independent variable (usually TIME) but any other suitable variable in the simulation table file may be used e.g. time after dose (TAD).

 

rem Names of variable in NONMEM table file to be used for the VPC x-axis

rem This can be used for evaluating continuous covariates e.g.

rem set xnames=TIME TAD WEIGHT for total time, time after dose and weight

set xnames=TIME

 

 

4.     List the names of observation types to be used for the VPC. Observation names must be separated by a blank character. At least one observation name must be specified.

 

rem Set list of names according to observation type e.g. set obsnames=CP PCA

set obsnames=CP PCA

 

5.      Set the options for each observation type that determine the way the VPC plot is created.

a.     The bintimes variable is a list of times to be used as the centre of intervals for binning observations and predictions. The list of times should be separated by commas (no blanks). An R expression may be used to generate the list of times (see example below).

b.     Either x or y axis or both axes of the VPC may be set to a logarithmic scale (base 10) by specifying x or y or xy for the logaxis variable.

c.     Each axis should have a label, a minimum, a maximum and a tick value. The minimum to maximum range should be a multiple of the tick value. Because blanks may not be used in variable values the # character should be used in x and y axis labels to indicate where a blank should appear in the label.

d.     A lower limit of quantitation value may be supplied. All predictions less than this lloq value will be ignored when creating the VPC.

e.     Each observation type has its own section identified by a label created from : prefixed to the observation name specified in the obsnames list. This allows specific options to be set according to the observation type. The user must create these labels and set the options for each observation type.

f.     Some options may be the same for all observation types e.g. options related to the independent variable such as bintimes and x-axis options. These options may be set in the COMMON section and do not have to be set in the OBSERVATION SPECIFIC section.

 

 

rem *************************

rem OBSERVATION TYPES SECTION

rem *************************

:obstype

rem Each observation type may have its own properties

rem The dvid variable is required to distinguish types

 

rem Define R script variables for each observation type

rem No spaces are allowed in variable values.

rem Use '#' which will be replaced by a blank in xlabel and ylabel values

 

rem **** COMMON ******

rem Variables common to all observation types

rem Any of these variables may be observation (obsname) specific

set bintimes=c(seq(0,10,1),seq(12,144,12))

 

set logaxis=

set xlabel=Hour

set xmin=0

set xmax=144

set xtick=12

set lloq=0

 

rem obsname labels must correspond to names in the obsnames list

rem An obsname label must have a ":" before the obsname e.g. :CP for obsname=CP

 

goto %obsname%

 

rem **** OBSERVATION SPECIFIC ******

rem user defined obsname labels identify variables for each observation type

 

:CP

set dvid=1

set ylabel=%obsname%#mg/L

set ymin=0

set ymax=20

set ytick=5

 

goto select

 

:PCA

set dvid=2

set ylabel=%obsname%#%

set ymin=0

set ymax=120

set ytick=20

 

goto select
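For readers unfamiliar with R, the bintimes expression and the axis settings used above can be checked with a small Python sketch. This is an illustration only and not part of WFN; r_seq, range_is_multiple_of_tick and display_label are hypothetical helper names.

```python
def r_seq(start, stop, step):
    """Integer equivalent of R's seq(start, stop, step); stop is included."""
    return list(range(start, stop + 1, step))


# Equivalent of: set bintimes=c(seq(0,10,1),seq(12,144,12))
bintimes = r_seq(0, 10, 1) + r_seq(12, 144, 12)


def range_is_multiple_of_tick(axis_min, axis_max, tick):
    """True if the axis range divides evenly into tick intervals."""
    return tick > 0 and (axis_max - axis_min) % tick == 0


def display_label(label):
    """Replace each '#' placeholder with a blank, as described for axis labels."""
    return label.replace("#", " ")


print(len(bintimes), bintimes[0], bintimes[-1])
print(range_is_multiple_of_tick(0, 144, 12))  # x axis in the example
print(range_is_multiple_of_tick(0, 20, 5))    # y axis for CP
print(display_label("CP#mg/L"))
```

The example expression gives hourly bins up to 10 hours and 12-hourly bins from 12 to 144 hours, 23 bin times in total.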

 

6.     VPCs may be created by selecting observations and predictions according to covariate values. Use of covariate selection is optional. If covariate selection is not required then the covariates variable should be set to a null value (no blanks after the =):

set covariates=

 

7.     If covariate selection is chosen then the covariates variable should be set to a list of one or more covariates. Use the names specified in the simulation control stream, which determine the names in the simulation table file. For each covariate name there must be a covariate value list showing each of the covariate values to be used for VPC selection.

 

rem **********************

rem COVARIATES SECTION

rem **********************

:covariate

rem Covariate selection is optional. For VPC without covariates: set covariates=

rem Set list of covariates (up to 3) e.g. set covariates=SEX SIZE

rem Names in the covariates list must match exactly the names in the simulation table file

set covariates=SEX SIZE

rem Each covariate name must be matched with a list of numeric values for the covariate

rem which will be used to create VPCs for each value

rem select on covariate 1 e.g. sex values 0 1

set covlist1=0 1

rem select on covariate 2 e.g. size values 1 2

set covlist2=1 2

rem select on covariate 3 e.g. race values 1 2 3

set covlist3=

 

8.     The MODELS SECTION contains some options that are not often changed. They apply to all the models and VPCs created by nmvpc.

a.     The PIpercentile value of 0.9 will create a 90% interval (i.e. 0.05 and 0.95 percentiles) for observations and predictions. The CIpercentile value of 0.95 will create a 95% confidence interval around each of the prediction percentiles.

b.     The isstd option creates a standard VPC without modifying observations or prediction values. The ispc option creates pred-corrected VPCs. Pred-correction modifies both the observations and predictions (Bergstrand et al. 2011). The iscsv option writes comma separated value format files containing the numerical values used for the VPCs. These may be used by other programs to create VPC plots.

c.     The timescale option may be useful for rescaling the independent variable e.g. a timescale variable set to 1/168 could be used to rescale the time variable from hours to weeks.

d.     The mdvpname variable is used to identify the name of a variable in the simulation table file that specifies the MDV status for each predicted value when it is different from the original MDV status. Some simulations may create simulated values at times when the original observation was missing (MDV=1), or simulated values should sometimes be ignored at times when the original observation was present (MDV=0). E.g. if a predicted value is less than the lower limit of quantitation then a variable MDVP in the simulation control stream could be set to 1, otherwise it is set to 0. The MDVP variable should be listed in the $TABLE record and the mdvpname variable in nmvpc.bat set to MDVP.

e.     The user may wish to modify the vpc.R script. The name of the modified R script (without the .R extension) should be specified using the vpcR variable.

 

rem **********************

rem MODELS SECTION

rem **********************

:models

 

rem Some miscellaneous variables that are rarely changed

 

rem Percentile range for prediction and confidence intervals

set PIpercentile=0.9

set CIpercentile=0.95

rem if isstd=y then create standard VPCs

set isstd=y

rem if ispc=y then create pred-corrected VPCs

set ispc=y

rem if iscsv=y then write csv files with numerical values used for plots

set iscsv=n

rem if isbig=y then re-read simulation file each time to use less memory

set isbig=n

rem use this to scale TIME variable (e.g. timescale=52 to scale years to weeks)

set timescale=1

rem if hasmdv=y then use MDV data item to select valid observations otherwise all records are valid observations

set hasmdv=y

rem Name for MDV item for predictions. If blank then item name will be the same as for observations (MDV).

set mdvpname=

rem name of R script for VPC (without extension)

set vpcR=vpc

goto gotcov
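The arithmetic behind the PIpercentile and CIpercentile settings, and the idea of pred-correction, can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical and vpc.R may compute these quantities differently. The pred-correction shown is the simple form described by Bergstrand et al. (2011) for a variable with a lower bound of zero, applied within a single bin.

```python
import statistics


def interval_bounds(level):
    """Return the lower and upper percentiles of a central interval.

    For example a level of 0.9 gives the 0.05 and 0.95 percentiles.
    """
    tail = (1.0 - level) / 2.0
    return tail, 1.0 - tail


def pred_correct(values, preds):
    """Pred-correct the values belonging to one bin.

    Each value is scaled by median(PRED in the bin) / PRED for that record.
    Assumes a lower bound of zero and non-zero PRED values.
    """
    median_pred = statistics.median(preds)
    return [y * median_pred / p for y, p in zip(values, preds)]


pi_low, pi_high = interval_bounds(0.9)    # PIpercentile=0.9
ci_low, ci_high = interval_bounds(0.95)   # CIpercentile=0.95
print(round(pi_low, 3), round(pi_high, 3))
print(round(ci_low, 3), round(ci_high, 3))

# Three records in one bin with different population predictions.
print(pred_correct([10, 20, 30], [5, 10, 20]))
```

Pred-correction is applied in the same way to observations and to simulated values, which is why both are modified when ispc=y.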

 

Running nmvpc

After customizing the NM-TRAN simulation control stream and the nmvpc.bat command file, the nmvpc command is used to create VPCs. nmvpc.bat uses the nmvpc2r.awk script to create a temporary R script based on vpc.R. The variables specified in nmvpc.bat are written into the temporary R script, then the R batch command processor is called to execute it.

 

VPCs (and csv files if requested) are created in a vpc sub-directory identified according to the observation name (and covariate name if covariate selection is used).

 

Debugging

The VPC process is complex and it is easy to make errors. Errors in the NM-TRAN simulation will be displayed in the usual way when running nmgo.

 

A log of the R commands is created in a file called tmp_vpc.Rout. If there is an error in processing the R commands then this error will appear at the end of the log file.

 

The most common error is caused by incorrectly specifying a selection value, which causes the R script to create an inappropriate obsFile and/or simFile. This happens if dvid is not correctly specified (e.g. a value of 3 is specified in nmvpc.bat but there are no records with a value of 3, or all such records are flagged as missing values), or if a covariate name does not match the name in the simulation table file (e.g. SOX is specified in the covariates list instead of SEX). The following error will then occur:

 

Error in tapply(1L:0L, list(`cut(dat[, idvCol], br = c(0, binTimes), right = F, include.lowest = T)` = integer(0)), :

arguments must have same length

Calls: getObsPI ... do.call -> by -> by.data.frame -> eval -> eval -> tapply

Execution halted


When this error message occurs, check the dvid value and the covariates list to see if they are appropriate. You may also need to check that there are non-missing values for both predictions and observations for the dvid and covariates that have been selected.
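A quick pre-check of the simulation table file can catch both causes before R is run. The Python sketch below is not part of WFN; check_selection is a hypothetical helper, and the rows are assumed to have already been read from the table file into dictionaries keyed by column name. The sample rows are invented.

```python
def check_selection(rows, dvid, covariates=None, has_mdv=True):
    """Return a list of problems; an empty list means the selection looks usable.

    covariates maps each covariate name to the values selected for it,
    e.g. {"SEX": [0, 1]}.
    """
    covariates = covariates or {}
    if not rows:
        return ["table has no data rows"]
    problems = []
    columns = set(rows[0])
    for name in covariates:
        if name not in columns:
            problems.append("covariate " + name + " is not a column in the table file")
    if "DVID" not in columns:
        problems.append("DVID is not a column in the table file")
        return problems
    selected = [r for r in rows if r["DVID"] == dvid]
    if has_mdv and "MDV" in columns:
        selected = [r for r in selected if r["MDV"] == 0]
    if not selected:
        problems.append("no non-missing records with DVID=" + str(dvid))
    for name, values in covariates.items():
        if name not in columns:
            continue  # already reported above
        for value in values:
            if not any(r[name] == value for r in selected):
                problems.append(
                    "no non-missing DVID=" + str(dvid)
                    + " records with " + name + "=" + str(value)
                )
    return problems


sample_rows = [
    {"DVID": 1, "MDV": 0, "SEX": 0},
    {"DVID": 1, "MDV": 0, "SEX": 1},
    {"DVID": 2, "MDV": 1, "SEX": 0},
]

print(check_selection(sample_rows, 1, {"SEX": [0, 1]}))
print(check_selection(sample_rows, 3, {"SOX": [0, 1]}))
```

The second call reproduces the two mistakes described above: a misspelt covariate name and a dvid value with no records.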

 

Examples

 

 
