Automatic Search for Covariate Model
Home | Installation | Control Streams | Bootstrap | Randomization Test | Visual Predictive Check | Autocovariate | Files | References
Last Updated: 3 April 2012
A common task for NONMEM users is to discover which covariates are influential in predicting the main model parameters such as clearance or EC50. Automated approaches to this problem have been suggested (Mandema, Verotta & Sheiner 1992) in the literature. Jonsson & Karlsson (1997) have implemented this method in xpose2. Both of these methods rely on the use of post hoc parameter estimates from a single baseline run without covariates. Niclas Jonsson has implemented an automated way of searching for covariate models which involves a full NONMEM run at each step (gam42). The autocovariate model search method implemented in WFN is based heavily on the ideas used in gam42 and I appreciate the discussions with Niclas and Mats on this topic.
CAUTION: There has been well founded criticism of automated stepwise selection procedures used for model building (e.g. Wright 2001 ) and it should not be assumed that covariate models developed using nmac or other automated procedures will find models with good predictive abilities. Covariate models should only be seriously considered if you have a reasonable a priori mechanistic/biological reason for accepting them in the overall model e.g. weight would always seem to be a reasonable covariate for explaining between subject differences in clearance but it would require some very special rationale to ever consider weight as an explanatory covariate for EC50.
The criteria for forward inclusion and backward
elimination from the model are defined by the environment variables nmacfwd and
nmacback. The default values are nmacfwd=3.841 (alpha=0.05, one df i.e. one parameter
added) and nmacback=10.828 (alpha=0.001, one df i.e. one parameter removed).
Autocovariate Records
Autocovariate model records are identified by ";!" in the first two characters of the record. All covariate models must be specified in a conditional block and the logical expression must include the covariate data item name. More than one main model parameter may be included in each covariate conditional block.
The covariate effect variables must appear in a single autocovariate model record after the covariate model conditional blocks e.g.
;! fe0=fage
1.Age as a Covariate on E0
The following code (theoe0.ctl) illustrates an autocovariate control stream using the theopd.dat data set and theopd.ctl as the base model. Those THETAs which are parameters of covariate models must have parameter names which include the covariate data item name e.g. the 4th THETA below has the name eoage which is therefore linked with the age covariate model. An empirical exponential model [fage=exp(e0age*(age-40))] is used to define the fractional effect of AGE on E0 centered on a standard individual with age=40.
$PROB Theophylline pharmacodynamics
$DATA ..\..\..\theopd.dat IGNORE # ; path points to theopd.dat in wfn\run
directory
$INPUT ID TIME THEO AGE WT SEX RACE DIAG PEFR=DV
$ESTIM MAX=9990
$THETA (0,146.,) ; pope0
$THETA (0,165.,) ; popemax
$THETA (.001,6.57,) ; popec50
$THETA 0. FIX ; e0age
$OMEGA 0.00178 ; etae0
$OMEGA 0.239 ; etaemax
$OMEGA 1.46 ; etaec50
$SIGMA 6590. ; errsd
$PRED
fe0=1 ; must include a default value for each covariate effect variable
;! if (age.gt.0) then
;! fage=exp(e0age*(age-40))
;! else
;! fage=1
;! endif
;! fe0=fage
TVE0=fe0*popE0
E0=TVE0*EXP(etaE0)
EMAX=popEMAX*EXP(etaEMAX)
EC50=popEC50*EXP(etaEC50)
Y = E0 + EMAX*THEO/(THEO+EC50) + errSD
2.Age and Sex as Covariates on E0
The effects of age and sex are shown in the theoe02.ctl control stream. This example shows how covariates can be combined using a multiplicative effect of AGE and SEX on E0. It also shows how a specific initial value for the covariate parameter can be defined by enclosing it in square brackets. The initial value of eoage will be -0.01. If no explicit value is given then the default value for a parameter with a FIXed value of 0 will be 0.01.
$PROB Theophylline pharmacodynamics autocovariate
$DATA ..\..\..\theopd.dat IGNORE #
$INPUT ID TIME THEO AGE WT SEX RACE DIAG PEFR=DV
$ESTIM MAX=9990
$THETA (0,146.,) ; pope0
$THETA (0,165.,) ; popemax
$THETA (.001,6.57,) ; popec50
$THETA 0. FIX ; e0age [-.01]
$THETA 1. FIX ; e0sex
$OMEGA 0.00178 ; etae0
$OMEGA 0.239 ; etaemax
$OMEGA 1.46 ; etaec50
$SIGMA 6590. ; errsd
$PRED
FE0=1
;! if (age.gt.0) then
;! fage=exp(e0age*(age-40))
;! else
;! fage=1
;! endif
;! if (sex.eq.0) then ;female
;! fsex=e0sex
;! else
;! fsex=1
;! endif
;! fe0=fage*fsex
TVE0=fe0*popE0
E0=TVE0*EXP(etaE0)
EMAX=popEMAX*EXP(etaEMAX)
EC50=popEC50*EXP(etaEC50)
Y = E0 + EMAX*THEO/(THEO+EC50) + errSD
3.Five Covariates on E0, Emax and EC50
This is the ( theoe0m5.ctl) autocovariate model extended control stream which illustrates how automatic covariate models can be built for each of the main model parameters, e0, emax and EC50.
$PROB theophylline
pharmacodynamics autocovariate control stream
$DATA theopd.dat
IGNORE #
$INPUT ID TIME THEO
AGE WT SEX RACE DIAG PEFR=DV
$ESTIM MAX=9990
$THETA (0,146.,) ;
pope0
$THETA (0,165.,) ;
popemax
$THETA (.001,6.57,) ;
popec50
$THETA 0. FIX ; e0age
$THETA 0. FIX ; e0wt
$THETA 1. FIX ; e0sex
$THETA 1. FIX ; e0race
$THETA 1. FIX ; e0diag
$THETA 0. FIX ; emage
$THETA 0. FIX ; emwt
$THETA 1. FIX ; emsex
$THETA 1. FIX ; emrace
$THETA 1. FIX ; emdiag
$THETA 0. FIX ; c5age
$THETA 0. FIX ; c5wt
$THETA 1. FIX ; c5sex
$THETA 1. FIX ; c5race
$THETA 1. FIX ; c5diag
$OMEGA 0.00178 ;
etae0
$OMEGA 0.239 ; etaemax
$OMEGA 1.46 ; etaec50
$SIGMA 6590. ; errsd
$PRED
FE0=1
FEMAX=1
FEC50=1
;! if (age.gt.0) then
;!
fag0=exp(e0age*(age-40))
;!
fagm=exp(emage*(age-40))
;!
fag5=exp(c5age*(age-40))
;! else
;! fag0=1
;! fagm=1
;! fag5=1
;! endif
;! if (wt.gt.0) then
;!
fwt0=exp(e0wt*(wt-70))
;!
fwtm=exp(emwt*(wt-70))
;!
fwt5=exp(c5wt*(wt-70))
;! else
;! fwt0=1
;! fwtm=1
;! fwt5=1
;! endif
;! if (sex.eq.0) then
;female
;! fsx0=e0sex
;! fsxm=emsex
;! fsx5=c5sex
;! else
;! fsx0=1
;! fsxm=1
;! fsx5=1
;! endif
;! if (race.eq.2) then
;Polynesian
;! fra0=e0race
;! fram=emrace
;! fra5=c5race
;! else
;! fra0=1
;! fram=1
;! fra5=1
;! endif
;! if (diag.gt.1) then
;cord or cord+asthma
;! fdg0=e0diag
;! fdgm=emdiag
;! fdg5=c5diag
;! else
;! fdg0=1
;! fdgm=1
;! fdg5=1
;! endif
;!
fe0=fag0*fwt0*fsx0*fra0*fdg0
;!
femax=fagm*fwtm*fsxm*fram*fdgm
;!
fec50=fag5*fwt5*fsx5*fra5*fdg5
TVE0=FE0*popE0
TVEMAX=FEMAX*popEMAX
TVEC50=FEC50*popEC50
E0=TVE0*EXP(etaE0)
EMAX=TVEMAX*EXP(etaEMAX)
EC50=TVEC50*EXP(etaEC50)
Y = E0 +
EMAX*THEO/(THEO+EC50) + errSD
;$TABLE ID
;FAG0 FWT0 FSX0 FRA0
FDG0
;FAGM FWTM FSXM FRAM
FDGM
;FAG5 FWT5 FSX5 FRA5
FDG5
;NOAPPEND NOPRINT
ONEHEADER FILE=theopd.fit
In this example there are 3 new parameters (one each for effects on e0, emax and EC50) which are added/removed when the model is being built in the forward and elimination phases. The critical values for nmacfwd and nmacback (alpha=0.05 forward, alpha=0.001 backward) are set before running nmac. The following command tests all covariates using this autocovariate model.
set nmacfwd=7.815
set nmacback=16.266
nmac theoe0m5 AGE WT
SEX RACE DIAG
The result of
each run are summarised shown here (taken from nmobj.txt) (Compaq Visual Fortran Version 6.6A):
Run |
Obj |
Eval |
Sig |
5_WT |
5680.907 |
3740 |
4.1 |
4_RACE |
5691.505 |
1501 |
3.2 |
5_0WT |
5691.505 |
1501 |
3.2 |
4_WT |
5695.251 |
1801 |
3.2 |
5_0RACE |
5695.251 |
1801 |
3.2 |
5_0DIAG |
5702.767 |
1393 |
3.4 |
3_DIAG |
5703.414 |
1075 |
3.2 |
5_0AGE |
5708.209 |
1114 |
3 |
5_0SEX |
5709.686 |
5160 |
3.1 |
3_WT |
5712.141 |
1123 |
3 |
3_RACE |
5722.157 |
1059 |
3.7 |
2_SEX |
5725.068 |
1129 |
3.4 |
2_DIAG |
5730.271 |
1030 |
3.2 |
2_WT |
5732.044 |
619 |
3 |
2_RACE |
5734.482 |
798 |
3.3 |
1_AGE |
5742.557 |
561 |
3 |
1_DIAG |
5756.967 |
475 |
4.2 |
1_SEX |
5791.678 |
299 |
3.1 |
1_RACE |
5794.051 |
323 |
3.1 |
1_WT |
5797.118 |
266 |
3.2 |
theoe0m5 |
5802.086 |
113 |
4 |
The runnames
start with a number indicating the number of covariates included in the model
followed by the name of the covariate added on that run. The decision to
include a covariate in the model is based on a statistical criterion. In this
case, a change in objective function of 7.815 units or greater. With one
covariate included the best run was 1_AGE. On the next set of runs AGE is
included in all runs. At the end of the 2 covariate runs the best run was
2_SEX; the third set of runs includes 3_DIAG; the fourth set of runs led to the
inclusion of 4_RACE and finally the fifth set of runs adds 5_WT. This concludes
the forward inclusion phase i.e. when adding a covariate does not decrease the
objective function by more than 7.815 units. In this case it also happens to be
the last covariate run that could be tried. The fifth run is known as the full
model because it includes all the covariates that each improved the objective
function by at least 7.815.
The next phase is
called backward elimination. Each of the covariates in the full model is
removed one at a time with replacement. The statistical criterion for backward
elimination is a change in objective function of 16.266. In the current example
these are runs 5_0RACE, 5_0DIAG, 5_0AGE and 5_0SEX. Removal of WT is equivalent
to 4_RACE. The "0" before the covariate name indicates this
covariate was removed in that run. The final model includes SEX, AGE, DIAG. WT
and RACE did not meet the backward exclusion criterion for the final model.
This is summarized in the nmac.txt
file at the end of the autocovariate model series of runs. The upper section
shows the change in objective function (Delta) as each additional covariate
which meets the statistical criterion is included. The covariates retained in
the final model are shown in the lower section. The change in objective
function when each of these final model covariates is excluded from the final
model is shown in the Delta column.
NMACFWD |
7.815 |
NMACBACK |
16.266 |
Forward |
Delta |
Included |
|
1_AGE |
59.529 |
AGE |
|
2_SEX |
17.489 |
AGE
SEX |
|
3_DIAG |
21.654 |
AGE SEX DIAG |
|
4_RACE |
11.909 |
AGE SEX DIAG RACE |
|
5_WT |
10.598 |
AGE SEX DIAG RACE WT |
|
Backward |
Delta |
Included |
|
5_0DIAG |
21.86 |
DIAG |
|
5_0AGE |
27.302 |
AGE |
|
5_0SEX |
28.779 |
SEX |
Note that WT is
included in this model as a covariate simultaneously influencing E0, Emax and
EC50. As stated above the inclusion of WT as a covariate
to predict EC50 is not reasonable and I would not advocate concluding that the
full model developed using nmac should be accepted as it stands.
IMPORTANT: It is essential to run only one nmac model in a single directory. nmac will easily get confused if you keep the results of runs using different models in the same directory. You should not have any WFN results directories in the current directory when you start nmac.
Home | Installation | Control Streams | Bootstrap | Randomization Test | Visual Predictive Check | Autocovariate | Files | References