Automatic Search for Covariate Model

 

Home | Installation | Control Streams | Bootstrap | Randomization Test | Visual Predictive Check | Autocovariate | Files | References

 

Last Updated: 3 April 2012

A common task for NONMEM users is to discover which covariates are influential in predicting the main model parameters such as clearance or EC50. Automated approaches to this problem have been suggested (Mandema, Verotta & Sheiner 1992) in the literature. Jonsson & Karlsson (1997) have implemented this method in xpose2. Both of these methods rely on the use of post hoc parameter estimates from a single baseline run without covariates.  Niclas Jonsson has implemented an automated way of searching for covariate models which involves a full NONMEM run at each step (gam42). The autocovariate model search method implemented in WFN is based heavily on the ideas used in gam42 and I appreciate the discussions with Niclas and Mats on this topic.

CAUTION: There has been well founded criticism of automated stepwise selection procedures used for model building (e.g. Wright 2001 ) and it should not be assumed that covariate models developed using nmac or other automated procedures will find models with good predictive abilities. Covariate models should only be seriously considered if you have a reasonable a priori mechanistic/biological reason for accepting them in the overall model e.g. weight would always seem to be a reasonable covariate for explaining between subject differences in clearance but it would require some very special rationale to ever consider weight as an explanatory covariate for EC50.

 

Model Building Criteria

The criteria for forward inclusion and backward elimination from the model are defined by the environment variables nmacfwd and nmacback. The default values are nmacfwd=3.841 (alpha=0.05, one df i.e. one parameter added) and nmacback=10.828 (alpha=0.001, one df i.e. one parameter removed).
 

Autocovariate Records

Autocovariate model records are identified by ";!" in the first two characters of the record. All covariate models must be specified in a conditional block and the logical expression must include the covariate data item name. More than one main model parameter may be included in each covariate conditional block.

The covariate effect variables must appear in a single autocovariate model record after the covariate model conditional blocks e.g.

 

;! fe0=fage
 

 

NMAC Examples

 

1.Age as a Covariate on E0

The following code  (theoe0.ctl) illustrates an autocovariate control stream using the theopd.dat data set and theopd.ctl as the base model.   Those THETAs which are parameters of covariate models must have parameter names which include the covariate data item name e.g. the 4th THETA below has the name eoage which is therefore linked with the age covariate model. An empirical exponential model [fage=exp(e0age*(age-40))] is used to define the fractional effect of AGE on E0 centered on a standard individual with age=40.

 

$PROB Theophylline pharmacodynamics
$DATA ..\..\..\theopd.dat IGNORE # ; path points to theopd.dat in wfn\run directory
$INPUT ID TIME THEO AGE WT SEX RACE DIAG PEFR=DV
$ESTIM MAX=9990

$THETA (0,146.,) ; pope0
$THETA (0,165.,) ; popemax
$THETA (.001,6.57,) ; popec50
$THETA 0. FIX ; e0age

$OMEGA 0.00178 ; etae0
$OMEGA 0.239 ; etaemax
$OMEGA 1.46 ; etaec50

$SIGMA 6590. ; errsd

$PRED

fe0=1 ; must include a default value for each covariate effect variable

;! if (age.gt.0) then
;! fage=exp(e0age*(age-40))
;! else
;! fage=1
;! endif
;! fe0=fage

TVE0=fe0*popE0
E0=TVE0*EXP(etaE0)
EMAX=popEMAX*EXP(etaEMAX)
EC50=popEC50*EXP(etaEC50)

Y = E0 + EMAX*THEO/(THEO+EC50) + errSD
 

2.Age and Sex as Covariates on E0

The effects of age and sex are shown in the theoe02.ctl control stream. This example shows how covariates can be combined using a multiplicative effect of AGE and SEX on E0. It also shows how a specific initial value for the covariate parameter can be defined by enclosing it in square brackets. The initial value of eoage will be -0.01. If no explicit value is given then the default value for a parameter with a FIXed value of 0 will be 0.01.

 

$PROB Theophylline pharmacodynamics autocovariate
$DATA ..\..\..\theopd.dat IGNORE #
$INPUT ID TIME THEO AGE WT SEX RACE DIAG PEFR=DV
$ESTIM MAX=9990

$THETA (0,146.,) ; pope0
$THETA (0,165.,) ; popemax
$THETA (.001,6.57,) ; popec50
$THETA 0. FIX ; e0age [-.01]
$THETA 1. FIX ; e0sex

$OMEGA 0.00178 ; etae0
$OMEGA 0.239 ; etaemax
$OMEGA 1.46 ; etaec50

$SIGMA 6590. ; errsd


$PRED

FE0=1

;! if (age.gt.0) then
;! fage=exp(e0age*(age-40))
;! else
;! fage=1
;! endif
;! if (sex.eq.0) then ;female
;! fsex=e0sex
;! else
;! fsex=1
;! endif
;! fe0=fage*fsex

TVE0=fe0*popE0
E0=TVE0*EXP(etaE0)
EMAX=popEMAX*EXP(etaEMAX)
EC50=popEC50*EXP(etaEC50)

Y = E0 + EMAX*THEO/(THEO+EC50) + errSD
 

3.Five Covariates on E0, Emax and EC50

This is the ( theoe0m5.ctl) autocovariate model extended control stream which illustrates how automatic covariate models can be built for each of the main model parameters, e0, emax and EC50.

$PROB theophylline pharmacodynamics autocovariate control stream
$DATA theopd.dat IGNORE #
$INPUT ID TIME THEO AGE WT SEX RACE DIAG PEFR=DV
$ESTIM MAX=9990
$THETA (0,146.,) ; pope0
$THETA (0,165.,) ; popemax
$THETA (.001,6.57,) ; popec50
$THETA 0. FIX ; e0age
$THETA 0. FIX ; e0wt
$THETA 1. FIX ; e0sex
$THETA 1. FIX ; e0race
$THETA 1. FIX ; e0diag
$THETA 0. FIX ; emage
$THETA 0. FIX ; emwt
$THETA 1. FIX ; emsex
$THETA 1. FIX ; emrace
$THETA 1. FIX ; emdiag
$THETA 0. FIX ; c5age
$THETA 0. FIX ; c5wt
$THETA 1. FIX ; c5sex
$THETA 1. FIX ; c5race
$THETA 1. FIX ; c5diag

$OMEGA 0.00178 ; etae0
$OMEGA 0.239 ; etaemax
$OMEGA 1.46 ; etaec50

$SIGMA 6590. ; errsd

$PRED
FE0=1
FEMAX=1
FEC50=1
;! if (age.gt.0) then
;! fag0=exp(e0age*(age-40))
;! fagm=exp(emage*(age-40))
;! fag5=exp(c5age*(age-40))
;! else
;! fag0=1
;! fagm=1
;! fag5=1
;! endif
;! if (wt.gt.0) then
;! fwt0=exp(e0wt*(wt-70))
;! fwtm=exp(emwt*(wt-70))
;! fwt5=exp(c5wt*(wt-70))
;! else
;! fwt0=1
;! fwtm=1
;! fwt5=1
;! endif
;! if (sex.eq.0) then ;female
;! fsx0=e0sex
;! fsxm=emsex
;! fsx5=c5sex
;! else
;! fsx0=1
;! fsxm=1
;! fsx5=1
;! endif
;! if (race.eq.2) then ;Polynesian
;! fra0=e0race
;! fram=emrace
;! fra5=c5race
;! else
;! fra0=1
;! fram=1
;! fra5=1
;! endif
;! if (diag.gt.1) then ;cord or cord+asthma
;! fdg0=e0diag
;! fdgm=emdiag
;! fdg5=c5diag
;! else
;! fdg0=1
;! fdgm=1
;! fdg5=1
;! endif
;! fe0=fag0*fwt0*fsx0*fra0*fdg0
;! femax=fagm*fwtm*fsxm*fram*fdgm
;! fec50=fag5*fwt5*fsx5*fra5*fdg5
 

TVE0=FE0*popE0
TVEMAX=FEMAX*popEMAX
TVEC50=FEC50*popEC50
E0=TVE0*EXP(etaE0)
EMAX=TVEMAX*EXP(etaEMAX)
EC50=TVEC50*EXP(etaEC50)
Y = E0 + EMAX*THEO/(THEO+EC50) + errSD

;$TABLE ID
;FAG0 FWT0 FSX0 FRA0 FDG0
;FAGM FWTM FSXM FRAM FDGM
;FAG5 FWT5 FSX5 FRA5 FDG5
;NOAPPEND NOPRINT ONEHEADER FILE=theopd.fit

In this example there are 3 new parameters (one each for effects on e0, emax and EC50) which are added/removed when the model is being built in the forward and elimination phases. The critical values for nmacfwd and nmacback (alpha=0.05 forward, alpha=0.001 backward) are set before running nmac. The following command tests all covariates using this autocovariate model.

set nmacfwd=7.815
set nmacback=16.266
nmac theoe0m5 AGE WT SEX RACE DIAG

The result of each run are summarised shown here (taken from nmobj.txt) (Compaq Visual Fortran Version 6.6A):
 

 

Run     

Obj     

Eval

Sig

5_WT    

5680.907

3740

4.1

4_RACE  

5691.505

1501

3.2

5_0WT   

5691.505

1501

3.2

4_WT    

5695.251

1801

3.2

5_0RACE 

5695.251

1801

3.2

5_0DIAG 

5702.767

1393

3.4

3_DIAG  

5703.414

1075

3.2

5_0AGE  

5708.209

1114

3

5_0SEX  

5709.686

5160

3.1

3_WT    

5712.141

1123

3

3_RACE  

5722.157

1059

3.7

2_SEX   

5725.068

1129

3.4

2_DIAG  

5730.271

1030

3.2

2_WT    

5732.044

619

3

2_RACE  

5734.482

798

3.3

1_AGE   

5742.557

561

3

1_DIAG  

5756.967

475

4.2

1_SEX   

5791.678

299

3.1

1_RACE  

5794.051

323

3.1

1_WT    

5797.118

266

3.2

theoe0m5

5802.086

113

4

The runnames start with a number indicating the number of covariates included in the model followed by the name of the covariate added on that run. The decision to include a covariate in the model is based on a statistical criterion. In this case, a change in objective function of 7.815 units or greater. With one covariate included the best run was 1_AGE. On the next set of runs AGE is included in all runs. At the end of the 2 covariate runs the best run was 2_SEX; the third set of runs includes 3_DIAG; the fourth set of runs led to the inclusion of 4_RACE and finally the fifth set of runs adds 5_WT. This concludes the forward inclusion phase i.e. when adding a covariate does not decrease the objective function by more than 7.815 units. In this case it also happens to be the last covariate run that could be tried. The fifth run is known as the full model because it includes all the covariates that each improved the objective function by at least 7.815.
 

The next phase is called backward elimination. Each of the covariates in the full model is removed one at a time with replacement. The statistical criterion for backward elimination is a change in objective function of 16.266. In the current example these are runs 5_0RACE, 5_0DIAG, 5_0AGE and 5_0SEX. Removal of WT is equivalent to 4_RACE.  The "0" before the covariate name indicates this covariate was removed in that run. The final model includes SEX, AGE, DIAG. WT and RACE did not meet the backward exclusion criterion for the final model. This is summarized in the nmac.txt file at the end of the autocovariate model series of runs. The upper section shows the change in objective function (Delta) as each additional covariate which meets the statistical criterion is included. The covariates retained in the final model are shown in the lower section. The change in objective function when each of these final model covariates is excluded from the final model is shown in the Delta column.
 

 

NMACFWD

7.815

NMACBACK

16.266

Forward

Delta

Included

1_AGE

59.529

AGE

2_SEX

17.489

AGE SEX

3_DIAG

21.654

AGE SEX DIAG

4_RACE

11.909

AGE SEX DIAG RACE

5_WT

10.598

AGE SEX DIAG RACE WT

Backward

Delta

Included

5_0DIAG 

21.86

DIAG

5_0AGE

27.302

AGE

5_0SEX

28.779

SEX


Note that WT is included in this model as a covariate simultaneously influencing E0, Emax and EC50. As stated above the inclusion of WT as a covariate to predict EC50 is not reasonable and I would not advocate concluding that the full model developed using nmac should be accepted as it stands.

IMPORTANT: It is essential to run only one nmac model in a single directory.  nmac will easily get confused if you keep the results of runs using different models in the same directory. You should not have any WFN results directories in the current directory when you start nmac.

Home | Installation | Control Streams | Bootstrap | Randomization Test | Visual Predictive Check | Autocovariate | Files | References