Automatic Model Development (AMD)#
The AMD tool is a general tool for fully automatic model development that selects the best model given either a dataset or a starting model. It combines the following tools: Modelsearch, Structsearch, IIVsearch, IOVsearch, ruvsearch, allometry, and COVsearch.
On this page, general information regarding the AMD workflow can be found.
Supported model types#
AMD currently supports a few different model types. For specific information regarding a given model type, see the respective page.
Running#
The AMD tool is available in both Pharmpy and pharmr.
To initiate AMD in Python/R:
from pharmpy.tools import run_amd

res = run_amd(input='path/to/dataset',
              modeltype='basic_pk',
              administration='oral',
              cl_init=2.0,
              vc_init=5.0,
              mat_init=3.0,
              strategy='default',
              search_space='LET(CATEGORICAL, [SEX]); LET(CONTINUOUS, [AGE])',
              allometric_variable='WGT',
              occasion='VISI')

library(pharmr)

res <- run_amd(input='path/to/dataset',
               modeltype='basic_pk',
               administration='oral',
               cl_init=2.0,
               vc_init=5.0,
               mat_init=3.0,
               strategy='default',
               search_space='LET(CATEGORICAL, [SEX]); LET(CONTINUOUS, [AGE])',
               allometric_variable='WGT',
               occasion='VISI')
This will take a dataset as input, where the modeltype has been specified to be a PK model and administration is oral. AMD will search for the best structural model, IIV structure, and residual error model in the order specified by strategy (see strategy). We specify the column SEX as a categorical covariate and AGE as a continuous covariate. Finally, we declare the WGT column as our allometric_variable and VISI as our occasion column.
Arguments#
Model type specific arguments#
Argument | Description |
---|---|
input | Path to a dataset or start model object. See Input |
results | ModelfitResults if input is a model |
modeltype | Type of model to build (e.g. ‘basic_pk’, ‘pkpd’, ‘drug_metabolite’ or ‘tmdd’). Default is ‘basic_pk’ |
administration | Route of administration. One of ‘iv’, ‘oral’ or ‘ivoral’. Default is ‘oral’ |
cl_init | Initial estimate for the population clearance |
vc_init | Initial estimate for the central compartment population volume |
mat_init | Initial estimate for the mean absorption time (only for oral models) |
b_init | Initial estimate for the baseline effect (only for PKPD models) |
emax_init | Initial estimate for the Emax (only for PKPD models) |
ec50_init | Initial estimate for the EC50 (only for PKPD models) |
met_init | Initial estimate for the mean equilibration time (only for PKPD models) |
dv_types | Dictionary of DV types for multiple DVs (see the TMDD example) |
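The constraints stated in the table above can be checked before starting a long run. The following is only a minimal sketch: `check_amd_arguments` is a hypothetical helper, not part of Pharmpy, and it encodes just the rules written in the table (the allowed administration values and which initial estimates apply to which model type).

```python
# Hypothetical pre-flight check; the rules are copied from the argument table.
ALLOWED_ADMINISTRATIONS = {"iv", "oral", "ivoral"}
PKPD_ONLY_INITS = {"b_init", "emax_init", "ec50_init", "met_init"}


def check_amd_arguments(modeltype="basic_pk", administration="oral", **inits):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if administration not in ALLOWED_ADMINISTRATIONS:
        problems.append(f"unknown administration: {administration!r}")
    # mat_init is described as "only for oral models"
    if administration == "iv" and "mat_init" in inits:
        problems.append("mat_init is only used for oral models")
    # b_init, emax_init, ec50_init and met_init are described as PKPD only
    if modeltype != "pkpd":
        for name in sorted(PKPD_ONLY_INITS & set(inits)):
            problems.append(f"{name} is only used for PKPD models")
    return problems


print(check_amd_arguments(administration="iv", mat_init=3.0))
```

An empty list means the combination is consistent with the table; it says nothing about whether the values themselves are sensible initial estimates.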
General arguments#
Argument | Description |
---|---|
strategy | Strategy defining run order of the different subtools. Default is ‘default’ |
search_space | MFL for search space of structural and covariate models (default depends on model type and administration) |
lloq_limit | Lower limit of quantification. |
lloq_method | Method to use for handling lower limit of quantification. |
allometric_variable | Variable to use for allometry (default is name of column described as body weight) |
occasion | Name of occasion column |
strictness | Strictness criteria for model selection. Default is “minimization_successful or (rounding_errors and sigdigs >= 0.1)” |
mechanistic_covariates | List of covariates or covariate/parameter combinations to run in a separate prioritized covsearch run. Allowed elements in the list are strings of covariates or tuples with one covariate and parameter each, e.g. [“AGE”, (“WGT”, “CL”)]. The associated effects are extracted from the given search space. |
retries_strategy | Decide how to use the retries tool. Valid options are ‘skip’, ‘all_final’ or ‘final’. Default is ‘all_final’ |
seed | A random number generator or seed to use for steps with random sampling. |
parameter_uncertainty_method | Parameter uncertainty method to use. Currently implemented methods are: ‘SANDWICH’, ‘SMAT’, ‘RMAT’ and ‘EFIM’. |
ignore_datainfo_fallback | Decide whether or not to use the connected datainfo object to infer information about the model. If True, all information regarding the model must be given explicitly by the user, such as the allometric variable. If False, such information is extracted using the datainfo, in the absence of arguments given by the user. Default is False. |
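To make the default strictness expression concrete, the sketch below spells out its boolean logic in plain Python. The function name is hypothetical, and this only mirrors the expression as written in the table; how Pharmpy itself parses and evaluates the strictness string is not shown here.

```python
def passes_default_strictness(minimization_successful, rounding_errors, sigdigs):
    """Mirror of: minimization_successful or (rounding_errors and sigdigs >= 0.1)."""
    return bool(minimization_successful or (rounding_errors and sigdigs >= 0.1))


# A successful minimization is accepted regardless of the other two values.
print(passes_default_strictness(True, False, 0.0))
# A run that stopped on rounding errors is accepted only with enough significant digits.
print(passes_default_strictness(False, True, 2.5))
print(passes_default_strictness(False, True, 0.05))
```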
Input#
The AMD tool can use both a dataset and a model as input. If the input is a dataset (with corresponding datainfo file), Pharmpy will create a model with the following attributes:
Structural: one compartment, first order absorption (if administration is 'oral'), first order elimination
IIV: CL and VC with covariance ('iv') or CL and VC with covariance and MAT ('oral')
Residual: proportional error model
Estimation steps: FOCE with interaction
If the input is a model, the model needs to be a PK model.
When running the tool with administration ‘ivoral’ and a dataset as input, the dataset is required to have a CMT column with values 1 (oral doses) and 2 (IV doses). This is required for the creation of the initial one-compartment model with first order absorption. An administration ID (ADMID) column is also added to the data so that the two kinds of doses can be told apart when the error model is applied. If a model is used as input instead, none of this is applied: the model is assumed to have the correct CMT values, along with a way of differentiating the doses from one another.
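The CMT requirement for ‘ivoral’ can be verified on the data before running AMD. The sketch below uses plain dictionaries instead of a data frame. Everything beyond the CMT coding is an assumption made for illustration: the helper name, the use of AMT > 0 to identify dose records, and the ADMID coding (1 for oral, 2 for IV), which may differ from what Pharmpy actually writes.

```python
def tag_ivoral_doses(records):
    """Check dose records for CMT 1 (oral) or 2 (IV) and add an ADMID value.

    `records` is a list of dicts with the (assumed) keys "AMT" and "CMT".
    Returns new dicts; the input is left unchanged.
    """
    tagged = []
    for row in records:
        new_row = dict(row)
        if row.get("AMT", 0) > 0:  # assumption: AMT > 0 marks a dose record
            if row["CMT"] not in (1, 2):
                raise ValueError(f"dose record with unsupported CMT: {row['CMT']}")
            # assumption: ADMID 1 = oral, 2 = IV; the real coding may differ
            new_row["ADMID"] = 1 if row["CMT"] == 1 else 2
        tagged.append(new_row)
    return tagged


rows = [{"AMT": 100, "CMT": 1}, {"AMT": 0, "CMT": 2}, {"AMT": 50, "CMT": 2}]
for row in tag_ivoral_doses(rows):
    print(row)
```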
Search space#
A search space can be used to define all possible (and allowed) combinations of model features when searching for a model. Currently, the search space supports both structural and covariate models. All features are given in the same MFL string. The different search spaces are then extracted from there and have no effect on one another.
If no search space is given for either the structural or covariate modeling, a default search space will be applied. This will be based on the model type as well as administration. Please check the respective model type page to get information on what is used for the specific model type/administration combination.
Note
Please see the description of Model feature language (MFL) for how to define the search space for the structural and covariate models.
Example#
For a PK oral model, the default is:
ABSORPTION([FO,ZO,SEQ-ZO-FO])
ELIMINATION(FO)
LAGTIME([OFF,ON])
TRANSITS([0,1,3,10],*)
PERIPHERALS([0,1])
COVARIATE?(@IIV, @CONTINUOUS, *)
COVARIATE?(@IIV, @CATEGORICAL, CAT)
Note that defaults are overridden selectively: structural model feature defaults will be ignored as soon as one structural model feature is explicitly given, but the covariate model defaults will stay in place, and vice versa. For instance, if one defines search_space as LAGTIME(ON), the effective search space will be as follows:
LAGTIME(ON)
COVARIATE?(@IIV, @CONTINUOUS, *)
COVARIATE?(@IIV, @CATEGORICAL, CAT)
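The selective override described above can be sketched as follows. This is an illustration only, not Pharmpy's implementation: it splits the MFL string naively on `;` and newlines, treats every statement starting with `COVARIATE` as a covariate feature, and (as an assumption) passes `LET(...)` definitions through without letting them trigger an override.

```python
DEFAULT_STRUCTURAL = [
    "ABSORPTION([FO,ZO,SEQ-ZO-FO])",
    "ELIMINATION(FO)",
    "LAGTIME([OFF,ON])",
    "TRANSITS([0,1,3,10],*)",
    "PERIPHERALS([0,1])",
]
DEFAULT_COVARIATE = [
    "COVARIATE?(@IIV, @CONTINUOUS, *)",
    "COVARIATE?(@IIV, @CATEGORICAL, CAT)",
]


def effective_search_space(user_mfl=""):
    """Merge user statements with the PK oral defaults, group by group."""
    statements = [s.strip() for s in user_mfl.replace("\n", ";").split(";") if s.strip()]
    definitions = [s for s in statements if s.startswith("LET")]
    covariate = [s for s in statements if s.startswith("COVARIATE")]
    structural = [s for s in statements if s not in definitions and s not in covariate]
    # Each group falls back to its defaults only if the user gave nothing for it.
    return definitions + (structural or DEFAULT_STRUCTURAL) + (covariate or DEFAULT_COVARIATE)


for statement in effective_search_space("LAGTIME(ON)"):
    print(statement)
```

Running it with `LAGTIME(ON)` reproduces the three-line effective search space listed above.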
Strategy for running AMD#
Several strategies are available for running the AMD tool; the one to use is specified in the strategy argument. They all use a combination of the different subtools described below. As not all tools are applicable to all model types, the subtools used in the different steps depend on the modeltype argument. Please see the description of each tool below for more details.
Only a single strategy can be used for each AMD run. Combinations of strategies are not supported. However, each of the subtools used in AMD is available to use manually as well.
Note
Please note that the following is a general description of the different components executed by the AMD tool. Please see corresponding model type page for a detailed outline on how the different components are run.
default (default)#
If no argument is specified, ‘default’ will be used as the strategy. This will use all tools available for the specified modeltype. The exact workflow can hence differ between model types, but a general visualization of this can be seen below:
reevaluation#
The reevaluation strategy is an extension of the ‘default’ strategy, defined by re-running IIVsearch and RUVsearch. The tool otherwise follows the exact same principles, so the workflow depends on the model type in question.
The general order of subtools hence becomes:
SIR#
This strategy is related to ‘SRI’ and ‘RSI’ and is an acronym for running the Structural, IIVsearch and RUVsearch steps of the AMD tool. The workflow hence becomes as follows:
SRI#
This strategy is related to ‘SIR’ and ‘RSI’ and is an acronym for running the Structural, RUVsearch and IIVsearch steps of the AMD tool. The workflow hence becomes as follows:
RSI#
This strategy is related to ‘SIR’ and ‘SRI’ and is an acronym for running the RUVsearch, Structural and IIVsearch steps of the AMD tool. The workflow hence becomes as follows:
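Since the three acronym strategies differ only in the order of the same three steps, the run order can be read directly off the letters. The sketch below shows that mapping; the step labels and the helper name are chosen here for illustration and are not Pharmpy identifiers.

```python
STEP_BY_LETTER = {"S": "structural", "I": "iivsearch", "R": "ruvsearch"}


def steps_for_acronym(strategy):
    """Expand 'SIR', 'SRI' or 'RSI' into its ordered list of steps."""
    if sorted(strategy) != sorted("SIR"):
        raise ValueError(f"not a permutation of S, I and R: {strategy!r}")
    return [STEP_BY_LETTER[letter] for letter in strategy]


for name in ("SIR", "SRI", "RSI"):
    print(name, steps_for_acronym(name))
```

The ‘default’ and ‘reevaluation’ strategies involve more subtools and are deliberately not covered by this helper.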
Retries#
The retries_strategy argument determines whether the retries tool will be used or not, and on which models. The different options and their descriptions can be seen below. See Retries for more details about the tool.
Strategy | Description |
---|---|
all_final | Retries tool run on final model from each tool. |
final | Retries tool run only on final model from complete AMD workflow |
skip | No retries are run for any models |
When running the tool from AMD, the settings below will be used. If the seed argument is set, the chosen seed or random number generator will be used for the random sampling within the tool.
Argument | Setting |
---|---|
number_of_candidates | 5 |
fraction | 0.1 |
scale | ‘UCP’ |
use_initial_estimates | False |
prefix_name | The name of the previously run tool |
|
|
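Which models are passed to the retries tool under each retries_strategy option can be summarized in a few lines. This is a hedged sketch of the selection rule as described in this section, with a hypothetical helper name and plain model names standing in for model objects.

```python
def models_to_retry(tool_final_models, retries_strategy="all_final"):
    """Pick the models to retry.

    `tool_final_models` holds the final model of each tool, in run order,
    so its last element is the final model of the complete AMD workflow.
    """
    if retries_strategy == "skip":
        return []
    if retries_strategy == "final":
        return list(tool_final_models[-1:])
    if retries_strategy == "all_final":
        return list(tool_final_models)
    raise ValueError(f"unknown retries_strategy: {retries_strategy!r}")


finals = ["modelsearch_run7", "iivsearch_run34", "best_ruvsearch_2"]
print(models_to_retry(finals, "final"))
```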
Strategy components#
The subtools that are run in each step, along with their respective arguments, are dependent on the model type given. Below follows a general description of each of the steps. As different model types can perform the same step differently, please see the specific model type page for more details.
Structural#
This component of the AMD run is usually found at the beginning of a strategy and aims to find the best structural model for the specified model type. This often includes structural covariates along with the structure of the compartment system.
Structural covariates are user-defined covariate effects that are not tested but are instead added directly to the input model. These effects are given within the search space in the following way:
COVARIATE(CL, WGT, POW)
COVARIATE?(@IIV, @CATEGORICAL, *)
In this search space, the power covariate effect of WGT on CL is interpreted as a structural covariate (due to the missing “?”) while the other statement is interpreted as an exploratory covariate effect and will be explored in a later covsearch run.
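The role of the “?” can be illustrated by sorting the statements of a search space into the two groups. The sketch below is a naive string-based classification written for this page, not the MFL parser used by Pharmpy, and the function name is hypothetical.

```python
def split_covariate_statements(mfl):
    """Return (structural, exploratory) covariate statements from an MFL string."""
    statements = [s.strip() for s in mfl.replace("\n", ";").split(";") if s.strip()]
    # No "?" after COVARIATE: effect is added to the model without testing.
    structural = [s for s in statements if s.startswith("COVARIATE(")]
    # "?" after COVARIATE: effect is explored in a later covsearch run.
    exploratory = [s for s in statements if s.startswith("COVARIATE?(")]
    return structural, exploratory


mfl = "COVARIATE(CL, WGT, POW)\nCOVARIATE?(@IIV, @CATEGORICAL, *)"
structural, exploratory = split_covariate_statements(mfl)
print("structural: ", structural)
print("exploratory:", exploratory)
```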
The structural component of an AMD run is heavily dependent on which model type is being analyzed. Both Modelsearch and Structsearch may be used when developing the structural model. For TMDD and drug metabolite models, for example, Modelsearch will develop the PK structural model and Structsearch will develop the TMDD / drug metabolite model.
IIVsearch#
This subtool selects the IIV structure. The tool will find both the number of IIVs and their covariance structure. See IIVsearch for more details about the tool.
Which IIVs are added depends on the model type. For example, for PKPD models, IIVs are only added to the PD parameters within the model.
Residual#
This subtool selects the residual error model connected to the model. See ruvsearch for more details about the tool.
IOVsearch#
This subtool selects the IOV structure and tries to remove corresponding IIVs if possible. See IOVsearch for more details about the tool. If no argument for occasion is given, this tool will not be run.
Allometry#
This subtool applies allometry to clearance and volume parameters of the input model.
Note
Please note that if ignore_datainfo_fallback is set to True and no allometric variable is given, this tool will not be run. See allometry for more details about the tool.
Covariates#
This subtool selects which covariate effects to apply, see COVsearch for more details about the tool.
Covariate effects for this stage are specified in the search space by specifying the effect with a “?”, as the following example suggests:
COVARIATE?(@IIV, @CATEGORICAL, *)
Covariate effects are split into two types at this stage: mechanistic and exploratory. Both are tested for the model, but the mechanistic covariate effects are tested in a separate initial covsearch run. These covariates are specified using the mechanistic_covariates argument.
Given the mechanistic covariates mechanistic_covariates = [AGE, (CL,WGT)], the following search space would be evaluated accordingly:
COVARIATE?([CL,V], [AGE, WGT], *)
COVARIATE?(Q, WGT, *)
COVARIATE?([CL,V], AGE, *)
COVARIATE?(CL, WGT, *)
COVARIATE?([V,Q], WGT, *)
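In the listing above, the first two statements read as the full search space, the next two as the mechanistic part and the last one as the remaining exploratory part; that reading is an interpretation of the flattened listing. Under that assumption, the split can be sketched as below. The parsing is deliberately naive, the helper names are hypothetical, and tuples are accepted in either order because this page shows both (“WGT”, “CL”) and (CL,WGT).

```python
import re


def _split_top_level(text):
    """Split on commas that are not inside square brackets."""
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    parts.append(current.strip())
    return parts


def _as_list(arg):
    """Turn 'Q' or '[CL,V]' into a list of names."""
    return [name.strip() for name in arg.strip("[]").split(",")]


def expand(statement):
    """Expand one COVARIATE?(...) statement into (parameter, covariate, effect) triples."""
    match = re.fullmatch(r"COVARIATE\?\((.*)\)", statement.strip())
    if match is None:
        raise ValueError(f"not an exploratory covariate statement: {statement!r}")
    params, covs, effect = _split_top_level(match.group(1))
    return [(p, c, effect) for p in _as_list(params) for c in _as_list(covs)]


def split_mechanistic(search_space, mechanistic):
    """Partition all effects into (mechanistic, exploratory)."""
    mech, expl = [], []
    for statement in search_space:
        for param, cov, effect in expand(statement):
            wanted = (cov in mechanistic
                      or (param, cov) in mechanistic
                      or (cov, param) in mechanistic)
            (mech if wanted else expl).append((param, cov, effect))
    return mech, expl


space = ["COVARIATE?([CL,V], [AGE, WGT], *)", "COVARIATE?(Q, WGT, *)"]
mech, expl = split_mechanistic(space, ["AGE", ("CL", "WGT")])
print("mechanistic:", mech)
print("exploratory:", expl)
```

The mechanistic group contains AGE on CL and V plus WGT on CL, and the exploratory group contains WGT on V and Q, which matches the last three statements of the listing.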
Note
Please note that if ignore_datainfo_fallback is set to True and no covariates are given, this tool will not be run.
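The conditions under which IOVsearch, allometry and covsearch are skipped, as stated in the notes of this section, can be collected in one place. The helper below is a hypothetical summary of those three rules only; what happens when ignore_datainfo_fallback is False and the datainfo itself lacks the information is not covered by this page, so the sketch assumes the tool is run in that case.

```python
def skipped_subtools(occasion=None, allometric_variable=None,
                     covariates_given=False, ignore_datainfo_fallback=False):
    """List the subtools that will not be run for the given arguments."""
    skipped = []
    if occasion is None:
        skipped.append("iovsearch")
    if ignore_datainfo_fallback and allometric_variable is None:
        skipped.append("allometry")
    if ignore_datainfo_fallback and not covariates_given:
        skipped.append("covsearch")
    return skipped


print(skipped_subtools(occasion="VISI", ignore_datainfo_fallback=True))
```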
The search space of effects given to this tool should include all possible (and allowed) covariate effects for the resulting model. This means that covariate effects that are a part of the input model but not the given search space will be removed.
Note
As allometric scaling can be interpreted as a power covariate effect, these effects will be added to the search space to avoid removing them during a covsearch run, if allometry was a part of the strategy.
Results#
The results object contains the final selected model and various summary tables, all of which can be accessed in the results object as well as files in .csv/.json format.
The summary_tool table contains information such as which feature each model candidate has, the difference to the start model (in this case comparing BIC), and final ranking:
tool | selected_model | description | ofv | dofv | n_params | d_params |
---|---|---|---|---|---|---|
start | start | Start model | NaN | 0.000000 | 8 | 0 |
modelsearch | modelsearch_run7 | PERIPHERALS(1);TRANSITS(3, DEPOT) | -2364.853749 | NaN | 12 | 4 |
iivsearch | iivsearch_run34 | [VC]+[CL,MDT] | -2397.309843 | 32.456094 | 11 | -1 |
ruvsearch | best_ruvsearch_2 | time_varying2+IIV_on_RUV | -2496.804117 | 99.494274 | 13 | 2 |
iovsearch | iovsearch_run6 | IIV([VC]+[CL,MDT]);IOV([CL]) | -2619.709509 | 122.905392 | 14 | 1 |
allometry | scaled_model | Allometry model | -2645.684455 | 25.974946 | 14 | 0 |
covsearch | scaled_model | Allometry model | -2645.684455 | 0.000000 | 14 | 0 |
retries | scaled_model | Allometry model | -2645.684455 | 0.000000 | 14 | 0 |
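In the example table above, each dofv value from iivsearch onward appears to equal the ofv of the previous tool's selected model minus the ofv of the current one. The short sketch below recomputes the column from the ofv values in the table; it is a worked check of that reading, not a description of how Pharmpy computes the column.

```python
# (tool, ofv of the selected model), copied from the table above
OFV_BY_TOOL = [
    ("modelsearch", -2364.853749),
    ("iivsearch", -2397.309843),
    ("ruvsearch", -2496.804117),
    ("iovsearch", -2619.709509),
    ("allometry", -2645.684455),
    ("covsearch", -2645.684455),
    ("retries", -2645.684455),
]


def delta_ofv(ofv_by_tool):
    """Return (tool, dofv) pairs, where dofv = previous ofv - current ofv."""
    pairs = zip(ofv_by_tool, ofv_by_tool[1:])
    return [(tool, round(prev - curr, 6)) for (_, prev), (tool, curr) in pairs]


for tool, dofv in delta_ofv(OFV_BY_TOOL):
    print(f"{tool:<12}{dofv:>12.6f}")
```

A dofv of zero for covsearch and retries reflects that the allometry model was kept as the selected model in both steps.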
To see information about the actual model runs, such as minimization status, estimation time, and parameter estimates, you can look at the summary_models table. The table is generated with pharmpy.modeling.summarize_modelfit_results().
tool | step | model | description | minimization_successful | errors_found | warnings_found | ofv | runtime_total | estimation_runtime | POP_CL_estimate | POP_CL_SE | POP_CL_RSE | ... | POP_VCAGE_RSE | POP_CLSEX_estimate | POP_CLSEX_SE | POP_CLSEX_RSE | POP_MDTSEX_estimate | POP_MDTSEX_SE | POP_MDTSEX_RSE | POP_VCSEX_estimate | POP_VCSEX_SE | POP_VCSEX_RSE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
modelsearch | 0 | start | Start model | False | 1 | 0 | NaN | 7.0 | 1.40 | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
modelsearch | 1 | modelsearch_run1 | LAGTIME(ON) | True | 0 | 1 | -2272.860804 | 35.0 | 24.72 | 25.63140 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
modelsearch | 1 | modelsearch_run2 | PERIPHERALS(1) | True | 0 | 1 | -2091.335594 | 22.0 | 12.91 | 24.36500 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
modelsearch | 1 | modelsearch_run3 | TRANSITS(3, DEPOT) | True | 0 | 1 | -2091.332746 | 516.0 | 505.27 | 24.36780 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
modelsearch | 1 | modelsearch_run4 | TRANSITS(5, DEPOT) | True | 0 | 1 | -2091.332840 | 1089.0 | 1078.46 | 24.37230 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
retries | 1 | retries_run1 | Allometry model | False | 2 | 1 | -2032.002236 | 2700.0 | 2687.82 | 1.91807 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
retries | 1 | retries_run2 | Allometry model | False | 2 | 1 | -2530.976558 | 1663.0 | 1651.66 | 24.09260 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
retries | 1 | retries_run3 | Allometry model | False | 2 | 1 | -2476.102607 | 2178.0 | 2166.09 | 23.88190 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
retries | 1 | retries_run4 | Allometry model | False | 2 | 0 | -2600.287460 | 3740.0 | 3728.76 | 24.26100 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
retries | 1 | retries_run5 | Allometry model | False | 2 | 1 | -2123.583364 | 2210.0 | 2196.82 | 26.26170 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
76 rows × 127 columns
Finally, you can see a summary of any errors and warnings of the final selected model in summary_errors. See pharmpy.tools.summarize_errors() for information on the content of this table.
model | category | error_no | time | message |
---|---|---|---|---|
Final model#
Some plots and tables on the final model can be found both in the AMD report and in the results object.
parameter | estimates | RSE |
---|---|---|
POP_CL | 24.0637 | nan% |
POP_VC | 22.6158 | nan% |
POP_MAT | 0.0313 | nan% |
POP_QP1 | 108.3610 | nan% |
POP_VP1 | 72.2972 | nan% |
POP_MDT | 0.9278 | nan% |
time_varying | 2.2106 | nan% |
IIV_VC | 1.1818 | nan% |
IIV_CL | 0.2668 | nan% |
IIV_CL_IIV_MDT | 0.1068 | nan% |
IIV_MDT | 0.3711 | nan% |
IIV_RUV1 | 0.1777 | nan% |
OMEGA_IOV_2 | 0.1395 | nan% |
sigma | 0.1436 | nan% |
Examples#
TMDD#
Run AMD for a TMDD model:
from pharmpy.modeling import read_model
from pharmpy.tools import run_amd

# Read the start model and run AMD with the TMDD model type
start_model = read_model('path/to/model')
res = run_amd(
    modeltype='tmdd',
    input=start_model,
    search_space='PERIPHERALS([1,2]);ELIMINATION([FO,ZO])',
    dv_types={'drug': 1, 'target': 2, 'complex': 3}
)

library(pharmr)

start_model <- read_model('path/to/model')
res <- run_amd(
    modeltype='tmdd',
    input=start_model,
    search_space='PERIPHERALS([1,2]);ELIMINATION([FO,ZO])',
    dv_types=list('drug'=1, 'target'=2, 'complex'=3)
)
Note
The name of the DVID column must be “DVID”.
PKPD#
Run AMD for a PKPD model:
from pharmpy.modeling import read_model
from pharmpy.tools import run_amd

# Read the PK start model and run AMD with the PKPD model type
start_model = read_model('path/to/model')
res = run_amd(
    modeltype='pkpd',
    input=start_model,
    search_space='DIRECTEFFECT(*)',
    b_init=0.1,
    emax_init=1,
    ec50_init=0.1,
    met_init=0.4,
)

library(pharmr)

start_model <- read_model('path/to/model')
res <- run_amd(
    modeltype='pkpd',
    input=start_model,
    search_space='DIRECTEFFECT(*)',
    b_init=0.1,
    emax_init=1,
    ec50_init=0.1,
    met_init=0.4
)
Note
The input model must be a PK model with a PKPD dataset. The name of the DVID column must be “DVID”.