Automatic Model Development (AMD)#

The AMD tool is a general tool for fully automatic model development that selects the best model given either a dataset or a starting model. It combines the following tools: Modelsearch, Structsearch, IIVsearch, IOVsearch, ruvsearch, allometry, and COVsearch.

On this page, general information regarding the AMD workflow can be found.

Supported model types#

AMD currently supports a few different model types. For specific information regarding a given model type, please see the page for that model type.

Running#

The AMD tool is available in both Pharmpy and pharmr.

To initiate AMD in Python/R:

from pharmpy.tools import run_amd

search_space = 'COVARIATE?(@IIV,SEX,CAT);COVARIATE?(@IIV,AGE,EXP)'
res = run_amd(input='path/to/dataset',
              modeltype='basic_pk',
              administration='oral',
              cl_init=2.0,
              vc_init=5.0,
              mat_init=3.0,
              strategy='default',
              search_space=search_space,
              allometric_variable='WGT',
              occasion='VISI'
)

search_space <- 'COVARIATE?(@IIV,SEX,CAT);COVARIATE?(@IIV,AGE,EXP)'
res <- run_amd(input='path/to/dataset',
              modeltype='basic_pk',
              administration='oral',
              cl_init=2.0,
              vc_init=5.0,
              mat_init=3.0,
              strategy='default',
              search_space=search_space,
              allometric_variable='WGT',
              occasion='VISI'
)

This will take a dataset as input, where the modeltype has been specified to be a PK model and the administration is oral. AMD will search for the best structural model, IIV structure, and residual error model in the order specified by strategy (see strategy). We specify the column SEX as a categorical covariate and AGE as a continuous covariate. Finally, we declare the WGT column as our allometric_variable and VISI as our occasion column.

Arguments#

Model type specific arguments#

Argument	Description
`input`	Path to a dataset or start model object. See Input
`results`	ModelfitResults if input is a model
`modeltype`	Type of model to build (e.g. ‘basic_pk’, ‘pkpd’, ‘drug_metabolite’ or ‘tmdd’). Default is ‘basic_pk’
`administration`	Route of administration. One of ‘iv’, ‘oral’ or ‘ivoral’. Default is ‘oral’
`cl_init`	Initial estimate for the population clearance
`vc_init`	Initial estimate for the central compartment population volume
`mat_init`	Initial estimate for the mean absorption time (only for oral models)
`b_init`	Initial estimate for the baseline effect (only for PKPD models)
`emax_init`	Initial estimate for the Emax (only for PKPD models)
`ec50_init`	Initial estimate for the EC50 (only for PKPD models)
`met_init`	Initial estimate for the mean equilibration time (only for PKPD models)
`dv_types`	Dictionary of DV types for multiple DVs (e.g. `dv_types = {'target': 2}`). Default is None. (For TMDD models only)

General arguments#

Argument	Description
`strategy`	Strategy defining run order of the different subtools. Default is ‘default’
`search_space`	MFL for search space of structural and covariate models (default depends on `modeltype` and `administration`)
`lloq_limit`	Lower limit of quantification.
`lloq_method`	Method to use for handling lower limit of quantification. See `pharmpy.modeling.transform_blq()`.
`allometric_variable`	Variable to use for allometry (default is name of column described as body weight)
`occasion`	Name of occasion column
`strictness`	Strictness criteria for model selection. Default is “minimization_successful or (rounding_errors and sigdigs>= 0.1)”. If `strictness` is set to `None`, no strictness criteria are applied
`mechanistic_covariates`	List of covariates or covariate/parameter combinations to run in a separate prioritized covsearch run. Allowed elements in the list are strings of covariates or tuples with one covariate and one parameter each, e.g. [“AGE”, (“WGT”, “CL”)]. The associated effects are extracted from the given search space.
`retries_strategy`	Decide how to use the retries tool. Valid options are ‘skip’, ‘all_final’ or ‘final’. Default is ‘all_final’
`seed`	A random number generator or seed to use for steps with random sampling.
`parameter_uncertainty_method`	Parameter uncertainty method to use. Currently implemented methods are: ‘SANDWICH’, ‘SMAT’, ‘RMAT’ and ‘EFIM’. For more information about these methods see `here`.
`ignore_datainfo_fallback`	Decide whether or not to use connected datainfo object to infer information about the model. If True, all information regarding the model must be given explicitly by the user, such as the allometric variable. If False, such information is extracted using the datainfo, in the absence of arguments given by the user. Default is False.
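
As a rough illustration of the option values listed in the tables above, the sketch below collects keyword arguments in a plain dictionary and checks three of them against the documented choices before any call is made. The helper check_amd_options and the ALLOWED mapping are hypothetical names introduced here; they are not part of Pharmpy.

```python
# Hypothetical pre-flight check; not part of Pharmpy.
# The allowed values are copied from the argument tables above.
ALLOWED = {
    "administration": {"iv", "oral", "ivoral"},
    "retries_strategy": {"skip", "all_final", "final"},
    "parameter_uncertainty_method": {"SANDWICH", "SMAT", "RMAT", "EFIM"},
}


def check_amd_options(options):
    """Return a list of human-readable problems found in `options`."""
    problems = []
    for name, allowed in ALLOWED.items():
        value = options.get(name)
        # None means "not given", so the tool's own default would apply.
        if value is not None and value not in allowed:
            problems.append(f"{name}={value!r} is not one of {sorted(allowed)}")
    return problems


options = {
    "modeltype": "basic_pk",
    "administration": "oral",
    "retries_strategy": "final",
    "cl_init": 2.0,
    "vc_init": 5.0,
    "mat_init": 3.0,
}
print(check_amd_options(options))
```

If the returned list is empty, the same dictionary could be unpacked into the call, for example run_amd(input='path/to/dataset', **options).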

Input#

The AMD tool can use both a dataset and a model as input. If the input is a dataset (with corresponding datainfo file), Pharmpy will create a model with the following attributes:

Structural: one compartment, first order absorption (if administration is 'oral'), first order elimination
IIV: CL and VC with covariance ('iv') or CL and VC with covariance and MAT ('oral')
Residual: proportional error model
Estimation steps: FOCE with interaction

If the input is a model, the model needs to be a PK model.

When running the tool with administration ‘ivoral’ with a dataset as input, the dataset is required to have a CMT column with values 1 (oral doses) and 2 (IV doses). This is required for the creation of the initial one-compartment model with first order absorption. In order to easily differentiate the two doses, an administration ID (ADMID) column will be added to the data as well. This will be used in order to differentiate the different doses from one another with respect to the applied error model. If a model is used as input instead, this is not applied as it is assumed to have the correct CMT values for the connected model, along with a way of differentiating the doses from one another.
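
The CMT requirement for 'ivoral' datasets can be checked before starting a run. The sketch below is only an illustration: the function check_ivoral_cmt is hypothetical, and it assumes that dose records are the rows with AMT > 0 and that both dose types should be present in the data.

```python
# Hypothetical check; the column names AMT and CMT and the rule
# "a dose record has AMT > 0" are assumptions made for this sketch.
def check_ivoral_cmt(rows):
    """Return True if the dose records use exactly CMT 1 (oral) and 2 (IV)."""
    dose_cmts = {row["CMT"] for row in rows if row["AMT"] > 0}
    return dose_cmts == {1, 2}


rows = [
    {"ID": 1, "TIME": 0.0, "AMT": 100.0, "CMT": 1},  # oral dose
    {"ID": 1, "TIME": 1.0, "AMT": 0.0, "CMT": 2},    # observation
    {"ID": 2, "TIME": 0.0, "AMT": 50.0, "CMT": 2},   # IV dose
]
print(check_ivoral_cmt(rows))
```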

Search space#

A search space can be used to define all possible (and allowed) combinations of model features when searching for a model. Currently, the search space supports both structural and covariate models. All features are given in the same MFL string. The structural and covariate search spaces are then extracted from that string and have no effect on one another.

If no search space is given for either the structural or covariate modeling, a default search space will be applied. This will be based on the model type as well as administration. Please check the respective model type page to get information on what is used for the specific model type/administration combination.

Note

Please see the description of Model feature language (MFL) for how to define the search space for the structural and covariate models.

Example#

For a PK oral model, the default is:

ABSORPTION([FO,ZO,SEQ-ZO-FO])
ELIMINATION(FO)
LAGTIME([OFF,ON])
TRANSITS([0,1,3,10],*)
PERIPHERALS([0,1])
COVARIATE?(@IIV, @CONTINUOUS, *)
COVARIATE?(@IIV, @CATEGORICAL, CAT)

Note that defaults are overridden selectively: the structural model feature defaults will be ignored as soon as one structural model feature is explicitly given, but the covariate model defaults will stay in place, and vice versa. For instance, if one defines search_space as LAGTIME(ON), the effective search space will be as follows:

LAGTIME(ON)
COVARIATE?(@IIV, @CONTINUOUS, *)
COVARIATE?(@IIV, @CATEGORICAL, CAT)
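
The selective override described above can be sketched in a few lines. This is not Pharmpy's implementation: the function effective_search_space is hypothetical, it assumes statements are separated by semicolons, and it simplifies by treating every COVARIATE statement as part of the covariate group.

```python
# Hypothetical sketch of the override rule; not Pharmpy's implementation.
# Defaults are the PK oral defaults listed above.
DEFAULT_STRUCTURAL = [
    "ABSORPTION([FO,ZO,SEQ-ZO-FO])",
    "ELIMINATION(FO)",
    "LAGTIME([OFF,ON])",
    "TRANSITS([0,1,3,10],*)",
    "PERIPHERALS([0,1])",
]
DEFAULT_COVARIATE = [
    "COVARIATE?(@IIV, @CONTINUOUS, *)",
    "COVARIATE?(@IIV, @CATEGORICAL, CAT)",
]


def effective_search_space(search_space):
    """Combine a user MFL string with the defaults, group by group."""
    statements = [s.strip() for s in search_space.split(";") if s.strip()]
    covariate = [s for s in statements if s.startswith("COVARIATE")]
    structural = [s for s in statements if not s.startswith("COVARIATE")]
    # Each group falls back to its defaults only when the user gave nothing.
    return (structural or DEFAULT_STRUCTURAL) + (covariate or DEFAULT_COVARIATE)


for statement in effective_search_space("LAGTIME(ON)"):
    print(statement)
```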

Strategy for running AMD#

There are different strategies available for running the AMD tool; the one to use is specified in the strategy argument. Each strategy uses a combination of the subtools described below. As not all tools are applicable to all model types, the subtools used in the different steps depend on the modeltype argument. Please see the description of each tool below for more details.

Only a single strategy can be used for each AMD run. Combinations of strategies are not supported. However, each of the subtools used in AMD is available to use manually as well.

Note

Please note that the following is a general description of the different components executed by the AMD tool. Please see corresponding model type page for a detailed outline on how the different components are run.

default (default)#

If no argument is specified, ‘default’ will be used as the default strategy. This will use all tools available for the specified modeltype. The exact workflow can hence differ between model types, but a general visualization can be seen below:

reevaluation#

The reevaluation strategy is an extension of the ‘default’ strategy in which IIVsearch and RUVsearch are run a second time. It otherwise follows the same principles, so the workflow likewise depends on the model type in question.

The general order of subtools hence becomes:

SIR#

This strategy is related to ‘SRI’ and ‘RSI’. Its name is an acronym for the steps of the AMD tool that are run: Structural, IIVsearch and RUVsearch. The steps are run in that order.

SRI#

This strategy is related to ‘SIR’ and ‘RSI’. Its name is an acronym for the steps of the AMD tool that are run: Structural, RUVsearch and IIVsearch. The steps are run in that order.

RSI#

This strategy is related to ‘SIR’ and ‘SRI’. Its name is an acronym for the steps of the AMD tool that are run: RUVsearch, Structural and IIVsearch. The steps are run in that order.
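
The three acronym strategies differ only in the order of the same three steps. As a small illustration, the hypothetical helper steps_for below (not a Pharmpy function) expands each acronym into the step names used in this section.

```python
# Hypothetical helper; it only covers the three acronym strategies.
STEP_BY_LETTER = {"S": "Structural", "I": "IIVsearch", "R": "RUVsearch"}
ACRONYM_STRATEGIES = ("SIR", "SRI", "RSI")


def steps_for(strategy):
    """Expand 'SIR', 'SRI' or 'RSI' into its ordered list of steps."""
    if strategy not in ACRONYM_STRATEGIES:
        raise ValueError(f"not an acronym strategy: {strategy!r}")
    return [STEP_BY_LETTER[letter] for letter in strategy]


for name in ACRONYM_STRATEGIES:
    print(name, "->", ", ".join(steps_for(name)))
```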

Retries#

The retries_strategy argument determines whether the retries tool will be used or not, and on which models. The different options and their description can be seen below. See Retries for more details about the tool.

Strategy	Description
`'all_final'`	Retries tool run on final model from each tool.
`'final'`	Retries tool run only on final model from complete AMD workflow
`'skip'`	No retries are run for any models

When running the tool from AMD, the settings below will be used. If argument seed is set, the chosen seed or random number generator will be used for the random sampling within the tool.

Argument	Setting
`number_of_candidates`	5
`fraction`	0.1
`scale`	‘UCP’
`use_initial_estimates`	False
`prefix_name`	The name of the previously run tool
`seed`	`seed` (As defined in AMD options)
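
To make the interplay between retries_strategy and the fixed settings more concrete, the sketch below builds one settings dictionary per retries run. The function retries_plan is hypothetical, and it assumes that 'final' refers to the last tool in the workflow; the setting values are copied from the tables above.

```python
# Hypothetical sketch; setting values are copied from the tables above.
RETRIES_SETTINGS = {
    "number_of_candidates": 5,
    "fraction": 0.1,
    "scale": "UCP",
    "use_initial_estimates": False,
}


def retries_plan(tools_run, retries_strategy="all_final", seed=None):
    """Return one settings dict per retries run implied by the strategy."""
    if retries_strategy == "skip":
        targets = []
    elif retries_strategy == "final":
        targets = tools_run[-1:]   # only the last tool of the workflow
    elif retries_strategy == "all_final":
        targets = list(tools_run)  # final model of every tool
    else:
        raise ValueError(f"unknown retries_strategy: {retries_strategy!r}")
    # prefix_name is the name of the previously run tool.
    return [dict(RETRIES_SETTINGS, prefix_name=tool, seed=seed) for tool in targets]


plan = retries_plan(["modelsearch", "iivsearch", "ruvsearch"], "final", seed=1234)
print(plan)
```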

Strategy components#

The subtools that are run in each step, along with their respective arguments, are dependent on the model type given. Below follows a general description of each of the steps. As different model types can perform the same step differently, please see the specific model type page for more details.

Structural#

This component of the AMD run is usually found at the beginning of a strategy and aims to find the best structural model for the specified model type. This often includes adding structural covariates as well as determining the structure of the compartment system.

Structural covariates are user defined covariate effects that are not tested but rather forcefully added to the input model. These effects are given within the search space in the following way:

COVARIATE(CL, WGT, POW)
COVARIATE?(@IIV, @CATEGORICAL, *)

In this search space, the power covariate effect of WGT on CL is interpreted as a structural covariate (due to the missing “?”) while the other statement is interpreted as an exploratory covariate effect and will be explored in a later covsearch run.
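
The distinction rests entirely on the “?” after COVARIATE. The sketch below shows one way to separate the two kinds of statements in an MFL string; split_covariate_statements is a hypothetical helper working on plain strings, not a Pharmpy function.

```python
def split_covariate_statements(search_space):
    """Split COVARIATE statements into structural and exploratory lists."""
    structural, exploratory = [], []
    # Accept statements separated by semicolons or by line breaks.
    for raw in search_space.replace("\n", ";").split(";"):
        statement = raw.strip()
        if statement.startswith("COVARIATE?"):
            exploratory.append(statement)  # tested later by covsearch
        elif statement.startswith("COVARIATE"):
            structural.append(statement)   # added directly to the model
    return structural, exploratory


mfl = "COVARIATE(CL, WGT, POW)\nCOVARIATE?(@IIV, @CATEGORICAL, *)"
structural, exploratory = split_covariate_statements(mfl)
print("structural:", structural)
print("exploratory:", exploratory)
```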

The structural component of an AMD run is heavily dependent on which model type is being analyzed. It is possible that both Modelsearch and Structsearch are used when developing the structural model, e.g. for TMDD and drug metabolite models, Modelsearch will develop the PK structural model and Structsearch will develop the TMDD / drug metabolite model.

IIVsearch#

This subtool selects the IIV structure. The tool will find both the number of IIVs and their covariance structure. See IIVsearch for more details about the tool.

Which IIVs are added depends on the model type. For example, for PKPD models, IIVs are only added to the PD parameters within the model.

Residual#

This subtool selects the residual error model connected to the model. See ruvsearch for more details about the tool.

IOVsearch#

This subtool selects the IOV structure and tries to remove corresponding IIVs if possible, see IOVsearch for more details about the tool. If no argument for occasion is given, this tool will not be run.

Allometry#

This subtool applies allometry to the clearance and volume parameters of the input model.

Note

Please note that if ignore_datainfo_fallback is set to True and no allometric variable is given, this tool will not be run. See allometry for more details about the tool.

Covariates#

This subtool selects which covariate effects to apply, see COVsearch for more details about the tool.

Covariate effects for this stage are specified in the search space by specifying the effect with a “?”, as the following example suggests:

COVARIATE?(@IIV, @CATEGORICAL, *)

Covariate effects are split into two types at this stage: mechanistic and exploratory covariate effects. Both are tested for the model, but the mechanistic covariate effects are tested in a separate initial covsearch run. These covariates are specified using the mechanistic_covariates argument.

Given mechanistic_covariates = ["AGE", ("WGT", "CL")], the search space in the first block below is split into a mechanistic search space (second block), which is tested first, and an exploratory search space (third block):

COVARIATE?([CL,V], [AGE, WGT], *)
COVARIATE?(Q, WGT, *)

COVARIATE?([CL,V], AGE, *)
COVARIATE?(CL, WGT, *)

COVARIATE?([V,Q], WGT, *)
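
The split shown above can be reproduced with a short sketch. It is an illustration only, not Pharmpy's implementation: split_mechanistic is a hypothetical function, the search space is written out as expanded (parameter, covariate) pairs with the effects left out, and mechanistic entries follow the (covariate, parameter) tuple order given in the arguments table.

```python
def split_mechanistic(pairs, mechanistic_covariates):
    """Split (parameter, covariate) pairs into mechanistic and exploratory."""
    mechanistic, exploratory = [], []
    for parameter, covariate in pairs:
        is_mechanistic = (
            covariate in mechanistic_covariates                  # e.g. "AGE"
            or (covariate, parameter) in mechanistic_covariates  # e.g. ("WGT", "CL")
        )
        target = mechanistic if is_mechanistic else exploratory
        target.append((parameter, covariate))
    return mechanistic, exploratory


# Expanded form of COVARIATE?([CL,V], [AGE, WGT], *) and COVARIATE?(Q, WGT, *)
pairs = [("CL", "AGE"), ("CL", "WGT"), ("V", "AGE"), ("V", "WGT"), ("Q", "WGT")]
mechanistic, exploratory = split_mechanistic(pairs, ["AGE", ("WGT", "CL")])
print("mechanistic:", mechanistic)
print("exploratory:", exploratory)
```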

Note

Please note that if ignore_datainfo_fallback is set to True and no covariates are given, this tool will not be run.

The search space of effects given to this tool should include all possible (and allowed) covariate effects for the resulting model. This means that covariate effects that are a part of the input model but not the given search space will be removed.

Note

As allometric scaling can be interpreted as a power covariate effect, these effects will be added to the search space to avoid removing them during a covsearch run, if allometry was a part of the strategy.
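
The conditions under which IOVsearch, Allometry and Covariates are not run, as stated in this section, can be summarized in one place. The function skipped_steps below is a hypothetical summary of those notes, not part of Pharmpy; covariates_given stands for whether any covariates were supplied by the user.

```python
def skipped_steps(occasion=None, allometric_variable=None,
                  covariates_given=False, ignore_datainfo_fallback=False):
    """List the steps that the notes in this section say will not be run."""
    skipped = []
    if occasion is None:
        skipped.append("IOVsearch")
    # Without the datainfo fallback, nothing is inferred from the dataset.
    if ignore_datainfo_fallback and allometric_variable is None:
        skipped.append("Allometry")
    if ignore_datainfo_fallback and not covariates_given:
        skipped.append("Covariates")
    return skipped


print(skipped_steps(ignore_datainfo_fallback=True))
print(skipped_steps(occasion="VISI", allometric_variable="WGT",
                    covariates_given=True, ignore_datainfo_fallback=True))
```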

Results#

The results object contains the final selected model and various summary tables, all of which can be accessed in the results object as well as files in .csv/.json format.

The summary_tool table contains information such as which feature each model candidate has, the difference to the start model (in this case comparing BIC), and final ranking:

	selected_model	description	ofv	dofv	n_params	d_params
tool
start	base	Start model	-2884.045903	0.000000e+00	8	0
modelsearch	final_modelsearch	TRANSITS(1, DEPOT)	-2947.239386	6.319348e+01	10	2
structural_retries	final_structural_retries		-2947.239387	1.109957e-06	10	0
iivsearch	final_iivsearch	[MAT]+[CL,VC]	-2946.993113	-2.462741e-01	9	-1
iivsearch_retries	final_iivsearch_retries		-2946.993113	3.334812e-07	9	0
ruvsearch	final_ruvsearch		-2946.993113	0.000000e+00	9	0
residual_retries	final_residual_retries		-2946.993113	0.000000e+00	9	0
iovsearch	final_iovsearch	IIV([MAT]+[CL,VC]);IOV([MAT]+[CL])	-3058.649490	1.116564e+02	11	2
iovsearch_retries	final_iovsearch_retries		-3058.649491	7.069912e-07	11	0
allometry	final_allometry	Allometry model	-3116.479783	5.783029e+01	11	0
allometry_retries	final_allometry_retries		-3116.479783	5.644783e-07	11	0
covsearch	final_covsearch	(CL-RF-lin)	-3147.542559	3.106278e+01	12	1
covariates_retries	final_covariates_retries		-3147.542560	8.349175e-07	12	0
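
Judging from the printed values, dofv in this table matches the OFV of the preceding row minus the OFV of the current row, up to rounding of the printed figures. This is a reading of the example output, not a statement about how Pharmpy computes the column. The sketch below redoes that arithmetic for the first rows.

```python
# OFV values copied from the first rows of the table above.
ofv_by_tool = [
    ("start", -2884.045903),
    ("modelsearch", -2947.239386),
    ("structural_retries", -2947.239387),
    ("iivsearch", -2946.993113),
]

previous_ofv = ofv_by_tool[0][1]
dofv_by_tool = {}
for tool, ofv in ofv_by_tool:
    # Positive dofv means the OFV dropped compared with the preceding row.
    dofv_by_tool[tool] = previous_ofv - ofv
    previous_ofv = ofv

for tool, dofv in dofv_by_tool.items():
    print(f"{tool:20s} {dofv: .6f}")
```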

To see information about the actual model runs, such as minimization status, estimation time, and parameter estimates, you can look at the summary_models table. The table is generated with pharmpy.modeling.summarize_modelfit_results().

			description	minimization_successful	errors_found	warnings_found	ofv	runtime_total	estimation_runtime	POP_CL_estimate	POP_CL_SE	POP_CL_RSE	...	POP_VCDIU_RSE	POP_VCNYHA_estimate	POP_VCNYHA_SE	POP_VCNYHA_RSE	POP_VCRF_estimate	POP_VCRF_SE	POP_VCRF_RSE	POP_VCSEX_estimate	POP_VCSEX_SE	POP_VCSEX_RSE
tool	step	model
modelsearch	0	input		True	0	0	-2884.045903	8.0	1.54	29.1611	1.015640	0.034829	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
	1	modelsearch_run1	ABSORPTION(SEQ-ZO-FO)	True	0	0	-2940.562719	28.0	7.35	29.3101	1.021290	0.034844	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
		modelsearch_run2	ABSORPTION(ZO)	True	0	1	-2896.997407	37.0	23.01	29.4010	0.108359	0.003686	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
		modelsearch_run3	LAGTIME(ON)	True	0	0	-2940.584839	26.0	9.30	29.3363	1.030450	0.035125	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
		modelsearch_run4	PERIPHERALS(1)	True	0	1	-2884.045386	42.0	11.25	29.1691	30.914800	1.059848	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
covariates_retries	1	covariates_retries_run1		True	0	0	-3147.542559	186.0	80.91	27.7217	0.697464	0.025159	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
		covariates_retries_run2		True	0	0	-3147.542560	186.0	82.27	27.7218	0.653186	0.023562	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
		covariates_retries_run3		True	0	0	-3147.542559	255.0	141.29	27.7217	0.669764	0.024160	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
		covariates_retries_run4		True	0	0	-3147.542559	326.0	134.16	27.7217	2.051690	0.074010	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
		covariates_retries_run5		True	0	0	-3147.542559	212.0	110.62	27.7217	0.665942	0.024022	...	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

156 rows × 142 columns

Finally, you can see a summary of any errors and warnings of the final selected model in summary_errors. See pharmpy.tools.summarize_errors() for information on the content of this table.

			time	message
model	category	error_no

Final model#

Some plots and tables on the final model can be found both in the amd report and in the results object.

	estimates	RSE
POP_CL	27.7218	2.4%
POP_VC	111.1820	2.5%
POP_MAT	0.3565	18.5%
POP_MDT	0.3054	14.9%
POP_CLRF	0.8118	16.4%
IIV_MAT	0.9261	14.1%
IIV_CL	0.1746	13.4%
IIV_CL_IIV_VC	0.7258	16.7%
IIV_VC	0.1661	12.1%
OMEGA_IOV_1	0.2794	29.9%
OMEGA_IOV_2	0.1338	9.2%
sigma	0.2243	2.8%


Examples#

TMDD#

Run AMD for a TMDD model:

from pharmpy.modeling import read_model
from pharmpy.tools import run_amd

start_model = read_model('path/to/model')

res = run_amd(
            modeltype='tmdd',
            input=start_model,
            search_space='PERIPHERALS([1,2]);ELIMINATION([FO,ZO])',
            dv_types={'drug': 1, 'target': 2, 'complex': 3}
            )

library(pharmr)

start_model <- read_model('path/to/model')

res <- run_amd(
            modeltype='tmdd',
            input=start_model,
            search_space='PERIPHERALS([1,2]);ELIMINATION([FO,ZO])',
            dv_types=list('drug'=1, 'target'=2, 'complex'=3)
            )

Note

The name of the DVID column must be “DVID”.

PKPD#

Run AMD for a PKPD model:

from pharmpy.modeling import read_model
from pharmpy.tools import run_amd

start_model = read_model('path/to/model')

res = run_amd(
            modeltype='pkpd',
            input=start_model,
            search_space='DIRECTEFFECT(*)',
            b_init=0.1,
            emax_init=1,
            ec50_init=0.1,
            met_init=0.4,
            )

library(pharmr)

start_model <- read_model('path/to/model')

res <- run_amd(
            modeltype='pkpd',
            input=start_model,
            search_space='DIRECTEFFECT(*)',
            b_init=0.1,
            emax_init=1,
            ec50_init=0.1,
            met_init=0.4
            )

Note

The input model must be a PK model with a PKPD dataset. The name of the DVID column must be “DVID”.