Modelsearch#

The Modelsearch tool is a general tool to decide the best structural model given a base model and a search space of model features. The tool supports different algorithms and selection criteria.

Running#

The modelsearch tool is available both in Pharmpy/pharmr and from the command line.

To initiate modelsearch in Python/R:

from pharmpy.modeling import read_model
from pharmpy.tools import read_modelfit_results, run_modelsearch

start_model = read_model('path/to/model')
start_model_results = read_modelfit_results('path/to/model')
res = run_modelsearch(search_space='ABSORPTION([FO,ZO]);PERIPHERALS([0,1]);LAGTIME(ON)',
                      algorithm='reduced_stepwise',
                      model=start_model,
                      results=start_model_results,
                      iiv_strategy='absorption_delay',
                      rank_type='bic',
                      cutoff=None)

This takes the input model (model) and uses search_space as the search space, meaning that zero and first order absorption, zero or one peripheral compartment, and lag time will be tried. The tool will use the 'reduced_stepwise' search algorithm. Since iiv_strategy is set to 'absorption_delay', IIV is added only to absorption delay parameters; IIVs on other structural parameters (such as mean absorption time) will not be added to candidates. The candidate models will be ranked on BIC (rank_type) with the default cutoff, which for BIC is None/NULL.

A similar run can be started from the command line:

pharmpy run modelsearch path/to/model 'PERIPHERALS(1);LAGTIME(ON)' 'reduced_stepwise' --iiv_strategy 'absorption_delay' --rank_type 'bic'

Arguments#

For a more detailed description of each argument, see the respective section on this page.

Mandatory#

| Argument | Description |
|---|---|
| search_space | Search space to test |
| algorithm | Algorithm to use (e.g. 'reduced_stepwise') |
| model | Start model |
| results | ModelfitResults of the start model |

Optional#

| Argument | Description |
|---|---|
| rank_type | Which selection criterion to rank models on, e.g. OFV (default is BIC) |
| cutoff | Cutoff for the ranking function; models below the cutoff are excluded (default is None/NULL) |
| iiv_strategy | If/how IIV should be added to candidate models (default is to add to absorption delay parameters). See Adding IIV to the candidate models during search |
| strictness | Strictness criteria for model selection. Default is "minimization_successful or (rounding_errors and sigdigs>= 0.1)" |
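As a reading aid, the default strictness expression can be spelled out in plain Python. This is only a sketch of the boolean logic: the function name passes_default_strictness and its arguments are hypothetical, the handling of a missing sigdigs value is an assumption, and the tool evaluates the expression internally in its own way.

```python
import math


def passes_default_strictness(minimization_successful, rounding_errors, sigdigs):
    """Hypothetical helper mirroring the default strictness expression:
    minimization_successful or (rounding_errors and sigdigs >= 0.1)
    """
    # Assumption of this sketch: a missing or NaN significant-digits value
    # fails the sigdigs part of the check.
    sigdigs_ok = (
        sigdigs is not None
        and not math.isnan(sigdigs)
        and sigdigs >= 0.1
    )
    return bool(minimization_successful or (rounding_errors and sigdigs_ok))


# A successful minimization passes regardless of the other flags.
print(passes_default_strictness(True, False, None))
# Rounding errors are tolerated only if enough significant digits are reported.
print(passes_default_strictness(False, True, 3.2))
print(passes_default_strictness(False, True, float('nan')))
```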

The search space#

The model feature search space is the set of all combinations of model features that are allowed for the final model. The supported features cover absorption, absorption delay, elimination, and distribution. The search space is given as a string with a specific grammar, according to the Model Feature Language (MFL) (see detailed description). If an attribute is not given, the default value for that attribute will be used, as seen below. If the input model is not part of the given search space, a base model will be created. This is done by applying the fewest possible transformations to the input model so that the base model becomes part of the given search space.

| Category | Default |
|---|---|
| ABSORPTION | INST |
| ELIMINATION | FO |
| PERIPHERALS | 0 |
| TRANSITS | 0, DEPOT |
| LAGTIME | OFF |
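To make the role of the defaults concrete, the following sketch splits a search space string into its categories and fills in the default for every category that is not mentioned. It is a deliberately simplified illustration, not the MFL parser used by the tool: the names parse_search_space and with_defaults are hypothetical, only statements of the form NAME(X) or NAME([X,Y]) separated by semicolons are handled, and the TRANSITS default is reduced to the transit count.

```python
import re

# Defaults from the table above. TRANSITS is simplified to the count only;
# the DEPOT part of its default is left out of this sketch.
DEFAULTS = {
    'ABSORPTION': ['INST'],
    'ELIMINATION': ['FO'],
    'PERIPHERALS': ['0'],
    'TRANSITS': ['0'],
    'LAGTIME': ['OFF'],
}


def parse_search_space(search_space):
    """Hypothetical, simplified parser for 'NAME(X)' / 'NAME([X,Y])' statements."""
    features = {}
    for statement in search_space.split(';'):
        statement = statement.strip()
        if not statement:
            continue
        match = re.fullmatch(r'(\w+)\(\[?([^\[\]()]*)\]?\)', statement)
        if match is None:
            raise ValueError(f'Cannot parse statement: {statement!r}')
        name, options = match.groups()
        features[name] = [opt.strip() for opt in options.split(',')]
    return features


def with_defaults(features):
    """Use the default option list for every category that was not given."""
    return {name: list(features.get(name, default))
            for name, default in DEFAULTS.items()}


space = with_defaults(
    parse_search_space('ABSORPTION([FO,ZO]);PERIPHERALS([0,1]);LAGTIME(ON)')
)
for category, options in space.items():
    print(category, options)
```

In this example ELIMINATION and TRANSITS are not part of the string, so they fall back to FO and 0 respectively.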

The logical flow for the creation of the base model can be seen below. In summary, given an input model and a search space, the first step is to examine whether the input model is part of the search space. If so, the model features of the input model are filtered from the search space, since features already present in the input model do not need to be searched. After filtration, all transformations that are left will be examined. If the input model is not part of the search space, a base model that is part of the search space is created first. The model features of the base model are then filtered from the search space, which leaves the transformations to be examined.

[Flowchart: "Input model" and "Search space" are combined into "Input + search space". If the input model ∉ search space: "Create a base", followed by "Filter model features from search space". If the input model ∈ search space: "Filter model features from search space" directly.]

Some examples of this workflow:

| Search space | Input model | Base model | Transformations to apply |
|---|---|---|---|
| ABSORPTION([FO,ZO]) ELIMINATION(FO) PERIPHERALS([1,2]) | ABSORPTION(FO) ELIMINATION(ZO) TRANSITS(0) PERIPHERALS(0) LAGTIME(ON) | ABSORPTION(FO) ELIMINATION(FO) TRANSITS(0) PERIPHERALS(1) LAGTIME(OFF) | ABSORPTION(ZO) PERIPHERALS(2) |
| ABSORPTION(FO) ELIMINATION(FO) TRANSITS([1,2]) | ABSORPTION(FO) ELIMINATION(ZO) TRANSITS(0) PERIPHERALS(2) LAGTIME(OFF) | ABSORPTION(FO) ELIMINATION(FO) TRANSITS(1) PERIPHERALS(0) LAGTIME(OFF) | TRANSITS(2) |
| ABSORPTION([FO,ZO]) ELIMINATION([FO,ZO,MM]) PERIPHERALS([0,1,2]) LAGTIME([OFF,ON]) | ABSORPTION(FO) ELIMINATION(FO) TRANSITS(0) PERIPHERALS(0) LAGTIME(OFF) | Not needed since input model is part of search space | ABSORPTION(ZO) ELIMINATION([ZO,MM]) PERIPHERALS([1,2]) LAGTIME(ON) |
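The workflow in the examples above can be sketched with plain dictionaries. This is an illustration under stated assumptions, not the tool's implementation: all function names are hypothetical, the search space is expected to already contain the default for every category that was not given, and "fewest transformations" is approximated by switching each disallowed feature to the first allowed option.

```python
def in_search_space(model, space):
    """True if every feature of the model is one of the allowed options."""
    return all(model[name] in options for name, options in space.items())


def create_base(model, space):
    """Keep features that are already allowed; otherwise use the first allowed
    option (an assumption standing in for 'fewest transformations')."""
    return {name: model[name] if model[name] in options else options[0]
            for name, options in space.items()}


def remaining_transformations(base, space):
    """Filter the features of the base model out of the search space."""
    remaining = {}
    for name, options in space.items():
        left = [opt for opt in options if opt != base[name]]
        if left:
            remaining[name] = left
    return remaining


def plan_search(input_model, space):
    """Return the model to start from and the transformations left to test."""
    if in_search_space(input_model, space):
        base = input_model
    else:
        base = create_base(input_model, space)
    return base, remaining_transformations(base, space)


# First example: TRANSITS and LAGTIME are not given, so defaults are used.
space = {'ABSORPTION': ['FO', 'ZO'], 'ELIMINATION': ['FO'], 'TRANSITS': [0],
         'PERIPHERALS': [1, 2], 'LAGTIME': ['OFF']}
input_model = {'ABSORPTION': 'FO', 'ELIMINATION': 'ZO', 'TRANSITS': 0,
               'PERIPHERALS': 0, 'LAGTIME': 'ON'}
base, todo = plan_search(input_model, space)
print(base)
print(todo)
```

For the first example this yields a base model with first order elimination, one peripheral compartment and no lag time, and leaves ABSORPTION(ZO) and PERIPHERALS(2) to be tested.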

Algorithms#

The tool can conduct the model search using different algorithms. The available algorithms can be seen in the table below.

| Algorithm | Description |
|---|---|
| 'exhaustive' | All possible combinations of the search space are tested |
| 'exhaustive_stepwise' | Add one feature in each step in all possible orders |
| 'reduced_stepwise' | Add one feature in each step in all possible orders. After each feature layer, choose best model between models with same features |
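The difference between the three algorithms is easiest to see by enumerating the candidates they would generate. The sketch below does this for a list of feature labels. It is an illustrative enumeration only: the function names are hypothetical, score is a stand-in for the ranking value of a fitted model (lower is better), and the feature exclusions and sequential handling of peripherals that the tool applies are ignored.

```python
from itertools import combinations, permutations


def exhaustive(features):
    """All non-empty feature combinations; the order of addition is irrelevant."""
    return [frozenset(combo)
            for size in range(1, len(features) + 1)
            for combo in combinations(features, size)]


def exhaustive_stepwise(features):
    """All ordered ways of adding one feature per step."""
    return [path
            for size in range(1, len(features) + 1)
            for path in permutations(features, size)]


def reduced_stepwise(features, score):
    """Stepwise search that, after each layer, keeps only the best path
    (lowest score) among paths containing the same set of features."""
    tested = []
    kept = [()]  # start from the base model with no added features
    for _ in features:
        layer = [path + (f,) for path in kept for f in features if f not in path]
        tested.extend(layer)
        best = {}
        for path in layer:
            key = frozenset(path)
            if key not in best or score(path) < score(best[key]):
                best[key] = path
        kept = list(best.values())
    return tested


features = ['ABSORPTION(ZO)', 'PERIPHERALS(1)', 'LAGTIME(ON)']
dummy_score = len  # placeholder; a real score would come from a model fit
print(len(exhaustive(features)))
print(len(exhaustive_stepwise(features)))
print(len(reduced_stepwise(features, dummy_score)))
```

With three features the stepwise variants test more candidates than the exhaustive one because the same feature set is reached in several orders. The reduced variant saves runs in later layers by continuing from only one model per feature set.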

Common behaviours between algorithms#

Feature combination exclusions#

Some combinations of features are excluded in algorithms that are performed stepwise. The following combinations are never run:

| Feature A | Feature B |
|---|---|
| ABSORPTION(ZO) | TRANSITS |
| ABSORPTION(SEQ-ZO-FO) | TRANSITS |
| ABSORPTION(SEQ-ZO-FO) | LAGTIME(ON) |
| ABSORPTION(INST) | LAGTIME(ON) |
| ABSORPTION(INST) | TRANSITS |
| LAGTIME(ON) | TRANSITS |

Additionally, peripheral compartments are always run sequentially, i.e. the algorithm will never add more than one compartment at a given step. This is done in order to allow for better initial estimates from previous peripherals.
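A check for the excluded combinations could look like the sketch below. The pairs are copied from the list above; the function is_excluded is hypothetical, and it assumes that a candidate is described by the labels of its added features, so that a plain TRANSITS entry matches any label starting with it, such as TRANSITS(1).

```python
# Excluded pairs copied from the list above.
EXCLUDED_PAIRS = [
    ('ABSORPTION(ZO)', 'TRANSITS'),
    ('ABSORPTION(SEQ-ZO-FO)', 'TRANSITS'),
    ('ABSORPTION(SEQ-ZO-FO)', 'LAGTIME(ON)'),
    ('ABSORPTION(INST)', 'LAGTIME(ON)'),
    ('ABSORPTION(INST)', 'TRANSITS'),
    ('LAGTIME(ON)', 'TRANSITS'),
]


def is_excluded(features):
    """True if the candidate contains both members of any excluded pair.

    Assumption of this sketch: 'features' lists only added features, and a
    pattern matches every label that starts with it.
    """
    def present(pattern):
        return any(feature.startswith(pattern) for feature in features)

    return any(present(a) and present(b) for a, b in EXCLUDED_PAIRS)


print(is_excluded(['ABSORPTION(ZO)', 'TRANSITS(1)']))
print(is_excluded(['ABSORPTION(FO)', 'TRANSITS(1)']))
```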

Comparing and ranking candidates#

The supplied rank_type will be used to compare a set of candidate models and rank them. Each candidate model will be compared to the input model. A cutoff may also be provided if the user does not want to use the default. The following rank functions are available:

| Rank type | Description |
|---|---|
| 'ofv' | ΔOFV. Default is to not rank candidates with ΔOFV < cutoff (default 3.84) |
| 'aic' | ΔAIC. Default is to rank all candidates if no cutoff is provided. |
| 'bic' | ΔBIC (mixed). Default is to rank all candidates if no cutoff is provided. |

Information about how BIC is calculated can be found in pharmpy.modeling.calculate_bic().
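The comparison itself amounts to a difference and an optional threshold. The sketch below assumes that the difference is computed as the reference value minus the candidate value, so that a larger difference means a larger improvement, and that candidates below the cutoff are left unranked. The function rank_candidates is hypothetical, and the BIC values are taken from the summary_tool example on this page.

```python
def rank_candidates(reference_value, candidates, cutoff=None):
    """Rank candidates by their improvement over the reference model.

    delta = reference_value - candidate_value, so a larger delta is better.
    Candidates with delta < cutoff are left out of the ranking.
    """
    deltas = {name: reference_value - value for name, value in candidates.items()}
    eligible = [name for name, delta in deltas.items()
                if cutoff is None or delta >= cutoff]
    ranked = sorted(eligible, key=lambda name: deltas[name], reverse=True)
    return ranked, deltas


bic_values = {'modelsearch_run1': -1264.339892, 'modelsearch_run2': -1270.486580}

# BIC without a cutoff: every candidate is ranked.
ranked, deltas = rank_candidates(-1273.792080, bic_values)
print(ranked)

# The same numbers with a cutoff of 3.84: no candidate improves enough.
ranked_with_cutoff, _ = rank_candidates(-1273.792080, bic_values, cutoff=3.84)
print(ranked_with_cutoff)
```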

The Modelsearch results#

The results contain various summary tables, which can be accessed both in the results object and as files in .csv/.json format. The name of the selected best model (based on the input selection criteria) is also included.

Consider a modelsearch run with the search space of zero and first order absorption, adding zero or one peripheral compartment and lagtime:

res = run_modelsearch(search_space='ABSORPTION([FO,ZO]);PERIPHERALS([0,1]);LAGTIME(ON)',
                      algorithm='reduced_stepwise',
                      model=start_model,
                      results=start_model_results,
                      iiv_strategy='absorption_delay',
                      rank_type='bic',
                      cutoff=None)

The summary_tool table contains information such as which features each candidate model has, the difference to the start model (in this case comparing BIC), and the final ranking:

| model | description | n_params | d_params | dbic | bic | rank | parent_model |
|---|---|---|---|---|---|---|---|
| base | LAGTIME(ON) | 8 | 0 | 0.000000 | -1273.792080 | 1.0 | base |
| modelsearch_run2 | PERIPHERALS(1) | 11 | 3 | -3.305500 | -1270.486580 | 2.0 | base |
| modelsearch_run1 | ABSORPTION(ZO) | 9 | 1 | -9.452187 | -1264.339892 | 3.0 | base |
| modelsearch_run3 | ABSORPTION(ZO);PERIPHERALS(1) | 11 | 3 | NaN | NaN | NaN | modelsearch_run1 |
| modelsearch_run4 | PERIPHERALS(1);ABSORPTION(ZO) | 11 | 3 | NaN | NaN | NaN | modelsearch_run2 |
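Since the tables are also available as .csv files, they can be post-processed without the tool itself. The sketch below reads the rows shown above from CSV text and picks out the best-ranked model and the models that were not ranked. The CSV layout is an assumption made for this example; the actual file name and column layout of the exported file are not taken from the tool.

```python
import csv
import io
import math

# The summary_tool rows shown above, written as CSV text for this sketch.
SUMMARY_TOOL_CSV = """model,description,n_params,d_params,dbic,bic,rank,parent_model
base,LAGTIME(ON),8,0,0.000000,-1273.792080,1.0,base
modelsearch_run2,PERIPHERALS(1),11,3,-3.305500,-1270.486580,2.0,base
modelsearch_run1,ABSORPTION(ZO),9,1,-9.452187,-1264.339892,3.0,base
modelsearch_run3,ABSORPTION(ZO);PERIPHERALS(1),11,3,NaN,NaN,NaN,modelsearch_run1
modelsearch_run4,PERIPHERALS(1);ABSORPTION(ZO),11,3,NaN,NaN,NaN,modelsearch_run2
"""

rows = list(csv.DictReader(io.StringIO(SUMMARY_TOOL_CSV)))

# A NaN rank means the model was not ranked.
ranked = [row for row in rows if not math.isnan(float(row['rank']))]
unranked = [row['model'] for row in rows if math.isnan(float(row['rank']))]
best = min(ranked, key=lambda row: float(row['rank']))

print('best model:', best['model'], best['description'])
print('not ranked:', unranked)
```

In this run the base model is ranked first, and the two candidates that combine both features have no rank.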

To see information about the actual model runs, such as minimization status, estimation time, and parameter estimates, you can look at the summary_models table. The table is generated with pharmpy.tools.summarize_modelfit_results().

| step | model | description | minimization_successful | errors_found | warnings_found | ofv | runtime_total | estimation_runtime | POP_CL_estimate | POP_CL_SE | POP_CL_RSE | ... | POP_MDT_RSE | IIV_MDT_estimate | IIV_MDT_SE | IIV_MDT_RSE | POP_QP1_estimate | POP_QP1_SE | POP_QP1_RSE | POP_VP1_estimate | POP_VP1_SE | POP_VP1_RSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | input | | True | 0 | 0 | -1292.186761 | 4.0 | 0.10 | 24.5328 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | base | LAGTIME(ON) | True | 0 | 0 | -1313.362311 | 6.0 | 0.20 | 24.2927 | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | modelsearch_run1 | ABSORPTION(ZO) | True | 0 | 1 | -1305.577305 | 8.0 | 0.31 | 24.1880 | NaN | NaN | ... | NaN | 0.000001 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | modelsearch_run2 | PERIPHERALS(1) | True | 0 | 1 | -1325.551467 | 8.0 | 0.67 | 24.7779 | NaN | NaN | ... | NaN | 0.000001 | NaN | NaN | 96.4011 | NaN | NaN | 47.4555 | NaN | NaN |
| 1 | modelsearch_run3 | ABSORPTION(ZO);PERIPHERALS(1) | False | 2 | 1 | -1308.965498 | 9.0 | 1.08 | 24.2319 | NaN | NaN | ... | NaN | 0.000001 | NaN | NaN | 1098.1900 | NaN | NaN | 77.8857 | NaN | NaN |
| 1 | modelsearch_run4 | PERIPHERALS(1);ABSORPTION(ZO) | False | 2 | 0 | -1308.040354 | 7.0 | 0.96 | 24.1760 | NaN | NaN | ... | NaN | 0.011456 | NaN | NaN | 766.4720 | NaN | NaN | 56.9327 | NaN | NaN |

6 rows × 40 columns

Finally, you can see a summary of different errors and warnings in summary_errors. See pharmpy.tools.summarize_errors() for information on the content of this table.

| model | category | error_no | time | message |
|---|---|---|---|---|
| modelsearch_run1 | WARNING | 0 | 2024-10-22 13:09:42.519 | PARAMETER ESTIMATE IS NEAR ITS BOUNDARY |
| modelsearch_run2 | WARNING | 0 | 2024-10-22 13:09:43.936 | PARAMETER ESTIMATE IS NEAR ITS BOUNDARY |
| modelsearch_run3 | ERROR | 1 | 2024-10-22 13:09:52.316 | MINIMIZATION TERMINATED\nDUE TO MAX. NO. OF FUNCTION EVALUATIONS EXCEEDED |
| modelsearch_run3 | ERROR | 2 | 2024-10-22 13:09:52.316 | NO. OF SIG. DIGITS UNREPORTABLE |
| modelsearch_run3 | WARNING | 0 | 2024-10-22 13:09:52.316 | PARAMETER ESTIMATE IS NEAR ITS BOUNDARY |
| modelsearch_run4 | ERROR | 0 | 2024-10-22 13:09:53.750 | MINIMIZATION TERMINATED\nDUE TO MAX. NO. OF FUNCTION EVALUATIONS EXCEEDED |
| modelsearch_run4 | ERROR | 1 | 2024-10-22 13:09:53.750 | NO. OF SIG. DIGITS UNREPORTABLE |
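The entries of summary_errors relate directly to the errors_found and warnings_found columns of summary_models: counting the messages per model and category reproduces those numbers. The sketch below does the counting on the (model, category) pairs copied from the example above; the helper found is hypothetical.

```python
from collections import Counter

# (model, category) pairs copied from the summary_errors example above.
MESSAGES = [
    ('modelsearch_run1', 'WARNING'),
    ('modelsearch_run2', 'WARNING'),
    ('modelsearch_run3', 'ERROR'),
    ('modelsearch_run3', 'ERROR'),
    ('modelsearch_run3', 'WARNING'),
    ('modelsearch_run4', 'ERROR'),
    ('modelsearch_run4', 'ERROR'),
]

counts = Counter(MESSAGES)


def found(model, category):
    """Number of messages of the given category for the given model."""
    return counts[(model, category)]


for model in ('modelsearch_run1', 'modelsearch_run3', 'modelsearch_run4'):
    print(model, 'errors:', found(model, 'ERROR'),
          'warnings:', found(model, 'WARNING'))
```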