resample_data

pharmpy.modeling.resample_data(dataset_or_model, group, resamples=1, stratify=None, sample_size=None, replace=False, name_pattern='resample_{}', name=None)

Iterate over resamples of a dataset.

The dataset is grouped on the group column, and groups are then selected at random, with or without replacement, to form a new dataset. The groups are renumbered from 1 upwards so that they stay separate in the new dataset, even when the same group is drawn more than once.
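The procedure described above can be sketched in plain Python. This is a minimal illustration that assumes the dataset is a list of row dictionaries; it is not pharmpy's implementation, and the helper name `resample_rows` and its `rng` argument are hypothetical.

```python
import random
from collections import defaultdict


def resample_rows(rows, group, sample_size=None, replace=False, rng=None):
    """Resample whole groups of rows and renumber the groups from 1.

    Hypothetical helper for illustration only; not part of pharmpy.
    """
    if rng is None:
        rng = random.Random()
    # Collect the rows of each group, keeping first-seen group order.
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group]].append(row)
    group_ids = list(by_group)
    if sample_size is None:
        sample_size = len(group_ids)
    # Pick groups with or without replacement.
    if replace:
        picked = rng.choices(group_ids, k=sample_size)
    else:
        picked = rng.sample(group_ids, k=sample_size)
    # Renumber so that a group drawn twice stays separate in the output.
    new_rows = []
    for new_id, old_id in enumerate(picked, start=1):
        for row in by_group[old_id]:
            new_rows.append({**row, group: new_id})
    return new_rows, picked


rows = [
    {"ID": 10, "TIME": 0, "DV": 1.2},
    {"ID": 10, "TIME": 1, "DV": 0.9},
    {"ID": 20, "TIME": 0, "DV": 2.1},
    {"ID": 30, "TIME": 0, "DV": 1.7},
    {"ID": 30, "TIME": 1, "DV": 1.1},
]
new_rows, picked = resample_rows(rows, "ID", replace=True, rng=random.Random(0))
```

Here `picked` holds the original group identifiers in the order they were drawn, while `new_rows` carries the renumbered identifiers 1, 2, 3.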

Parameters:
  • dataset_or_model (pd.DataFrame or Model) – Dataset or Model to use

  • group (str) – Name of column to group by

  • resamples (int) – Number of resamples (iterations) to make

  • stratify (str) – Name of column to use for stratification. The values in the stratification column must be identical within each group so that the stratum of the group is uniquely determined; otherwise a ValueError is raised.

  • sample_size (int) – The number of groups to sample. The default is the number of groups in the dataset. With stratification, the default is to sample each stratum in proportion to its share of the dataset. A dictionary of specific sample sizes for each stratum can also be supplied.

  • replace (bool) – Whether to sample with replacement (True) or without replacement (False)

  • name_pattern (str) – Name pattern for the generated datasets. The resample number, starting from 1, is substituted for the {} placeholder.

  • name (str) – Alternative to name_pattern when only one resample is made
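The stratification rules above can be illustrated with a small sketch. It assumes a list of row dictionaries and the default total sample size (the number of groups), in which case proportional sampling draws from each stratum as many groups as it contains. The helper names `strata_of_groups` and `stratum_sample_sizes` are hypothetical and are not part of pharmpy.

```python
from collections import Counter


def strata_of_groups(rows, group, stratify):
    """Map each group to its single stratum, or raise ValueError."""
    strata = {}
    for row in rows:
        g, s = row[group], row[stratify]
        # setdefault returns the stratum already recorded for g, if any.
        if strata.setdefault(g, s) != s:
            raise ValueError(
                f"group {g!r} has more than one value of {stratify!r}"
            )
    return strata


def stratum_sample_sizes(strata, sizes=None):
    """Number of groups to draw per stratum.

    A dictionary of explicit sizes is used as given. Otherwise, as an
    assumed default, each stratum gets as many draws as it has groups.
    """
    if sizes is not None:
        return dict(sizes)
    return dict(Counter(strata.values()))


rows = [
    {"ID": 10, "SEX": "F"},
    {"ID": 10, "SEX": "F"},
    {"ID": 20, "SEX": "M"},
    {"ID": 30, "SEX": "F"},
]
strata = strata_of_groups(rows, "ID", "SEX")
print(stratum_sample_sizes(strata))                     # {'F': 2, 'M': 1}
print(stratum_sample_sizes(strata, {"F": 1, "M": 1}))   # {'F': 1, 'M': 1}
```

A row such as `{"ID": 10, "SEX": "M"}` added to the data above would make group 10 ambiguous, and `strata_of_groups` would raise ValueError.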

Returns:

iterator – An iterator yielding tuples of a resampled DataFrame and a list of the resampled groups, in the order they were drawn