ColumnInfo#

class pharmpy.model.ColumnInfo(name, type='unknown', unit=1, scale='ratio', continuous=None, categories=None, drop=False, datatype='float64', descriptor=None)[source]#

Bases: Immutable

Information about one data column

Parameters:

name (str) – Column name
type (str) – Type (see the “type” attribute)
unit (str) – Unit (see the “unit” attribute)
scale (str) – Scale of measurement (see the “scale” attribute)
continuous (bool) – True if continuous or False if discrete
categories (Optional[Union[tuple, dict]]) – Tuple of all possible categories or dict from value to label for each category
drop (bool) – Whether the column should be dropped (i.e. barred from being used)
datatype (str) – Pandas datatype or special Pharmpy datatype (see the “datatype” attribute)
descriptor (str) – Descriptor (kind) of data

Attributes Summary

`categories`	List or dict of allowed categories
`continuous`	Is the column data continuous
`datatype`	Column datatype
`descriptor`	Kind of data
`drop`	Should this column be dropped
`name`	Column name
`scale`	Scale of measurement
`symbol`	Symbol having the column name
`type`	Type of column
`unit`	Unit of the column data

Methods Summary

`convert_datatype_to_pd_dtype`(datatype)	Convert Pharmpy datatype to pandas dtype
`convert_pd_dtype_to_datatype`(dtype)	Convert pandas dtype to Pharmpy datatype
`create`(name[, type, unit, scale, ...])
`from_dict`(d)
`get_all_categories`()	Get a list of all categories
`is_categorical`()	Check if the column data is categorical
`is_integer`()	Check if the column datatype is integral
`is_numerical`()	Check if the column data is numerical
`replace`(**kwargs)	Replace properties and create a new ColumnInfo
`to_dict`()

Attributes Documentation

categories#: List or dict of allowed categories

continuous#

Is the column data continuous

True for continuous data and False for discrete. Note that nominal and ordinal data have to be discrete.

datatype#

Column datatype

datatype	Description	Size	Range	NA allowed?
int8	Signed integer	8 bits	-128 to +127	No
int16	Signed integer	16 bits	-32,768 to +32,767	No
int32	Signed integer	32 bits	-2,147,483,648 to +2,147,483,647	No
int64	Signed integer	64 bits	-9,223,372,036,854,775,808 to +9,223,372,036,854,775,807	No
uint8	Unsigned integer	8 bits	0 to 255	No
uint16	Unsigned integer	16 bits	0 to 65,535	No
uint32	Unsigned integer	32 bits	0 to 4,294,967,295	No
uint64	Unsigned integer	64 bits	0 to 18,446,744,073,709,551,615	No
float16	Binary float	16 bits	≈ ±6.55×10⁴	Yes
float32	Binary float	32 bits	≈ ±3.4×10³⁸	Yes
float64	Binary float	64 bits	≈ ±1.8×10³⁰⁸	Yes
float128	Binary float	128 bits	≈ ±1.2×10⁴⁹³²	Yes
nmtran-time	NM-TRAN time	n		No
nmtran-date	NM-TRAN date	n		No
str	General string	n		No

The default, and most common, datatype is float64.
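
The integer ranges in the table follow from the bit width of each datatype. As a minimal sketch using only the Python standard library, the bounds can be derived as shown here. The helper `integer_range` is hypothetical and not part of Pharmpy, and two's complement representation is assumed for the signed types.

```python
def integer_range(bits: int, signed: bool) -> tuple[int, int]:
    """Return the (min, max) values representable with the given bit width."""
    if signed:
        # Two's complement: half of the bit patterns encode negative values
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    # Unsigned: all bit patterns encode non-negative values
    return 0, 2 ** bits - 1


for name, bits, signed in [("int8", 8, True), ("uint8", 8, False), ("int16", 16, True)]:
    low, high = integer_range(bits, signed)
    print(f"{name}: {low} to {high}")
```

A value outside the range of the chosen datatype cannot be stored in that column, so a wider datatype is needed for larger values.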

descriptor#

Kind of data

descriptor	Description
age	Age (since birth)
body height	Human body height
body surface area	Body surface area (calculated)
body weight	Human body weight
lean body mass	Lean body mass
fat free mass	Fat free mass
time after dose	Time after dose
plasma concentration	Concentration of substance in blood plasma
subject identifier	Unique integer identifier for a subject
observation identifier	Unique integer identifier for an observation
pk measurement	Any kind of PK measurement
pd measurement	Any kind of PD measurement

drop#: Should this column be dropped

name#: Column name

scale#

Scale of measurement

The statistical scale of measurement for the column data. Must be one of ‘nominal’, ‘ordinal’, ‘interval’ or ‘ratio’.

symbol#: Symbol having the column name

type#

Type of column

type	Description
id	Individual identifier. Max one per DataFrame. All values have to be unique
idv	Independent variable. Max one per DataFrame.
dv	Observations of the dependent variable
dvid	Dependent variable ID
covariate	Covariate
dose	Dose amount
rate	Rate of infusion
additional	Number of additional doses
ii	Interdose interval
ss	Steady state dosing
event	0 = observation
mdv	0 = DV is observation value, 1 = DV is missing
admid	Administration ID
compartment	Compartment information (not yet exactly specified)
lloq	Lower limit of quantification
blq	Below limit of quantification indicator
unknown	Unknown type. This is the default for columns that haven’t been assigned a type
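
The table states that the `id` and `idv` types may each occur at most once per DataFrame. A rough sketch of how such a constraint could be checked on a plain mapping of column names to types follows. The helper `find_duplicated_singleton_types` and the example column names are hypothetical illustrations, not Pharmpy API.

```python
from collections import Counter

# Types documented above as "max one per DataFrame"
SINGLETON_TYPES = ("id", "idv")


def find_duplicated_singleton_types(column_types):
    """Return the singleton types that are assigned to more than one column.

    column_types maps column name -> type string, e.g. {"ID": "id"}.
    """
    counts = Counter(column_types.values())
    return [t for t in SINGLETON_TYPES if counts[t] > 1]


# Hypothetical dataset layout used only for illustration
columns = {"ID": "id", "TIME": "idv", "DV": "dv", "WGT": "covariate"}
print(find_duplicated_singleton_types(columns))
```

An empty result means that no singleton type is used by more than one column.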

unit#

Unit of the column data

Custom units are allowed; units that are available in sympy.physics.units can be recognized. The default unit is 1, i.e. unitless.

Methods Documentation

static convert_datatype_to_pd_dtype(datatype)[source]#

Convert Pharmpy datatype to pandas dtype

Parameters: datatype (str) – String representing a Pharmpy datatype
Returns: str – String representing a pandas dtype

Examples

>>> from pharmpy.model import ColumnInfo
>>> ColumnInfo.convert_datatype_to_pd_dtype("float64")
'float64'
>>> ColumnInfo.convert_datatype_to_pd_dtype("nmtran-date")
'str'

static convert_pd_dtype_to_datatype(dtype)[source]#

Convert pandas dtype to Pharmpy datatype

Parameters: dtype (str) – String representing a pandas dtype
Returns: str – String representing a Pharmpy datatype

Examples

>>> from pharmpy.model import ColumnInfo
>>> ColumnInfo.convert_pd_dtype_to_datatype("float64")
'float64'

classmethod create(name, type='unknown', unit=None, scale='ratio', continuous=None, categories=None, drop=False, datatype='float64', descriptor=None)[source]#

classmethod from_dict(d)[source]#

get_all_categories()[source]#: Get a list of all categories

is_categorical()[source]#

Check if the column data is categorical

Returns: bool – True if categorical (nominal or ordinal) and False otherwise.
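
The documented rule can be illustrated without Pharmpy: a column counts as categorical when its scale is nominal or ordinal, and such data have to be discrete. The following standard-library sketch mirrors that rule. The function names are hypothetical, and this is not Pharmpy's actual implementation.

```python
# Scales that the documentation describes as categorical
CATEGORICAL_SCALES = {"nominal", "ordinal"}


def is_categorical_scale(scale: str) -> bool:
    """Return True if the scale of measurement is nominal or ordinal."""
    return scale in CATEGORICAL_SCALES


def is_consistent(scale: str, continuous: bool) -> bool:
    """Check the documented constraint that categorical data must be discrete."""
    return not (is_categorical_scale(scale) and continuous)


for scale in ("nominal", "ordinal", "interval", "ratio"):
    print(scale, is_categorical_scale(scale))
```

Under this rule a nominal column marked as continuous would be an inconsistent combination, while a ratio column may be either continuous or discrete.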