Project suggestions

Here are some suggestions for projects suitable for Google Summer of Code, master's thesis work or other types of internships. If you are a student or otherwise interested, feel free to send any of the maintainers an email. In our team we use Python, tox and pytest for testing, GitHub Actions for continuous integration and Sphinx for documentation.

Stepwise covariate search (SCM)

Implement a known stepwise search algorithm and workflow for covariate search in a model. Covariates are factors, such as body weight and sex, that help explain variability between individuals in drug trial data. This will involve using the Pharmpy API to create a workflow.
- Outcomes: A tool that could be directly used by researchers
- Skills: Python, pandas
- Size: 350h
- Difficulty: Medium
- Mentors: Rikard Nordgren, Stella Belin and/or Zhe Huang
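
To give a feel for the kind of algorithm involved, here is a minimal sketch of a greedy forward selection step, assuming a lower objective value means a better fit. The function names, the toy objective and its numbers are all hypothetical and do not use the Pharmpy API; in a real workflow the objective would come from fitting a model for each candidate.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def forward_search(
    candidates: Iterable[str],
    objective: Callable[[FrozenSet[str]], float],
    min_improvement: float,
) -> Tuple[FrozenSet[str], List[Tuple[str, float]]]:
    """Greedy forward selection; lower objective values are better."""
    selected: FrozenSet[str] = frozenset()
    best = objective(selected)
    remaining = set(candidates)
    history = []
    while remaining:
        # Score every model that adds exactly one more covariate effect.
        scores = {c: objective(selected | {c}) for c in sorted(remaining)}
        winner = min(scores, key=scores.get)
        if best - scores[winner] < min_improvement:
            break  # no remaining candidate improves the fit enough
        selected = selected | {winner}
        best = scores[winner]
        remaining.remove(winner)
        history.append((winner, best))
    return selected, history


# Toy objective standing in for a model fit (made-up numbers).
def toy_objective(effects):
    gains = {"WGT": 12.0, "SEX": 5.0, "AGE": 0.5}
    return 100.0 - sum(gains[e] for e in effects)


selected, history = forward_search(["WGT", "SEX", "AGE"], toy_objective, 3.84)
print(sorted(selected), history)
```

With these toy numbers the search adds WGT, then SEX, and stops because AGE improves the objective by less than the chosen threshold. A full stepwise procedure would typically follow this with a backward elimination pass.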
Estimation tool

Implement the first-order (FO) parameter estimation method for non-linear mixed effects models. This will involve connecting sympy expressions with data in pandas and using scipy for optimization. The main focus of the project will be on the function optimization part.
- Outcomes: A tool that will serve as a proof of concept and simplify internal testing
- Skills: Python, pandas, sympy, scipy, linear algebra, optimization
- Size: 350h
- Difficulty: Hard
- Mentors: Rikard Nordgren, Stella Belin and/or Zhe Huang
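
As a rough illustration of how the three libraries fit together, the sketch below turns a sympy expression into a numpy function, evaluates it against a pandas DataFrame and minimizes a sum of squares with scipy. This is plain least squares on a model without random effects, not the FO method itself; the model, the data and the names `predict` and `sum_of_squares` are made up for the example.

```python
import numpy as np
import pandas as pd
import sympy
from scipy.optimize import minimize

# Hypothetical one-parameter model: exponential decay from a fixed start value.
time, k = sympy.symbols("TIME k")
prediction = 10 * sympy.exp(-k * time)

# Made-up, noise-free data generated with k = 0.5 (illustration only).
df = pd.DataFrame({"TIME": [0.0, 1.0, 2.0, 4.0]})
df["DV"] = 10 * np.exp(-0.5 * df["TIME"])

# Turn the symbolic expression into a function that works on numpy arrays.
predict = sympy.lambdify([time, k], prediction, modules="numpy")


def sum_of_squares(params):
    """Objective: squared distance between observations and predictions."""
    residuals = df["DV"].to_numpy() - predict(df["TIME"].to_numpy(), params[0])
    return float(np.sum(residuals**2))


result = minimize(sum_of_squares, x0=[0.1], method="Nelder-Mead")
k_hat = result.x[0]
print(k_hat)
```

The estimate should land close to the value of 0.5 used to generate the data. An FO implementation would replace `sum_of_squares` with the FO approximation of the likelihood, but the overall flow from symbolic expression to numeric objective to optimizer would likely be similar.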
Simulation tool

Design an interface for a simulation tool and create one or possibly two implementations: one using external software, e.g. NONMEM, and one implemented from scratch. Simulation in this context means generating observations (for example plasma drug concentrations) from a model.
- Outcomes: A simulation tool that will simplify simulation for researchers
- Skills: Python, pandas
- Size: 350h
- Difficulty: Medium
- Mentors: Rikard Nordgren, Stella Belin and/or Zhe Huang
Switch to using symengine instead of sympy for statements

Currently sympy is used for almost all handling of mathematical expressions. Since symengine is mostly compatible with sympy and faster, it would be beneficial to use symengine where applicable.
- Outcomes: Faster reading and handling of models
- Skills: Python, sympy
- Size: 175h
- Difficulty: Easy
- Mentors: Rikard Nordgren, Stella Belin and/or Zhe Huang
Allow using median() and mean() in symbolic expressions

A sympy-meets-pandas task. Allow custom functions in sympy expressions. It should be possible to evaluate these functions numerically, for example “median(WGT) * theta” where “WGT” is a column available in a DataFrame.
- Outcomes: More powerful input expressions for researchers
- Skills: Python, sympy, pandas
- Size: 175h
- Difficulty: Easy
- Mentors: Rikard Nordgren, Stella Belin and/or Zhe Huang
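
One possible approach, sketched below under several assumptions, is to represent `median` and `mean` as undefined sympy functions and replace each call with the matching pandas column statistic before substituting parameter values. The helper `evaluate` and the `ALLOWED` set are hypothetical names for this example, not existing Pharmpy functions.

```python
import pandas as pd
import sympy
from sympy.core.function import AppliedUndef

ALLOWED = {"median", "mean"}  # statistics looked up as pandas Series methods

median = sympy.Function("median")
WGT, theta = sympy.symbols("WGT theta")
expr = median(WGT) * theta


def evaluate(expr, df, params):
    """Replace median(COL)/mean(COL) with column statistics, then substitute params."""
    replacements = {}
    for call in expr.atoms(AppliedUndef):
        name = call.func.__name__
        if name not in ALLOWED:
            raise ValueError(f"unsupported function: {name}")
        column = str(call.args[0])
        replacements[call] = sympy.Float(getattr(df[column], name)())
    return float(expr.subs(replacements).subs(params))


df = pd.DataFrame({"WGT": [60.0, 70.0, 90.0]})
value = evaluate(expr, df, {theta: 2})
print(value)
```

Here the median of WGT is 70, so the expression evaluates to 140 for theta = 2. A real implementation would also need to decide how such functions are parsed from model code and whether the statistic is taken over all rows or per individual.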
Model diff

Create a diff tool to see differences between two models on different levels: on the model code level (text), on the Pharmpy object level and potentially on a high, human-readable level separating different high-level model concepts.
- Outcomes: One or more model diff tools
- Skills: Python
- Size: 175h-350h
- Difficulty: Easy-Hard
- Mentors: Rikard Nordgren, Stella Belin and/or Zhe Huang
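
The text level is the simplest of the three and can be sketched with the standard library's `difflib`. The two model code snippets below are made up for illustration; the object-level and concept-level diffs would need Pharmpy-specific logic that is not shown here.

```python
import difflib

# Two made-up model code snippets; the content is only illustrative.
base = """$PROBLEM base model
$PK
CL = THETA(1) * EXP(ETA(1))
V = THETA(2)
"""
candidate = """$PROBLEM base model
$PK
CL = THETA(1) * (WGT / 70) * EXP(ETA(1))
V = THETA(2)
"""

# Line-based unified diff, the same format used by common diff tools.
diff = list(
    difflib.unified_diff(
        base.splitlines(),
        candidate.splitlines(),
        fromfile="base",
        tofile="candidate",
        lineterm="",
    )
)
print("\n".join(diff))
```

The output marks the changed CL line with `-` and `+` prefixes. A text diff like this cannot tell that the change adds a covariate effect, which is where the higher-level diffs would add value.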
Compare plotting in various plotting libraries

Implement different standard plots to compare different libraries. Areas of comparison could be ease of use, serialization of plots, interactivity and rendering in Jupyter and RStudio. Examples of plotting libraries to explore are altair, holoviews and bokeh.
- Outcomes: Example plots using various tools and a report on their suitability
- Skills: Python, pandas, plotting
- Size: 175h
- Difficulty: Easy
- Mentors: Rikard Nordgren, Stella Belin and/or Zhe Huang
Monitor ongoing workflows

Some workflows take a very long time to run, and researchers would benefit from being able to monitor what is happening, for example the convergence of a parameter estimation. We would like to develop some sort of dashboard that could complement the dask dashboard with tool-specific information that gets updated in real time.
- Outcomes: A dashboard
- Skills: Python, dask, plotting
- Size: 350h
- Difficulty: Medium
- Mentors: Rikard Nordgren, Stella Belin and/or Zhe Huang
Restartable workflows

Long-running workflows might fail partway through for various reasons. Currently, workflows have to be restarted from the beginning after the cause of failure has been fixed. We would like to be able to restart workflows and continue from the partial results.
- Outcomes: A general idea for restartability and one or more workflows becoming restartable
- Skills: Python, dask
- Size: 350h
- Difficulty: Hard
- Mentors: Rikard Nordgren, Stella Belin and/or Zhe Huang