Part I: Python Environment and Toolchain

PharmaDS 2026 Short Course: Python for Clinical Development with AI Applications

Yuting Xu

2026-03-23

Welcome

Outline

Three parts of this short course:

  1. Python environment setup (Yuting)
    Use uv to create and manage reproducible Python projects. Develop and collaborate in GitHub Codespaces or Visual Studio Code.

  2. Python packages for clinical reporting (Yilong)
    Take a guided tour of using Python to create the tables, listings, and figures (TLFs) commonly used in clinical trials, and to manage clinical study report (CSR) projects.

  3. AI Applications (Eric)
    Explore how AI tools can be applied to clinical data analysis and trial design from a statistician’s perspective.

Note

Interested in R? See https://r4csr.org/

Disclaimer

The views and opinions expressed in this presentation are those of the individual presenters and do not represent those of their affiliated organizations or institutions.


Note

Toolchains, processes, and formats differ across organizations. We present only one common approach.

Acknowledgements

Python development environment

Integrated Development Environments (IDEs)

Two recommended options:

GitHub Codespaces

  • Cloud-based, pre-configured, convenient for collaboration
  • Linux system with full admin access, so you can install any tools you need
  • No local setup needed, only requires a GitHub account and a browser
  • Free tier: 120 core-hours/month (about 60 hours on a 2-core machine)
  • Each codespace can be linked to an existing GitHub repo or empty template

VS Code

  • Most popular choice
  • Rich extension ecosystem for all programming languages and tools
  • Leading platform for GitHub Copilot that provides the most feature-rich and integrated experience

GitHub Codespaces

Getting started:

  • Option 1. Navigate to any GitHub repository
    • Press . (period) to open VS Code in the browser, or
    • Click button: Code → Codespaces → Create codespace on main (or any other) branch
  • Option 2. GitHub Codespaces portal → New codespace → Select repo and branch

What you get:

  • Access to a Linux system (Ubuntu) through a VS Code web interface from any device with a browser, or from your local VS Code desktop app
  • Pre-installed tools can be defined in .devcontainer/devcontainer.json, such as uv, specific VS Code extensions, and any command-line tools
  • Persistent storage for your workspace files
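To make the devcontainer.json point concrete, here is a minimal Python sketch that writes such a file. The image tag, the postCreateCommand, and the extension IDs are assumptions chosen for illustration; they are not the configuration used in this course.

```python
import json
from pathlib import Path

# Hypothetical minimal dev container definition. Every value below is an
# illustrative assumption, not the course's actual configuration.
DEVCONTAINER = {
    "name": "toyexample",
    # Assumed base image from the devcontainers image collection.
    "image": "mcr.microsoft.com/devcontainers/python:3",
    # Assumed setup step: install uv once the container has been created.
    "postCreateCommand": "pip install uv",
    "customizations": {
        "vscode": {
            # Extensions to pre-install in the codespace's VS Code.
            "extensions": ["ms-python.python", "charliermarsh.ruff"],
        }
    },
}


def write_devcontainer(project_dir: Path) -> Path:
    """Write .devcontainer/devcontainer.json under project_dir and return its path."""
    target = project_dir / ".devcontainer" / "devcontainer.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(DEVCONTAINER, indent=2) + "\n", encoding="utf-8")
    return target
```

Calling write_devcontainer(Path(".")) from a repository root would create the file; committing it lets new codespaces for that repository start from the same definition.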

Free tier:

  • 120 core-hours/month on the free individual GitHub plan
    • A 2-core machine uses 2 core-hours per hour → 60 hours/month
  • Do not forget to stop and/or delete your codespace when not in use to reduce costs!
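The core-hour arithmetic above can be sketched in a few lines of Python. The function name is hypothetical, and the 4- and 8-core sizes are included only to show how the same quota scales.

```python
def wall_clock_hours(core_hours_quota: float, cores: int) -> float:
    """Hours of runtime that a core-hour quota covers on a machine with `cores` cores."""
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return core_hours_quota / cores


# The 120 core-hour quota spread over different machine sizes.
for cores in (2, 4, 8):
    print(f"{cores}-core machine: {wall_clock_hours(120, cores):g} hours/month")
```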

VS Code

Installation

  • Download and install from VS Code website
  • Install the Language Runtime:
  • Install the corresponding VS Code Extensions:
    • Python extension from Microsoft
    • R extension from REditorSupport

Additional Recommended Extensions

  • GitHub Copilot Chat: AI code assistant
  • Python; R; Quarto; Jupyter: Language support and integration
  • Git Graph, GitLens: Git visualization and management
  • Black Formatter, Ruff: Code formatting and linting
  • mypy, Pylance: Type checking

Alternative IDEs

Positron

  • Posit’s next-gen IDE, built on Code OSS (the core of VS Code)
  • Primarily supports R and Python
  • Native notebook support
  • Built-in data viewer and a four-panel layout similar to RStudio
  • Built-in AI assistant with native GitHub Copilot integration

JupyterLab

  • Popular in data science community
  • Interactive computing environment
  • Supports code, data/figure visualization and documentation in cell-by-cell notebooks
  • Additional kernels can be installed to work with a wide range of other programming languages

Python package and project manager: uv

Python environment management

Python projects rely heavily on rapidly evolving open-source libraries that often have conflicting version requirements.

  • In contrast, R often uses a single global library for all projects

  • Python package and project managers are essential because they prevent version conflicts by isolating each project’s specific dependencies into its own dedicated environment.

Key goal: Ensure reproducibility across different machines/platforms and over time, which is critical for regulatory submissions!
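As a rough illustration of what "isolated environment" means, the sketch below creates a virtual environment with the standard-library venv module and prints its marker file. This is not how uv works internally; it only shows that an environment is a self-contained directory. The helper name is hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def create_isolated_env(parent: Path, name: str = ".venv") -> Path:
    """Create a virtual environment under `parent` and return its directory."""
    env_dir = parent / name
    # with_pip=False skips bootstrapping pip, which keeps creation fast.
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_isolated_env(Path(tmp))
    # pyvenv.cfg marks the directory as a virtual environment and records
    # which base interpreter it was created from.
    print((env_dir / "pyvenv.cfg").read_text(encoding="utf-8"))
```

Each project gets its own such directory, so packages installed for one project are not visible to another.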

What we need

  • Package & Dependency Management

    • Find, install, and track external libraries and their dependencies
    • Resolve any version conflicts between them
  • Environment Management

    • Python Version Management: Specify, install, and switch between different versions of Python
    • Virtual Environments: Create isolated environments for each project to prevent conflicts and ensure reproducibility
  • Project Workflow & Automation

    • Standardized Configuration: Document project metadata, scripts, and tool settings.
    • Regular development tasks: Run your test code or linters
    • Project Initialization: Create new projects with a standard directory layout and configuration files.
  • Python package building & distribution

    • Packaging your code into standard distribution formats
    • Publishing to package repositories like PyPI

Why uv?

uv is a modern Python package and project manager written in Rust.

All-in-One tool that handles Python environments, package dependencies, project maintenance, builds and publishing in a single workflow.

Replaces scattered toolchain:

  • pip + venv + pyenv + pip-tools + setuptools + poetry

Additional benefits:

  • Fast: 10-100x faster than traditional Python package managers like pip
  • Modern: Uses standard format pyproject.toml as the single source of truth, which prevents conflicting information scattered across multiple files
  • Reliable: Automatically creates or updates the universal cross-platform lockfile uv.lock to ensure reproducible and deterministic environments
  • Independent: Has its own official standalone installer and coexists with Homebrew and the broader macOS ecosystem better than traditional tools like pyenv or Poetry

Installing uv

Note

Skip if using GitHub Codespaces: uv is pre-installed there.


macOS/Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows:

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Verify:

uv --version

More installation options are listed in the uv documentation.

Quick start with uv - New project

Create a new project

uv init toyexample

Check the created files:

cd toyexample
ls -al

Create the first commit:

git status
git add --all
git commit -m "Initialize project with uv"

Pin Python version

uv python pin 3.14.0

List installed packages (empty at this point)

uv pip list

Quick start with uv - Package management

Add dependencies

uv add polars plotnine rtflite

Add dev dependencies

uv add --dev ruff mypy pytest pytest-cov

Manually edit pyproject.toml to add entries to the dependencies or dev dependencies lists, e.g. dependencies = ["scikit-learn", "seaborn", "dash"], then try the following commands.

Resolve dependencies:

uv lock --check  # check whether uv.lock is out-of-sync with pyproject.toml
uv lock # update the lock file with the new dependencies

Sync environment:

uv sync # install the new dependencies

Quick start with uv - Using the virtual environment

Revise the main.py file to test the installed packages, e.g.

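import rtflite

def main():
    # Print the installed rtflite version to confirm the import works
    print("rtflite version:", rtflite.__version__)

if __name__ == "__main__":
    main()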

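Try running the two commands below. The first runs main.py inside the project's virtual environment; the second uses whichever python is first on your PATH, so it may fail to import rtflite.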

uv run python main.py
python main.py
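One way to see the difference between the two commands is a small script that reports which interpreter is running. A minimal sketch, saved under the hypothetical file name check_env.py:

```python
import importlib.util
import sys


def environment_report(package: str = "rtflite") -> dict[str, object]:
    """Describe the running interpreter and whether `package` is importable from it."""
    return {
        "executable": sys.executable,
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "package_available": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it once with uv run python check_env.py and once with plain python check_env.py should report different executables whenever the python on your PATH is not the one in the project's .venv.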

Activate the virtual environment created by uv (run from the project directory)

source .venv/bin/activate

Deactivate the virtual environment when you are done

deactivate

Quick start with uv - Additional commands

Update uv itself

uv self update

Upgrade packages and sync environment

uv lock --upgrade-package plotnine # updates only uv.lock; pyproject.toml is unchanged
uv sync # update the environment with the new plotnine version

Uninstall packages

uv remove scikit-learn seaborn dash

For more details, refer to the uv official guide

Alternatives

Conda

  • Manages Python versions and packages by creating isolated environments
  • Popular in data science, since it handles both Python and non-Python dependencies (e.g., C++ libraries, CUDA, or R)
  • Not as fast or modern as uv
  • Recommend Miniconda (the official minimal installer from Anaconda, Inc.) or Miniforge (community-driven alternative installer, 100% free and faster)

Poetry

  • Python package manager that handles dependency, packaging and publishing
  • Uses pyproject.toml
  • Not as comprehensive as uv for environment management
  • Long-standing and widely used in the Python community: Mature, stable and well-documented

Python toolchain essentials

Typical Development Workflow

  • Code formatting and linting: Improves readability and consistency, and catches errors early
  • Type checking: Verifies data/variable types and catches type mismatches before runtime
  • Run tests and check coverage: Validates code behavior, ensures correctness, and identifies untested code
  • Generate documentation or reproducible user guides: Facilitate collaboration and application
  • Use Git for version control and collaboration: Ensure all the changes are tracked and well-documented, and enable seamless collaboration across teams
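As a small illustration of the first three steps, the sketch below shows a type-annotated helper together with a pytest-style test. The function is hypothetical and not part of any package mentioned in this course.

```python
def format_n_pct(n: int, total: int) -> str:
    """Format a count and its percentage of `total`, e.g. for a summary table cell."""
    if total <= 0:
        raise ValueError("total must be positive")
    return f"{n} ({100 * n / total:.1f}%)"


def test_format_n_pct() -> None:
    # pytest collects functions whose names start with `test_`.
    assert format_n_pct(5, 20) == "5 (25.0%)"
    assert format_n_pct(0, 20) == "0 (0.0%)"
```

With the dev dependencies from the quick start installed, commands such as uv run ruff check ., uv run mypy ., and uv run pytest --cov would lint, type-check, and test code like this; the exact options depend on how the project is configured.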

Summary

Key concepts

Virtual environments are essential in Python

  • Isolate project dependencies
  • Prevent conflicts
  • Enable reproducibility

Dependency locking

  • uv.lock pins exact versions
  • Ensures reproducible environments
  • Similar to R’s renv.lock

.python-version file

  • Specifies exact Python version (for example, 3.14.0)
  • Critical for regulatory submissions
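A script can confirm at startup that it is running on the pinned interpreter. The sketch below assumes the pin is a plain dotted version such as 3.14 or 3.14.0; other forms that a .python-version file may contain are not handled. The function name is hypothetical.

```python
import sys


def matches_pin(pin: str, version: tuple[int, ...]) -> bool:
    """Return True if `version` starts with the numeric components in `pin`.

    A pin of "3.14" matches any 3.14.x; a pin of "3.14.0" matches only 3.14.0.
    """
    wanted = tuple(int(part) for part in pin.strip().split("."))
    return tuple(version[: len(wanted)]) == wanted


# Compare the running interpreter against the example pin from this section.
print(matches_pin("3.14.0", tuple(sys.version_info[:3])))
```

In a project, the pin string would come from reading the .python-version file instead of a literal.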

Action Items

Pre-configured GitHub Codespaces:

What’s pre-installed:

  • Python 3.14 + uv
  • polars, plotnine, rtflite, and other clinical reporting related packages
  • Quarto for document rendering
  • Ruff for linting and formatting

Workflow inside Codespaces:

# Dependencies are already synced on creation
uv run python scripts/tlf_t_ae.py   # run a script
uv run quarto render report.qmd     # render a report
uv run pytest tests/                # run tests

eBook: Python for Clinical Study Reports and Submission: pycsr.org

Resources

Regulatory:

Technical:

Break (5 min)

Optional hands-on exercise: Repeat the quick start steps to create a new project and install dependencies with uv in an empty GitHub Codespaces or on your own laptop.

Appendix

Miniconda and Miniforge

  • Share exactly the same syntax as conda, but with different channels and licensing
    • Miniconda looks for packages in the defaults channel (Anaconda’s repo).
    • Miniforge looks for packages in the conda-forge channel (Community repo).
  • Using Miniconda with Anaconda's defaults channel requires a paid commercial license for organizations with 200 or more employees (still free for individuals, educational institutions, and smaller businesses)
  • Basic conda commands (For more information, see Conda Cheat Sheet)
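conda create -n myenv python=3.14              # create a new environment
conda activate myenv                           # activate it
conda install numpy pandas matplotlib          # install from the configured default channels
conda install -c conda-forge polars plotnine   # install from the conda-forge channel
pip install rtflite                            # install from PyPI into the active environment
conda list                                     # list installed packages
conda update numpy                             # update a package
conda remove matplotlib                        # uninstall a package
conda env export > environment.yml             # export the environment definition
conda deactivate                               # leave the environment
conda info --envs                              # list environments (same as: conda env list)
conda env remove --name myenv                  # delete the environment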

Note

In Miniforge, you can swap conda for mamba for faster installs, e.g. mamba install pandas

Python package and project manager: Traditional & Specialized Tools

  • pip: The standard, built-in package installer for Python, essential for downloading packages from PyPI.
  • Pipenv: Combines pip, Pipfile, and virtualenv into a single toolchain, focusing on secure, locked dependency management.
  • Poetry: Focuses on deterministic dependency management and packaging, using pyproject.toml for projects and publishing.
  • Conda: A popular cross-platform manager (often used in data science) that handles Python packages and non-Python dependencies.
  • pipx: Specialized in installing and running Python application CLI tools, keeping them isolated from project dependencies.

Key Concepts in Python Packaging

  • pyproject.toml: The modern, standard file for configuring build systems and defining metadata/dependencies.
  • Virtual Environments (venv): Isolated environments to prevent dependency conflicts between projects.
  • PyPI (Python Package Index): The repository for all open-source Python packages.
  • Modules & Packages: Structured code (files and directories) that enable reusability.

For modern projects, uv is strongly recommended for its efficiency, while pip remains the standard for basic needs.