3 Python package toolchain

Objective

Learn the essential development tools for Python projects: formatting, linting, type checking, testing, and documentation. Build a professional development workflow for clinical reporting.

3.1 The modern Python toolchain

In R, packages like devtools, usethis, styler, lintr, and testthat provide development infrastructure. Python’s ecosystem distributes these functions across specialized tools.

For clinical reporting projects, we recommend:

uv: Package and environment management.
Ruff: Code formatting and linting.
mypy: Static type checking.
pytest: Unit testing framework.
Quarto and quartodoc: Reporting and API documentation.

For R users, think of this as: uv = renv + pak + devtools, Ruff = styler + lintr, pytest = testthat, quartodoc = pkgdown, mypy = (no direct R equivalent).

Apart from Quarto itself, these tools are installed as development dependencies. Ruff, mypy, and pytest are configured through pyproject.toml.

3.2 Ruff: Formatting and linting

Ruff is an extremely fast linter and formatter written in Rust. It replaces multiple legacy tools (Black, isort, Flake8, pyupgrade) with a single, consistent interface.

3.2.1 Installation

Add Ruff as a development dependency:

uv add --dev ruff

3.2.2 Code formatting

Format your code:

uv run ruff format

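Or using uvx, which runs Ruff in a temporary, isolated environment instead of the version installed in your project:

uvx ruff format

Ruff format:

Enforces consistent style (like Black).
Normalizes quotes and indentation.
Removes trailing whitespace.
Wraps lines that exceed the configured line length where possible.

Import sorting is handled by the linter (the isort rule set "I" together with ruff check --fix), not by the formatter.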

3.2.3 Linting

Check for linting issues:

uv run ruff check

Fix auto-fixable issues:

uv run ruff check --fix

Ruff detects:

Unused imports and variables.
Undefined names.
Style violations.
Common anti-patterns.
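Security issues (only when the flake8-bandit rule set "S" is selected; it is not part of the configuration in section 3.2.4).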
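
To make these categories concrete, here is a minimal sketch of code that triggers a few well-known rules from the Pyflakes ("F"), pycodestyle ("E"), and flake8-bugbear ("B") rule sets. The function names are hypothetical and exist only for illustration; each flagged line is paired with a corrected version.

```python
import os  # F401: imported but unused


def collect_flags_buggy(flag, flags=[]):  # B006: mutable default argument
    # The same list object is shared by every call that omits `flags`,
    # so values leak from one call into the next.
    flags.append(flag)
    return flags


def collect_flags(flag, flags=None):
    # Corrected: create a fresh list whenever the caller does not pass one.
    if flags is None:
        flags = []
    flags.append(flag)
    return flags


def is_missing_buggy(value):
    result = 0  # F841: local variable assigned but never used
    return value == None  # E711: comparison to None should use "is"


def is_missing(value):
    # Corrected: identity comparison, no unused variable.
    return value is None
```

The mutable default is the kind of anti-pattern that matters most in practice: the code runs without error, but the second call to collect_flags_buggy silently returns values from the first.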

3.2.4 Configuration

Add Ruff configuration to pyproject.toml:

[tool.ruff]
line-length = 88
target-version = "py313"

[tool.ruff.format]
quote-style = "double"
indent-style = "space"

[tool.ruff.lint]
select = [
    "E",    # pycodestyle
    "F",    # Pyflakes
    "UP",   # pyupgrade
    "B",    # flake8-bugbear
    "SIM",  # flake8-simplify
    "I",    # isort
]
ignore = []

Note

A line length of 88 characters is the default used by Black and Ruff. PEP 8 itself recommends 79 characters; 88 is a widely adopted compromise that allows slightly longer lines while keeping code readable.

3.3 Type checking with mypy

Python supports optional type annotations through PEP 484. Type annotations improve code clarity and catch errors before runtime.

3.3.1 Why type checking matters

For clinical programming:

Catch data transformation errors at development time.
Document expected DataFrame structures.
Improve IDE autocomplete and refactoring.
Reduce runtime errors in production.

3.3.2 Installation

Add mypy as a development dependency:

uv add --dev mypy

3.3.3 Basic usage

Check types in your code:

uv run mypy .

3.3.4 Type annotation example

Without types:

def calculate_bmi(weight, height):
    return weight / (height ** 2)

With types:

def calculate_bmi(weight: float, height: float) -> float:
    """Calculate BMI from weight (kg) and height (m)."""
    return weight / (height ** 2)

The type checker verifies:

Arguments are the correct type.
Return value matches the declared type.
Operations are valid for the types used.

3.3.5 Configuration

Add mypy settings to pyproject.toml:

[tool.mypy]
python_version = "3.14"
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = false
disallow_incomplete_defs = true
check_untyped_defs = true
no_implicit_optional = true

Start with lenient settings (disallow_untyped_defs = false) and progressively tighten as you add type annotations to your codebase.

3.3.6 Type stubs for libraries

Some libraries don’t include type information. Install type stubs when available:

uv add --dev types-tabulate

Note

Popular data science libraries like polars include built-in type annotations. Older libraries like pandas require separate stub packages (pandas-stubs).

3.4 Testing with pytest

pytest is Python’s de facto standard testing framework. It’s more powerful and ergonomic than the built-in unittest module.

3.4.1 Installation

Add pytest and coverage tools:

uv add --dev pytest pytest-cov

3.4.2 Writing tests

Create a tests/ directory:

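pycsr-example/
├── src/
│   └── pycsr_example/
│       ├── __init__.py
│       └── calculations.py
└── tests/
    └── test_calculations.py

Write a simple test in tests/test_calculations.py, assuming calculate_bmi from section 3.3.4 is saved in src/pycsr_example/calculations.py: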

import pytest

from pycsr_example.calculations import calculate_bmi

def test_calculate_bmi():
    # Normal BMI calculation
    assert calculate_bmi(70, 1.75) == pytest.approx(22.857142857142858)

def test_calculate_bmi_underweight():
    # BMI < 18.5 indicates underweight
    assert calculate_bmi(50, 1.75) < 18.5
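
When several inputs share the same assertion, pytest.mark.parametrize avoids repeating the test body, and pytest.raises checks that invalid input fails in the expected way. The following is a minimal sketch rather than part of the example package; calculate_bmi is repeated so the block is self-contained, and the test names are illustrative.

```python
import pytest


# In the project this would be imported from pycsr_example.calculations.
def calculate_bmi(weight: float, height: float) -> float:
    """Calculate BMI from weight (kg) and height (m)."""
    return weight / (height**2)


@pytest.mark.parametrize(
    ("weight", "height", "expected"),
    [
        (70, 1.75, 22.8571429),
        (50, 1.75, 16.3265306),
        (90, 1.80, 27.7777778),
    ],
)
def test_calculate_bmi_known_values(weight, height, expected):
    # pytest runs this function once per row of the table above.
    assert calculate_bmi(weight, height) == pytest.approx(expected)


def test_calculate_bmi_zero_height():
    # As written, a zero height surfaces as ZeroDivisionError.
    with pytest.raises(ZeroDivisionError):
        calculate_bmi(70, 0)
```

Each parametrized case is reported separately, so a failure points to the exact input that broke.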

3.4.3 Running tests

Run all tests:

uv run pytest

Run with verbose output:

uv run pytest -v

Run a specific test file:

uv run pytest tests/test_calculations.py

3.4.4 Code coverage

Generate a coverage report in the terminal:

uv run pytest --cov=pycsr_example --cov-report=term

Generate an HTML coverage report:

uv run pytest --cov=pycsr_example --cov-report=html

This creates htmlcov/index.html, which shows which lines were executed by the tests and which were missed.

Important

For regulatory submissions, high test coverage is supporting evidence of code quality, although coverage alone does not prove that results are correct. Aim for >80% coverage for critical data transformation and statistical computation functions.
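
A coverage percentage can overstate how thoroughly code is tested. In the minimal sketch below (classify_bmi is a hypothetical function, not part of the example package), a single test executes every line, yet the case where the condition is false is never exercised.

```python
def classify_bmi(bmi: float) -> str:
    """Return a coarse BMI category (illustrative thresholds only)."""
    category = "normal"
    if bmi < 18.5:
        category = "underweight"
    return category


def test_classify_underweight():
    # Executes every line of classify_bmi, so line coverage is 100%,
    # but the path that skips the `if` body is never taken.
    assert classify_bmi(17.0) == "underweight"


def test_classify_normal():
    # Adding this test exercises the remaining path.
    assert classify_bmi(22.0) == "normal"
```

pytest-cov can measure branches as well as lines when run with the --cov-branch option, for example uv run pytest --cov=pycsr_example --cov-branch.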

3.4.5 pytest configuration

Add pytest settings to pyproject.toml:

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_functions = ["test_*"]
addopts = [
    "--strict-markers",
    "--strict-config",
    "-ra",
]

3.5 Documentation generation

For clinical reporting projects, documentation serves two purposes:

Code documentation: Function and module documentation.
Report generation: Analysis reports and TLFs.

3.5.1 Quarto for reports

We use Quarto for creating reproducible analysis documents:

# Install Quarto separately (not via uv)
# See: https://quarto.org/docs/get-started/

Quarto documents (.qmd files) combine:

Markdown text.
Python code cells.
Generated outputs (tables, listings, figures).

This book itself is written in Quarto.

3.5.2 quartodoc for API documentation

For packages that need API documentation (similar to R’s pkgdown), use quartodoc:

uv add --dev quartodoc

quartodoc generates documentation from docstrings and integrates with Quarto for full website generation.
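
Because the generated pages are only as good as the docstrings, it helps to write them in a structured style. The sketch below assumes numpydoc-style sections, which quartodoc is understood to parse by default; check the quartodoc documentation for the parser options in your version.

```python
def calculate_bmi(weight: float, height: float) -> float:
    """Calculate body mass index.

    Parameters
    ----------
    weight
        Body weight in kilograms.
    height
        Body height in metres.

    Returns
    -------
    float
        BMI in kg/m^2.

    Examples
    --------
    >>> round(calculate_bmi(70, 1.75), 1)
    22.9
    """
    return weight / (height**2)
```

The types come from the function signature, so the docstring only needs to describe meaning and units.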

For analysis projects (rather than reusable packages), Quarto alone is usually sufficient. Use quartodoc when building analysis packages that a team collaborates on.

3.6 Development workflow

Putting it all together, a typical development cycle looks like:

Format code: uv run ruff format
Check linting: uv run ruff check --fix
Verify types: uv run mypy .
Run tests: uv run pytest --cov=pycsr_example
Generate reports: quarto render
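
If you prefer a single entry point for the cycle above, the steps can be sketched as a small Python script that runs each command in order and stops at the first failure. This is one possible arrangement, not a feature of uv or of this book's example package; the names CHECKS, run_command, and run_checks are hypothetical.

```python
import subprocess
from collections.abc import Callable, Sequence

# The same commands as the list above, in the same order.
CHECKS: list[list[str]] = [
    ["uv", "run", "ruff", "format"],
    ["uv", "run", "ruff", "check", "--fix"],
    ["uv", "run", "mypy", "."],
    ["uv", "run", "pytest", "--cov=pycsr_example"],
    ["quarto", "render"],
]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def run_checks(
    commands: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> int:
    """Run commands in order and stop at the first non-zero exit code."""
    for command in commands:
        print("$", " ".join(command))
        code = runner(command)
        if code != 0:
            return code
    return 0


# Usage from the project root, for example in a script named checks.py:
#     raise SystemExit(run_checks(CHECKS))
```

Passing the runner as a parameter keeps the function easy to test without actually invoking the tools.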

3.6.1 Pre-commit automation

You can automate these checks using Git hooks (not covered in this book), but running them manually makes each step visible and gives you more control while you are learning the tools.

3.7 Clinical project structure guidelines

If a clinical reporting project needs both R and Python, consider the following guidelines:

Separate R and Python directories:

project/
├── r-package/          # R package for R-based analyses
│   ├── DESCRIPTION
│   ├── R/
│   └── tests/
├── python-package/     # Python package for Python-based analyses
│   ├── pyproject.toml
│   ├── src/
│   └── tests/
├── data/               # Shared input data (SDTM, ADaM)
└── output/             # Shared output (TLFs, reports)

Why separate? There are a few reasons:

Different build systems.
Different dependency management.
Different testing frameworks.
Different IDE configurations.

Shared resources:

Input datasets (SDTM, ADaM) can be in a common data/ directory.
Output deliverables can go to a common output/ directory.
Documentation can reference both implementations.

Note

For this book, we focus exclusively on Python. Mixed R/Python workflows are beyond scope but follow the same principles.

3.8 Exercise

Set up a complete development environment:

Create a new project with uv init dev-practice.
Add development dependencies: ruff, mypy, pytest, pytest-cov.

Create a simple function in src/dev_practice/stats.py:

def mean(values: list[float]) -> float:
    return sum(values) / len(values)

Write a test in tests/test_stats.py.
Run Ruff format and check.
Run mypy type checking.
Run pytest with coverage.

View solution

# Create project
uv init dev-practice
cd dev-practice

# Add dev dependencies
uv add --dev ruff mypy pytest pytest-cov

# Add a build system so uv installs dev_practice into the environment
cat >> pyproject.toml << 'EOF'

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
EOF

# Create stats module
mkdir -p src/dev_practice
touch src/dev_practice/__init__.py
cat > src/dev_practice/stats.py << 'EOF'
def mean(values: list[float]) -> float:
    """Calculate the arithmetic mean of a list of numbers."""
    if not values:
        raise ValueError("Cannot calculate mean of empty list")
    return sum(values) / len(values)
EOF

# Create test file
mkdir -p tests
cat > tests/test_stats.py << 'EOF'
import pytest
from dev_practice.stats import mean

def test_mean_basic():
    assert mean([1.0, 2.0, 3.0]) == 2.0

def test_mean_single_value():
    assert mean([5.0]) == 5.0

def test_mean_empty_raises():
    with pytest.raises(ValueError):
        mean([])
EOF

# Run checks
uv run ruff format .
uv run ruff check .
uv run mypy src/
uv run pytest --cov=dev_practice --cov-report=term

Expected output from pytest:

=================================== test session starts ===================================
platform darwin -- Python 3.14.0, pytest-9.0.1, pluggy-1.6.0
rootdir: /Users/user/dev-practice
configfile: pyproject.toml
plugins: cov-7.0.0
collected 3 items

tests/test_stats.py ...                                                             [100%]

===================================== tests coverage ======================================
____________________ coverage: platform darwin, python 3.14.0-final-0 _____________________

Name                           Stmts   Miss  Cover
--------------------------------------------------
src/dev_practice/__init__.py       0      0   100%
src/dev_practice/stats.py          4      0   100%
--------------------------------------------------
TOTAL                              4      0   100%
==================================== 3 passed in 0.02s ====================================

3.9 Example repositories

Demo project repositories have been created:

Python package example: https://github.com/elong0527/demo-py-esub
eCTD package example: https://github.com/elong0527/demo-py-ectd

With the knowledge from this chapter, you can understand how these projects are organized and develop similar professional Python packages for clinical reporting.

3.10 What’s next

You now have a complete Python development environment with:

uv for project and dependency management.
Ruff for code quality.
mypy for type safety.
pytest for testing.
Quarto for documentation.

The next part introduces how to create real clinical study reports, demonstrating TLF generation with polars and rtflite.