Create adverse event summary tables that provide a high-level safety overview across treatment groups. Learn to calculate AE rates and percentages with Polars and to build comprehensive safety summary tables with rtflite.
8.1 Overview
Adverse event (AE) summary tables are critical safety assessments required in clinical study reports. Following ICH E3 guidance, these tables summarize the overall safety profile by showing the number and percentage of participants who experience each category of adverse event in each treatment group.
Key categories typically include:
Any adverse event: Total participants with at least one AE
Drug-related events: Events potentially related to study treatment
Serious adverse events: Events meeting regulatory criteria for seriousness
Deaths: Fatal outcomes
Discontinuations: Participants who stopped treatment due to AEs
This tutorial shows you how to create an AE summary table using Python’s rtflite package.
8.2 Step 1: Load data
We need two datasets for AE analysis: the subject-level dataset (ADSL) and the adverse events dataset (ADAE).
One ADAE variable used later in this chapter is AEACN, the action taken with study treatment (e.g., “DOSE NOT CHANGED”, “DRUG WITHDRAWN”, “DOSE REDUCED”).
8.3 Step 2: Filter safety population
For safety analyses, we focus on participants who received at least one dose of study treatment.
# Filter to safety population
adsl_safety = adsl.filter(pl.col("SAFFL") == "Y").select(["USUBJID", "TRT01A"])

# Get treatment counts for denominators
pop_counts = adsl_safety.group_by("TRT01A").agg(
    N = pl.len()
).sort("TRT01A")

# Preserve the treatment level order for downstream joins
treatment_levels = pop_counts.select(["TRT01A"])

# Safety population by treatment
pop_counts
shape: (3, 2)

| TRT01A (str) | N (u32) |
| --- | --- |
| "Placebo" | 86 |
| "Xanomeline High Dose" | 84 |
| "Xanomeline Low Dose" | 84 |
# Join treatment information to AE data
adae_safety = adae.join(adsl_safety, on="USUBJID")

# Total AE records in safety population
adae_safety.height
1191
8.4 Step 3: Define AE categories
We’ll calculate participant counts for standard AE categories used in regulatory submissions.
def count_participants(df, condition=None):
    """
    Count unique participants meeting a condition

    Args:
        df: DataFrame with adverse events
        condition: polars expression for filtering (None = count all)

    Returns:
        DataFrame with counts by treatment
    """
    if condition is not None:
        df = df.filter(condition)

    counts = df.group_by("TRT01A").agg(
        n = pl.col("USUBJID").n_unique()
    )

    return treatment_levels.join(counts, on="TRT01A", how="left").with_columns(
        pl.col("n").fill_null(0)
    )

# Calculate each category
categories = []

# 1. Participants in population (no filtering)
pop_row = pop_counts.with_columns(
    category = pl.lit("Participants in population")
).rename({"N": "n"})
categories.append(pop_row)

# 2. With any adverse event
any_ae = count_participants(adae_safety).with_columns(
    category = pl.lit("With any adverse event")
)
categories.append(any_ae)
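# 3. With drug-related adverse event
# (same AEREL condition that category 5 combines with AESER)
drug_related = count_participants(
    adae_safety,
    pl.col("AEREL").is_in(["POSSIBLE", "PROBABLE", "DEFINITE", "RELATED"])
).with_columns(
    category = pl.lit("With drug-related adverse event")
)
categories.append(drug_related)

# 4. With serious adverse event
serious = count_participants(
    adae_safety,
    pl.col("AESER") == "Y"
).with_columns(
    category = pl.lit("With serious adverse event")
)
categories.append(serious)

# 5. With serious drug-related adverse event
serious_drug_related = count_participants(
    adae_safety,
    (pl.col("AESER") == "Y") &
    pl.col("AEREL").is_in(["POSSIBLE", "PROBABLE", "DEFINITE", "RELATED"])
).with_columns(
    category = pl.lit("With serious drug-related adverse event")
)
categories.append(serious_drug_related)

# 6. Who died
deaths = count_participants(
    adae_safety,
    pl.col("AEOUT") == "FATAL"
).with_columns(
    category = pl.lit("Who died")
)
categories.append(deaths)

# 7. Discontinued due to adverse event
discontinued = count_participants(
    adae_safety,
    pl.col("AEACN") == "DRUG WITHDRAWN"
).with_columns(
    category = pl.lit("Discontinued due to adverse event")
)
categories.append(discontinued)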
8.5 Step 4: Combine and calculate percentages
Now we combine all categories and calculate percentages based on the safety population.
# Combine all categories
ae_summary = pl.concat(categories, how="diagonal")

# Add population totals and calculate percentages
ae_summary = ae_summary.join(
    pop_counts.select(["TRT01A", "N"]),
    on="TRT01A",
    how="left"
).with_columns([
    # Fill missing counts with 0
    pl.col("n").fill_null(0),
    # Calculate percentage
    pl.when(pl.col("category") == "Participants in population")
      .then(None)  # No percentage for population row
      .otherwise((100.0 * pl.col("n") / pl.col("N")).round(1))
      .alias("pct")
])

ae_summary.sort(["category", "TRT01A"])
shape: (21, 5)

| TRT01A (str) | n (u32) | category (str) | N (u32) | pct (f64) |
| --- | --- | --- | --- | --- |
| "Placebo" | 0 | "Discontinued due to adverse ev… | 86 | 0.0 |
| "Xanomeline High Dose" | 0 | "Discontinued due to adverse ev… | 84 | 0.0 |
| "Xanomeline Low Dose" | 0 | "Discontinued due to adverse ev… | 84 | 0.0 |
| … | … | … | … | … |
| "Xanomeline High Dose" | 1 | "With serious drug-related adve… | 84 | 1.2 |
| "Xanomeline Low Dose" | 1 | "With serious drug-related adve… | 84 | 1.2 |
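As a quick check of the percentage rule, note that the denominator is the arm's safety population N, not the number of AE records. The sketch below uses a hypothetical helper, `ae_percentage`, with the Placebo figures that appear in the final table of this chapter (69 of 86 participants with any adverse event).

```python
def ae_percentage(n, N):
    """Percentage of participants in an arm with the event, to one decimal place."""
    if N <= 0:
        raise ValueError("N must be positive")
    return round(100.0 * n / N, 1)


print(ae_percentage(69, 86))  # 80.2
print(ae_percentage(0, 84))   # 0.0
```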
8.6 Step 5: Format for display
We’ll format the counts and percentages for the final table display.
# Format display values
ae_formatted = ae_summary.with_columns([
    # Show counts as strings, including zeros
    pl.col("n").cast(str).alias("n_display"),
    # Format percentages with parentheses; blank out population row
    pl.when(pl.col("category") == "Participants in population")
      .then(pl.lit(""))
      .otherwise(
          pl.format("({})", pl.col("pct").fill_null(0).round(1).cast(str))
      )
      .alias("pct_display")
])

ae_formatted.select(["category", "TRT01A", "n_display", "pct_display"])
shape: (21, 4)

| category (str) | TRT01A (str) | n_display (str) | pct_display (str) |
| --- | --- | --- | --- |
| "Participants in population" | "Placebo" | "86" | "" |
| "Participants in population" | "Xanomeline High Dose" | "84" | "" |
| "Participants in population" | "Xanomeline Low Dose" | "84" | "" |
| … | … | … | … |
| "Discontinued due to adverse ev… | "Xanomeline High Dose" | "0" | "(0.0)" |
| "Discontinued due to adverse ev… | "Xanomeline Low Dose" | "0" | "(0.0)" |
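The display rule can also be written in plain Python, which may be handy for unit-testing the expected cell text. This sketch uses a hypothetical helper, `format_cell`, and fixed one-decimal formatting (`.1f`) rather than the Polars string cast used in this step.

```python
def format_cell(n, pct=None):
    """Return (count text, percent text); pct=None marks the population row."""
    n_display = str(n)
    pct_display = "" if pct is None else f"({pct:.1f})"
    return n_display, pct_display


print(format_cell(86))        # ('86', '')
print(format_cell(69, 80.2))  # ('69', '(80.2)')
print(format_cell(0, 0.0))    # ('0', '(0.0)')
```

The `.1f` format always emits exactly one decimal place, so a value such as 94.0 is rendered as `(94.0)`.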
8.7 Step 6: Create final table structure
We reshape the data to create the final table with treatments as columns.
# Define category order for consistent display
category_order = [
    "Participants in population",
    "With any adverse event",
    "With drug-related adverse event",
    "With serious adverse event",
    "With serious drug-related adverse event",
    "Who died",
    "Discontinued due to adverse event"
]

# Pivot to wide format
ae_wide = ae_formatted.pivot(
    values=["n_display", "pct_display"],
    index="category",
    on="TRT01A",
    maintain_order=True
)

# Reorder columns for each treatment group
treatments = ["Placebo", "Xanomeline Low Dose", "Xanomeline High Dose"]
column_order = ["category"]
for trt in treatments:
    column_order.extend([f"n_display_{trt}", f"pct_display_{trt}"])

# Create final table with proper column order
final_table = ae_wide.select(column_order).sort(
    pl.col("category").cast(pl.Enum(category_order))
)

final_table
shape: (7, 7)

| category (str) | n_display_Placebo (str) | pct_display_Placebo (str) | n_display_Xanomeline Low Dose (str) | pct_display_Xanomeline Low Dose (str) | n_display_Xanomeline High Dose (str) | pct_display_Xanomeline High Dose (str) |
| --- | --- | --- | --- | --- | --- | --- |
| "Participants in population" | "86" | "" | "84" | "" | "84" | "" |
| "With any adverse event" | "69" | "(80.2)" | "77" | "(91.7)" | "79" | "(94.0)" |
| "With drug-related adverse even… | "44" | "(51.2)" | "73" | "(86.9)" | "70" | "(83.3)" |
| … | … | … | … | … | … | … |
| "Who died" | "2" | "(2.3)" | "1" | "(1.2)" | "0" | "(0.0)" |
| "Discontinued due to adverse ev… | "0" | "(0.0)" | "0" | "(0.0)" | "0" | "(0.0)" |
8.8 Step 7: Generate publication-ready output
Finally, we format the AE summary table for regulatory submission using the rtflite package.
# Get population sizes for column headers
n_placebo = pop_counts.filter(pl.col("TRT01A") == "Placebo")["N"][0]
n_low = pop_counts.filter(pl.col("TRT01A") == "Xanomeline Low Dose")["N"][0]
n_high = pop_counts.filter(pl.col("TRT01A") == "Xanomeline High Dose")["N"][0]

doc_ae_summary = rtf.RTFDocument(
    df=final_table.rename({"category": ""}),
    rtf_title=rtf.RTFTitle(
        text=[
            "Analysis of Adverse Event Summary",
            "(Safety Analysis Population)"
        ]
    ),
    rtf_column_header=[
        rtf.RTFColumnHeader(
            text=[
                "",
                "Placebo",
                "Xanomeline Low Dose",
                "Xanomeline High Dose"
            ],
            col_rel_width=[4, 2, 2, 2],
            text_justification=["l", "c", "c", "c"],
        ),
        rtf.RTFColumnHeader(
            text=[
                "",          # Empty for first column
                "n", "(%)",  # Placebo columns
                "n", "(%)",  # Low Dose columns
                "n", "(%)"   # High Dose columns
            ],
            col_rel_width=[4] + [1] * 6,
            text_justification=["l"] + ["c"] * 6,
            border_left=["single"] + ["single", ""] * 3,
            border_top=[""] + ["single"] * 6
        )
    ],
    rtf_body=rtf.RTFBody(
        col_rel_width=[4] + [1] * 6,
        text_justification=["l"] + ["c"] * 6,
        border_left=["single"] + ["single", ""] * 3
    ),
    rtf_footnote=rtf.RTFFootnote(
        text=[
            "Every subject is counted a single time for each applicable row and column."
        ]
    ),
    rtf_source=rtf.RTFSource(
        text=["Source: ADSL and ADAE datasets"]
    )
)

doc_ae_summary.write_rtf("rtf/tlf_ae_summary.rtf")
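The per-column lists in this step are built with list arithmetic, so an off-by-one is easy to introduce. The sketch below is a plain-Python sanity check that each list has one entry per table column and that the two header rows span the same total relative width. It mirrors the values used in this step and is a convenience check, not a documented rtflite requirement; the names `specs` and `n_table_columns` are illustrative.

```python
# category column + (n, %) for each of the three treatment groups
n_table_columns = 1 + 2 * 3

specs = {
    "col_rel_width": [4] + [1] * 6,
    "text_justification": ["l"] + ["c"] * 6,
    "border_left": ["single"] + ["single", ""] * 3,
    "border_top": [""] + ["single"] * 6,
    "second_header_text": ["", "n", "(%)", "n", "(%)", "n", "(%)"],
}

# Every per-column list should have exactly one entry per table column
for name, values in specs.items():
    assert len(values) == n_table_columns, f"{name} has {len(values)} entries"

# The first header row has four cells; each treatment label spans two body columns
first_header_width = [4, 2, 2, 2]
assert sum(first_header_width) == sum(specs["col_rel_width"])

print(specs["border_left"])
# ['single', 'single', '', 'single', '', 'single', '']
```

The `border_left` pattern places a left border before each `n` column and none before each `(%)` column, so each treatment's pair of columns reads as one group.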