Transitioning from clinical SAS programming to R can feel daunting, especially when working with Tables, Listings, and Graphs (TLGs), the backbone of clinical study reporting. While SAS offers PROC REPORT and ODS for TLGs, R provides flexible alternatives such as dplyr, tables, gt, and ggplot2.
This chapter will guide you through R’s ecosystem for generating tables, listings, and visualizations, mirroring common SAS workflows. By the end, you’ll be able to:
- Create summary tables (like PROC TABULATE or PROC REPORT).
- Generate listings (similar to PROC PRINT).
- Build publication-ready graphs (akin to SGPLOT or GTL).
Let’s dive in!
Clinical trials often require descriptive tables (demographics, adverse events) or summary tables (efficacy endpoints). Below are R alternatives to SAS procedures.
dplyr and tables
In SAS, you might use PROC MEANS or PROC FREQ for summaries. In R, dplyr (for data wrangling) and tables (for structured output) are your go-to tools.
library(dplyr)

# Simulated demographics data; the seed makes the results reproducible
set.seed(123)
demo_data <- data.frame(
  SubjectID = 1:50,
  Age = rnorm(50, mean = 45, sd = 10),
  Sex = sample(c("Male", "Female"), 50, replace = TRUE),
  Treatment = sample(c("Placebo", "Drug"), 50, replace = TRUE)
)

# Summary by treatment and sex (like PROC MEANS with a CLASS statement)
demo_summary <- demo_data %>%
  group_by(Treatment, Sex) %>%
  summarise(
    N = n(),
    Mean_Age = mean(Age, na.rm = TRUE),
    SD_Age = sd(Age, na.rm = TRUE),
    .groups = "drop_last"  # keep Treatment as the grouping variable
  )
print(demo_summary)
gt
For polished tables (like SAS ODS), use gt:
library(gt)

demo_summary %>%
  gt() %>%
  tab_header(title = "Demographics Summary") %>%
  fmt_number(columns = c(Mean_Age, SD_Age), decimals = 2) %>%
  cols_label(
    Mean_Age = "Mean Age",
    SD_Age = "SD Age"
  )
Key Functions:
- group_by(): Groups data (like BY in SAS).
- summarise(): Aggregates data (like PROC MEANS).
- gt(): Formats tables professionally.
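The text mentions PROC FREQ, but the summary above only covers the PROC MEANS side. A minimal sketch of a frequency table with row percentages might look like the following. The data frame `freq_data` and the column names `N` and `Pct` are made up for illustration, and the data are simulated again so the block runs on its own.

```r
library(dplyr)

# Hypothetical demo data, regenerated here so the block is self-contained
set.seed(42)
freq_data <- data.frame(
  Treatment = sample(c("Placebo", "Drug"), 50, replace = TRUE),
  Sex = sample(c("Male", "Female"), 50, replace = TRUE)
)

# Counts per Treatment/Sex cell, then the percentage within each treatment
# (roughly the row percentages from TABLES Treatment*Sex in PROC FREQ)
sex_freq <- freq_data %>%
  count(Treatment, Sex, name = "N") %>%
  group_by(Treatment) %>%
  mutate(Pct = round(100 * N / sum(N), 1)) %>%
  ungroup()

print(sex_freq)
```

The result is an ordinary data frame, so it can be passed to gt() in the same way as the age summary.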
Listings display raw data (e.g., adverse events, lab results). In R, DT (interactive tables) or knitr::kable (static tables) work well.
library(DT)

# Simulated AE data
ae_data <- data.frame(
  SubjectID = sample(1:50, 30, replace = TRUE),
  Term = sample(c("Headache", "Fatigue", "Nausea"), 30, replace = TRUE),
  Severity = sample(c("Mild", "Moderate", "Severe"), 30, replace = TRUE)
)

# Interactive table with per-column filters, five rows per page
datatable(
  ae_data,
  filter = "top",
  options = list(pageLength = 5)
)
Tip: Use kable for static reports:
knitr::kable(ae_data, caption = "Adverse Events Listing")
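Listings are usually sorted before printing, much as you would run PROC SORT ahead of PROC PRINT. The sketch below shows one way to do that with dplyr before handing the data to kable. The name `ae_listing` and the chosen sort order are illustrative assumptions, and the data are simulated again so the block runs on its own.

```r
library(dplyr)

# Hypothetical AE data, regenerated here so the block is self-contained
set.seed(42)
ae_listing <- data.frame(
  SubjectID = sample(1:50, 30, replace = TRUE),
  Term = sample(c("Headache", "Fatigue", "Nausea"), 30, replace = TRUE),
  Severity = sample(c("Mild", "Moderate", "Severe"), 30, replace = TRUE)
) %>%
  # Use a factor so severity sorts by clinical order, not alphabetically
  mutate(Severity = factor(Severity, levels = c("Mild", "Moderate", "Severe"))) %>%
  # Within each subject, list the most severe events first
  arrange(SubjectID, desc(Severity), Term)

knitr::kable(ae_listing, caption = "Adverse Events Listing, Sorted by Subject and Severity")
```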
R’s ggplot2 is the most widely used package for statistical graphics in R. Below are clinical trial examples.
library(ggplot2)

# Count events per term and severity, then draw side-by-side bars
ae_data %>%
  count(Term, Severity) %>%
  ggplot(aes(x = Term, y = n, fill = Severity)) +
  geom_col(position = "dodge") +
  labs(title = "Adverse Events by Severity", x = "Term", y = "Count") +
  theme_minimal()
Key Layers:
- aes(): Maps variables to aesthetics (like SAS SGPLOT’s x= and y=).
- geom_col(): Draws bars from precomputed counts (shorthand for geom_bar(stat = "identity")).
- labs(): Adds labels.
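library(survival)
library(survminer)

# Simulated survival data; exponential times are always positive,
# whereas rnorm() could occasionally produce a negative time
set.seed(123)
surv_data <- data.frame(
  Time = rexp(50, rate = 1 / 365),
  Event = sample(0:1, 50, replace = TRUE),
  Treatment = sample(c("Placebo", "Drug"), 50, replace = TRUE)
)

# Kaplan-Meier fit by treatment arm, plotted with a number-at-risk table
fit <- survfit(Surv(Time, Event) ~ Treatment, data = surv_data)
ggsurvplot(fit, data = surv_data, risk.table = TRUE)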
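Comparison to SAS:
- survfit(): Computes Kaplan-Meier estimates, as PROC LIFETEST does.
- ggsurvplot(): Draws the survival curves, like the survival plots from PROC LIFETEST’s ODS Graphics output or a custom PROC SGPLOT step plot.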
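PROC LIFETEST also reports a log-rank test when you use a STRATA statement. In R that is a separate call, survdiff() from the survival package. The sketch below uses simulated data; the names `lr_data`, `lr_test`, and `p_value` are illustrative only.

```r
library(survival)

# Hypothetical survival data, regenerated here so the block is self-contained
set.seed(42)
lr_data <- data.frame(
  Time = rexp(50, rate = 1 / 365),
  Event = sample(0:1, 50, replace = TRUE),
  Treatment = sample(c("Placebo", "Drug"), 50, replace = TRUE)
)

# Log-rank test comparing the two treatment arms
lr_test <- survdiff(Surv(Time, Event) ~ Treatment, data = lr_data)
print(lr_test)

# p-value from the chi-square statistic (two groups, so 1 degree of freedom)
p_value <- 1 - pchisq(lr_test$chisq, df = 1)
p_value
```

Because the treatment labels here are assigned at random, there is no true difference between arms and the p-value carries no meaning beyond showing the mechanics.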
rtables
For regulatory submissions, rtables (from Roche) provides a framework akin to SAS TLGs:
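library(rtables)

# analyze() is the rtables layout function; analyze_vars() comes from
# the separate tern package and is not available with rtables alone
lyt <- basic_table() %>%
  split_cols_by("Treatment") %>%
  analyze("Age", afun = mean, format = "xx.xx")
tbl <- build_table(lyt, demo_data)
tbl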
Why Use rtables?
- Reproducible, structured outputs.
- Designed for clinical trial reporting and commonly used with CDISC ADaM-style datasets.
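Demographics tables usually show more than one statistic per column. As a rough sketch, a custom analysis function can return several rows with in_rows() and rcell(). The function name `age_stats` and the data frame `rt_data` are made up for illustration, and dplyr is loaded here only to supply the %>% pipe.

```r
library(rtables)
library(dplyr)  # loaded only for the %>% pipe

# Hypothetical demo data, regenerated here so the block is self-contained
set.seed(42)
rt_data <- data.frame(
  Age = rnorm(50, mean = 45, sd = 10),
  Treatment = sample(c("Placebo", "Drug"), 50, replace = TRUE)
)

# Illustrative analysis function: one row for n, one for mean (SD)
age_stats <- function(x) {
  in_rows(
    "n" = rcell(length(x), format = "xx"),
    "Mean (SD)" = rcell(c(mean(x), sd(x)), format = "xx.xx (xx.xx)")
  )
}

lyt <- basic_table() %>%
  split_cols_by("Treatment") %>%
  analyze("Age", afun = age_stats)

tbl <- build_table(lyt, rt_data)
tbl
```

Keeping the statistics in a function separates what is computed from how the table is laid out, so the same function can be reused for other numeric variables.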
Tables and Listings:
- dplyr for summaries, gt for formatting.
- Replace PROC PRINT with DT or kable.
Graphs:
- ggplot2 replaces SGPLOT/GTL.
- survminer handles survival plots.
Regulatory-Grade Outputs:
- rtables for structured TLGs.
By mastering these tools, you’ll confidently transition from SAS to R for clinical reporting!
Next Steps:
- Practice with real-world datasets (e.g., ADaM-like data in R).
- Explore Tplyr (an R package that provides a grammar for building clinical summary tables, covering much of what you would do with PROC TABULATE).
Happy coding! 🚀