Tests whether the hazard functions of different groups are equal.
Definition
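Let:
- $t_1 < t_2 < \dots < t_D$: ordered distinct event times (across all groups pooled)
- $D$: number of distinct event times
- $K$: number of groups (populations) being compared
- $d_{ij}$: number of events in group $j$ at time $t_i$
- $Y_{ij}$: number at risk in group $j$ just before $t_i$
- $d_i = \sum_{j=1}^{K} d_{ij}$: total events at time $t_i$ (pooled)
- $Y_i = \sum_{j=1}^{K} Y_{ij}$: total at risk at time $t_i$ (pooled)
- $W(t_i)$: weight function applied at time $t_i$ (see Weight Functions)
Hypotheses
$$H_0: h_1(t) = h_2(t) = \dots = h_K(t) \quad \text{for all } t \le \tau$$
$$H_A: \text{at least one } h_j(t) \text{ differs for some } t \le \tau$$
where $\tau$ is the end of study time (maximum follow-up).
Under $H_0$, the expected hazard in each group at time $t_i$ is estimated by the pooled rate $d_i / Y_i$, so the expected number of events in group $j$ is $Y_{ij}\, d_i / Y_i$.
Test Statistic
$$Z_j(\tau) = \sum_{i=1}^{D} W(t_i)\left[d_{ij} - Y_{ij}\,\frac{d_i}{Y_i}\right], \quad j = 1, \dots, K$$
Variance and Covariance
$$\hat{\sigma}_{jj} = \sum_{i=1}^{D} W(t_i)^2\,\frac{Y_{ij}}{Y_i}\left(1 - \frac{Y_{ij}}{Y_i}\right)\left(\frac{Y_i - d_i}{Y_i - 1}\right) d_i$$
$$\hat{\sigma}_{jg} = -\sum_{i=1}^{D} W(t_i)^2\,\frac{Y_{ij}}{Y_i}\,\frac{Y_{ig}}{Y_i}\left(\frac{Y_i - d_i}{Y_i - 1}\right) d_i, \quad g \ne j$$
Overall Test
The $Z_j(\tau)$ sum to zero across groups, so only $K - 1$ of them are needed. Let $\hat{\Sigma}$ be the estimated $(K-1) \times (K-1)$ covariance matrix with entries $\hat{\sigma}_{jj}$ on the diagonal and $\hat{\sigma}_{jg}$ off-diagonal. Then:
$$\chi^2 = \big(Z_1(\tau), \dots, Z_{K-1}(\tau)\big)\,\hat{\Sigma}^{-1}\,\big(Z_1(\tau), \dots, Z_{K-1}(\tau)\big)^{\top} \sim \chi^2_{K-1} \quad \text{under } H_0$$
For $K = 2$, the test simplifies to:
$$Z = \frac{\sum_{i=1}^{D} W(t_i)\left[d_{i1} - Y_{i1}\,\dfrac{d_i}{Y_i}\right]}{\sqrt{\sum_{i=1}^{D} W(t_i)^2\,\dfrac{Y_{i1}}{Y_i}\left(1 - \dfrac{Y_{i1}}{Y_i}\right)\left(\dfrac{Y_i - d_i}{Y_i - 1}\right) d_i}} \sim N(0, 1) \quad \text{under } H_0$$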
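The two-group case can be traced by hand on a very small dataset. The sketch below uses made-up toy data (six subjects, not from any study) and log-rank weights; the variable names are illustrative only. It tabulates the pooled and group-1 risk sets and event counts at each distinct event time, then forms the observed-minus-expected sum and its hypergeometric variance.

```r
# Toy data (made up for illustration): two groups of three subjects
time   <- c(2, 4, 6, 1, 3, 5)
status <- c(1, 0, 1, 1, 1, 1)   # 1 = event, 0 = censored
group  <- c(1, 1, 1, 2, 2, 2)

# Distinct event times, pooled across groups
t_i <- sort(unique(time[status == 1]))

# Risk sets and event counts at each event time
Y  <- sapply(t_i, function(t) sum(time >= t))                # pooled at risk
Y1 <- sapply(t_i, function(t) sum(time >= t & group == 1))   # group 1 at risk
d  <- sapply(t_i, function(t) sum(time == t & status == 1))  # pooled events
d1 <- sapply(t_i, function(t) sum(time == t & status == 1 & group == 1))

W  <- rep(1, length(t_i))   # log-rank weights: W(t_i) = 1
E1 <- Y1 * d / Y            # expected events in group 1 under H0

# Variance contribution per event time; a risk set of size 1 contributes 0
v <- ifelse(Y > 1, (Y1 / Y) * (1 - Y1 / Y) * (Y - d) / (Y - 1) * d, 0)

Z     <- sum(W * (d1 - E1))   # weighted observed minus expected
V     <- sum(W^2 * v)         # estimated variance of Z
chisq <- Z^2 / V              # 1 degree of freedom for two groups
p     <- pchisq(chisq, df = 1, lower.tail = FALSE)

print(data.frame(t_i, Y, Y1, d, d1, E1))
```

Group 1 has 2 observed events against 3.1 expected, so `Z` is -1.1 with variance 0.99. With only six subjects the chi-square approximation is not reliable; the point is the bookkeeping, not the p-value.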
Weight Functions
| Test Name | Description | Weight $W(t_i)$ |
|---|---|---|
| Log-Rank | Equal weight to all event times | $1$ |
| Gehan / Breslow | Generalization of Mann-Whitney-Wilcoxon / Kruskal-Wallis | $Y_i$ (pooled number at risk) |
| Tarone-Ware | Intermediate weighting between log-rank and Gehan | $\sqrt{Y_i}$ |
Interpretation
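The log-rank test ($W(t_i) = 1$) is most powerful when hazard ratios are constant over time. Gehan's test ($W(t_i) = Y_i$, the pooled number at risk at $t_i$) gives more weight to early event times, because the risk set is largest early in follow-up. Tarone-Ware ($W(t_i) = \sqrt{Y_i}$) balances the two.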
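To see how the three schemes distribute emphasis over follow-up, the sketch below applies each weight to a hypothetical sequence of risk-set sizes (the numbers are invented for illustration) and rescales each scheme to sum to one.

```r
# Hypothetical pooled risk-set sizes Y_i at five ordered event times
Y <- c(20, 15, 10, 5, 2)

weights <- data.frame(
  log_rank    = rep(1, length(Y)),  # W(t_i) = 1
  gehan       = Y,                  # W(t_i) = Y_i
  tarone_ware = sqrt(Y)             # W(t_i) = sqrt(Y_i)
)

# Rescale each column to sum to 1 so the schemes are directly comparable
relative <- as.data.frame(lapply(weights, function(w) w / sum(w)))

print(round(cbind(Y = Y, relative), 3))
```

The log-rank column is flat, the Gehan column falls off in proportion to the shrinking risk set, and the Tarone-Ware column falls off more slowly, sitting between the other two at the earliest event time.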
Example: Leukemia Remission Data
Case: Compare survival between two leukemia treatment groups — Group 1 (6-MP treatment) vs Group 2 (placebo).
```r
# Group 1 (6-MP treatment): remission times in weeks; status 1 = event, 0 = censored
time1 <- c(6, 6, 6, 7, 10, 13, 16, 22, 23, 6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35)
status1 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
# Group 2 (placebo): all times are observed events
time2 <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
status2 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
```
Step 1: KM curves:
```r
library(survival)
# Fit a separate Kaplan-Meier curve for each group
fit1 <- survfit(Surv(time1, status1) ~ 1)
fit2 <- survfit(Surv(time2, status2) ~ 1)
# Plot both curves without confidence bands
plot(fit1, conf.int = FALSE, col = "blue",
     xlab = "Time (weeks)", ylab = "Survival Probability")
lines(fit2, conf.int = FALSE, col = "red")
legend(19, 1, c("Treatment", "Placebo"), col = c("blue", "red"), lty = 1)
```
Step 2: Log-rank test:
```r
# Pool both groups into one dataset with a treatment indicator
time <- c(time1, time2)
status <- c(status1, status2)
treatment <- c(rep(1, length(time1)), rep(2, length(time2)))
fit <- survdiff(Surv(time, status) ~ treatment)
fit
# Output: N, Observed, Expected, (O-E)^2/E, (O-E)^2/V, Chisq, p-value
```
Interpretation: The log-rank test answers: “Do the two survival curves differ beyond random chance?” The KM plot provides a visual comparison; the log-rank test provides a formal test of it. A small p-value (for example, below 0.05) is evidence against the null hypothesis that the two groups have equal hazard functions.
Interpretation
For survival analysis, the log-rank test ($W(t_i) = 1$) is most powerful under the proportional hazards (PH) assumption. If PH is violated, consider alternative weights (Gehan/Breslow) or other methods.
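To try alternative weights on the leukemia data, note that `survdiff` exposes a `rho` argument for the G-rho family: `rho = 0` is the log-rank test and `rho = 1` is the Peto and Peto modification of the Gehan-Wilcoxon test, which is not the same as weighting by the number at risk. The base-R sketch below instead applies the three weights from the Weight Functions table directly. The function name `weighted_logrank` and its arguments are hypothetical, written for this illustration, and it handles only two groups.

```r
# Hypothetical helper (not part of the survival package): two-group weighted
# log-rank test; `weight` maps the pooled risk-set sizes Y_i to weights W(t_i)
weighted_logrank <- function(time, status, group, weight = function(Y) rep(1, length(Y))) {
  g1  <- group == sort(unique(group))[1]          # indicator for the first group
  t_i <- sort(unique(time[status == 1]))          # distinct event times
  Y   <- sapply(t_i, function(t) sum(time >= t))
  Y1  <- sapply(t_i, function(t) sum(time >= t & g1))
  d   <- sapply(t_i, function(t) sum(time == t & status == 1))
  d1  <- sapply(t_i, function(t) sum(time == t & status == 1 & g1))
  W   <- weight(Y)
  Z   <- sum(W * (d1 - Y1 * d / Y))               # weighted observed minus expected
  # Hypergeometric variance; a risk set of size 1 contributes 0
  v   <- ifelse(Y > 1, (Y1 / Y) * (1 - Y1 / Y) * (Y - d) / (Y - 1) * d, 0)
  V   <- sum(W^2 * v)
  chisq <- Z^2 / V
  list(Z = Z, V = V, chisq = chisq,
       p_value = pchisq(chisq, df = 1, lower.tail = FALSE))
}

# Leukemia remission data from the example (weeks; 1 = event, 0 = censored)
time1   <- c(6, 6, 6, 7, 10, 13, 16, 22, 23, 6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35)
status1 <- c(rep(1, 9), rep(0, 12))
time2   <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
status2 <- rep(1, 21)

time      <- c(time1, time2)
status    <- c(status1, status2)
treatment <- c(rep(1, length(time1)), rep(2, length(time2)))

res <- list(
  log_rank    = weighted_logrank(time, status, treatment),
  gehan       = weighted_logrank(time, status, treatment, weight = function(Y) Y),
  tarone_ware = weighted_logrank(time, status, treatment, weight = sqrt)
)

# Chi-square statistic and p-value for each weighting scheme
print(sapply(res, function(r) c(chisq = r$chisq, p_value = r$p_value)))
```

With unit weights, `Z` is the observed minus expected event count in the 6-MP group (9 observed against about 19.25 expected), and the chi-square should agree with the `Chisq` reported by `survdiff` in Step 2. If the three weightings lead to different conclusions on other data, that is a hint that the hazards may not be proportional.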