Tests whether the hazard functions of two or more groups are equal.

Definition

Let:

  • $t_1 < t_2 < \dots < t_D$: ordered distinct event times (across all groups pooled)
  • $D$: number of distinct event times
  • $K$: number of groups (populations) being compared
  • $d_{ij}$: number of events in group $j$ at time $t_i$
  • $Y_{ij}$: number at risk in group $j$ just before $t_i$
  • $d_i = \sum_{j=1}^{K} d_{ij}$: total events at time $t_i$ (pooled)
  • $Y_i = \sum_{j=1}^{K} Y_{ij}$: total at risk at time $t_i$ (pooled)
  • $W(t_i)$: weight function applied at time $t_i$ (see Weight Functions)
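As a rough illustration of these quantities, the sketch below tabulates the per-group and pooled event counts and numbers at risk at each distinct event time. The six observations are made up purely to keep the table short, and `status` is assumed to be coded 1 for an event and 0 for censoring.

```r
# Hypothetical toy data: 1 = event, 0 = censored (assumed coding)
time   <- c(1, 2, 2, 3, 4, 5)
status <- c(1, 1, 0, 1, 1, 0)
group  <- c(1, 2, 1, 2, 1, 2)

t_event <- sort(unique(time[status == 1]))  # ordered distinct event times
groups  <- sort(unique(group))

# Rows are event times, columns are groups
d_ij <- sapply(groups, function(g)
  sapply(t_event, function(ti) sum(time == ti & status == 1 & group == g)))
Y_ij <- sapply(groups, function(g)
  sapply(t_event, function(ti) sum(time >= ti & group == g)))

risk_table <- data.frame(
  t  = t_event,
  d1 = d_ij[, 1], d2 = d_ij[, 2], d = rowSums(d_ij),  # events: per group, pooled
  Y1 = Y_ij[, 1], Y2 = Y_ij[, 2], Y = rowSums(Y_ij)   # at risk: per group, pooled
)
risk_table
```

A subject censored at an event time is still counted as at risk at that time, which is why the group 1 observation censored at time 2 contributes to the at-risk count there.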

Hypotheses

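$$H_0: h_1(t) = h_2(t) = \dots = h_K(t) \quad \text{for all } t \le \tau$$

$$H_A: \text{at least one } h_j(t) \text{ differs for some } t \le \tau$$

where $\tau$ is the end of study time (maximum follow-up).

Under $H_0$, the expected hazard in each group at $t_i$ is the pooled estimate $d_i / Y_i$ (pooled events over pooled number at risk), so group $j$ is expected to contribute $Y_{ij} \, d_i / Y_i$ events at that time.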

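Test Statistic

For each group $j = 1, \dots, K$:

$$Z_j(\tau) = \sum_{i=1}^{D} W(t_i) \left[ d_{ij} - Y_{ij} \frac{d_i}{Y_i} \right]$$

Variance and Covariance

$$\hat{\sigma}_{jj} = \sum_{i=1}^{D} W(t_i)^2 \, \frac{Y_{ij}}{Y_i} \left( 1 - \frac{Y_{ij}}{Y_i} \right) \left( \frac{Y_i - d_i}{Y_i - 1} \right) d_i$$

$$\hat{\sigma}_{jg} = -\sum_{i=1}^{D} W(t_i)^2 \, \frac{Y_{ij}}{Y_i} \, \frac{Y_{ig}}{Y_i} \left( \frac{Y_i - d_i}{Y_i - 1} \right) d_i, \quad g \ne j$$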

Overall Test

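Let $\hat{\Sigma}$ be the estimated $(K-1) \times (K-1)$ covariance matrix of $(Z_1(\tau), \dots, Z_{K-1}(\tau))$, with entries $\hat{\sigma}_{jj}$ on the diagonal and $\hat{\sigma}_{jg}$ off-diagonal. Only $K - 1$ of the statistics are used because the $Z_j(\tau)$ sum to zero. Then:

$$\chi^2 = \left( Z_1(\tau), \dots, Z_{K-1}(\tau) \right) \hat{\Sigma}^{-1} \left( Z_1(\tau), \dots, Z_{K-1}(\tau) \right)^{\top}$$

which, under $H_0$, is approximately chi-squared distributed with $K - 1$ degrees of freedom.

For $K = 2$, the test simplifies to:

$$Z = \frac{\sum_{i=1}^{D} W(t_i) \left[ d_{i1} - Y_{i1} \frac{d_i}{Y_i} \right]}{\sqrt{\sum_{i=1}^{D} W(t_i)^2 \, \frac{Y_{i1}}{Y_i} \left( 1 - \frac{Y_{i1}}{Y_i} \right) \left( \frac{Y_i - d_i}{Y_i - 1} \right) d_i}}$$

which is approximately standard normal under $H_0$.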
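The computation described in this section can be sketched in a few lines of R. This is only an illustrative implementation, not how `survdiff` is written: the function name `weighted_logrank`, its arguments, and the choice to pass the weight as a function of the pooled number at risk are all assumptions made for the example, and `status` is assumed to be coded 1 for an event and 0 for censoring.

```r
# Sketch of a K-group weighted log-rank test (illustrative only).
# time:   follow-up times
# status: 1 = event, 0 = censored (assumed coding)
# group:  group labels
# W:      weight as a function of the pooled number at risk (assumption)
weighted_logrank <- function(time, status, group, W = function(Y) 1) {
  groups  <- sort(unique(group))
  K       <- length(groups)
  t_event <- sort(unique(time[status == 1]))  # ordered distinct event times

  Z     <- numeric(K)       # one weighted (observed - expected) sum per group
  Sigma <- matrix(0, K, K)  # estimated covariance of those sums

  for (ti in t_event) {
    Yj <- sapply(groups, function(g) sum(time >= ti & group == g))                # at risk per group
    dj <- sapply(groups, function(g) sum(time == ti & status == 1 & group == g))  # events per group
    Y  <- sum(Yj)           # pooled number at risk
    d  <- sum(dj)           # pooled number of events
    w  <- W(Y)

    Z <- Z + w * (dj - Yj * d / Y)  # observed minus expected events
    if (Y > 1) {                    # the term is 0/0 when Y = 1; treated as zero
      p     <- Yj / Y
      Sigma <- Sigma + w^2 * (diag(p, nrow = K) - outer(p, p)) * d * (Y - d) / (Y - 1)
    }
  }

  keep  <- seq_len(K - 1)           # drop the last group: the sums add to zero
  chisq <- sum(Z[keep] * solve(Sigma[keep, keep, drop = FALSE], Z[keep]))
  list(Z = Z, chisq = chisq, df = K - 1,
       p.value = pchisq(chisq, df = K - 1, lower.tail = FALSE))
}

# Hypothetical toy data: four events, two groups, no censoring
res <- weighted_logrank(time = c(1, 2, 3, 4), status = c(1, 1, 1, 1),
                        group = c(1, 2, 1, 2))
res$chisq
```

With two groups the quadratic form reduces to the square of the two-sample statistic, so `res$chisq` is that statistic squared. Passing `W = function(Y) Y` or `W = function(Y) sqrt(Y)` switches to the Gehan or Tarone-Ware weights.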

Weight Functions

Test Name          Weight $W(t_i)$    Description
Log-Rank           $1$                Equal weight to all event times
Gehan / Breslow    $Y_i$              Generalization of Mann-Whitney-Wilcoxon / Kruskal-Wallis
Tarone-Ware        $\sqrt{Y_i}$       Intermediate weighting between log-rank and Gehan

Interpretation

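The log-rank test ($W(t_i) = 1$) is most powerful when hazard ratios are constant over time. Gehan’s test ($W(t_i) = Y_i$) gives more weight to early event times, because the number at risk is largest early in follow-up. Tarone-Ware ($W(t_i) = \sqrt{Y_i}$) balances the two.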
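To see how differently the three schemes spread their emphasis, the sketch below computes each weight on a made-up sequence of pooled numbers at risk and rescales it to a share of the total. The values in `Y` are hypothetical; the weights used are 1 for log-rank, the pooled number at risk for Gehan, and its square root for Tarone-Ware.

```r
# Hypothetical pooled numbers at risk at five successive event times
Y <- c(40, 30, 20, 10, 5)

weights <- cbind(
  logrank     = rep(1, length(Y)),  # constant weight
  gehan       = Y,                  # number at risk
  tarone_ware = sqrt(Y)             # square root of number at risk
)

# Share of the total weight carried by each event time, per test
rel_weights <- sweep(weights, 2, colSums(weights), "/")
round(rel_weights, 3)
```

Each column sums to 1. The first row shows the Gehan weights placing the largest share on the earliest event time, the log-rank weights the smallest, and Tarone-Ware in between.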

Example: Leukemia Remission Data

Case: Compare survival between two leukemia treatment groups — Group 1 (6-MP treatment) vs Group 2 (placebo).

# Group 1 (6-MP treatment)
time1  <- c(6, 6, 6, 7, 10, 13, 16, 22, 23, 6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35)
status1 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 
# Group 2 (placebo)
time2  <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
status2 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

Step 1 — KM curves:

library(survival)
 
fit1 <- survfit(Surv(time1, status1) ~ 1)
fit2 <- survfit(Surv(time2, status2) ~ 1)
 
plot(fit1, conf.int="none", col="blue",
     xlab="Time (weeks)", ylab="Survival Probability")
lines(fit2, conf.int="none", col="red")
legend(19, 1, c("Treatment", "Placebo"), col=c("blue","red"), lty=1)

Step 2 — Log-rank test:

time      <- c(time1, time2)
status    <- c(status1, status2)
treatment <- c(rep(1, length(time1)), rep(2, length(time2)))
 
fit <- survdiff(Surv(time, status) ~ treatment)
fit
# Output: N, Observed, Expected, (O-E)^2/E, (O-E)^2/V, Chisq, p-value

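Interpretation: The log-rank test answers the question “Do the two survival curves differ by more than chance alone would explain?” The KM plot provides a visual comparison; the log-rank test provides a formal test of that difference. A small p-value (for example, below 0.05) is evidence against the null hypothesis of equal hazards in the two groups; it does not by itself measure the size of the treatment effect.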
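If the individual pieces of the test are needed as numbers rather than printed output, they can be pulled from the fitted object. The sketch below repeats the data so that it runs on its own, and uses the `chisq`, `n`, `obs` and `exp` components of the `survdiff` result; the variable names `df`, `p_value` and `obs_minus_exp` are just illustrative.

```r
library(survival)

# Leukemia remission data, repeated from the example above
time1   <- c(6, 6, 6, 7, 10, 13, 16, 22, 23, 6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35)
status1 <- c(rep(1, 9), rep(0, 12))
time2   <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23)
status2 <- rep(1, 21)

time      <- c(time1, time2)
status    <- c(status1, status2)
treatment <- c(rep(1, length(time1)), rep(2, length(time2)))

fit <- survdiff(Surv(time, status) ~ treatment)

df            <- length(fit$n) - 1                               # groups minus one
p_value       <- pchisq(fit$chisq, df = df, lower.tail = FALSE)  # upper-tail chi-squared p-value
obs_minus_exp <- fit$obs - fit$exp                               # observed minus expected events

p_value
obs_minus_exp
```

A negative entry in `obs_minus_exp` means that group had fewer events than expected under equal hazards.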

Interpretation

For survival analysis, the log-rank test ($W(t_i) = 1$) is most powerful under the proportional hazards (PH) assumption, that is, when the hazard ratio between groups is constant over time. If PH is violated, consider alternative weights (Gehan/Breslow) or other methods.
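One possible way to act on this in R is sketched below, again on the leukemia data so that it runs on its own. Two caveats: `survdiff` does not offer the Gehan weight directly, and its `rho = 1` option gives the Peto-Peto modification of the Gehan-Wilcoxon test, which also emphasizes early event times; and `cox.zph` is used here as one common diagnostic of the PH assumption, not as the only option. The object names `dat`, `lr`, `pp` and `ph` are illustrative.

```r
library(survival)

# Leukemia remission data, repeated from the example above
dat <- data.frame(
  time = c(6, 6, 6, 7, 10, 13, 16, 22, 23, 6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35,
           1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23),
  status    = c(rep(1, 9), rep(0, 12), rep(1, 21)),
  treatment = rep(c(1, 2), each = 21)
)

# rho = 0 (the default) is the log-rank test
lr <- survdiff(Surv(time, status) ~ treatment, data = dat)

# rho = 1 is the Peto-Peto modification of the Gehan-Wilcoxon test
pp <- survdiff(Surv(time, status) ~ treatment, data = dat, rho = 1)

# Diagnostic for proportional hazards based on scaled Schoenfeld residuals
ph <- cox.zph(coxph(Surv(time, status) ~ treatment, data = dat))

c(logrank = lr$chisq, peto_peto = pp$chisq)
ph$table
```

A small p-value in `ph$table` would suggest the hazard ratio changes over time, in which case the early-weighted test may be the more relevant comparison.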