6.1 Aim

The de facto standard method for analysing time-to-event data in medical research is the Cox proportional hazards model. It is best suited to settings where relative risk estimates are the quantities of interest. Often, however, the focus is on absolute measures of risk. In that context the baseline hazard must be modelled, either with a standard parametric survival model based on a simple distribution (such as the exponential, Weibull, or Gompertz distribution) or with the flexible parametric modelling approach (Royston and Parmar 2002), which can better capture the shape of complex hazard functions. The latter approach requires choosing the number of degrees of freedom for the spline term used to approximate the baseline hazard; in practice, sensitivity analyses and information criteria (AIC, BIC) have been used to select the best model. Recently, Rutherford, Crowther, and Lambert (2015) showed via simulation studies that, provided a sufficient number of degrees of freedom is used, the hazard function approximated by restricted cubic splines fits well for a number of complex hazard shapes, and the estimated hazard ratios are insensitive to misspecification of the baseline hazard.

Moreover, it is common to encounter clustered survival data, where the overall study population can be divided into heterogeneous clusters of homogeneous observations; examples are given in Chapter 2. As a consequence, survival times of individuals within a cluster are likely to be correlated and need to be analysed as such by including a random effect, e.g. a frailty term.
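As a compact reminder of the two modelling ingredients described above, the following sketch writes down a shared frailty proportional hazards model and a spline-based baseline on the log cumulative hazard scale in the style of Royston and Parmar (2002). The notation is assumed for illustration and is not taken from this chapter: $\alpha_i$ is the frailty of cluster $i$, $x_{ij}$ the covariates of subject $j$ in cluster $i$, and $v_k(\cdot)$ the restricted cubic spline basis functions for $K$ interior knots.

```latex
\begin{align*}
  % Shared frailty model: all subjects in cluster i share the multiplicative term alpha_i
  h_{ij}(t \mid \alpha_i) &= \alpha_i \, h_0(t) \exp\left(x_{ij}^\top \beta\right) \\
  % Flexible baseline: restricted cubic spline in log time on the log cumulative hazard scale
  \ln H_0(t) &= \gamma_0 + \gamma_1 \ln t + \sum_{k=1}^{K} \gamma_{k+1} \, v_k(\ln t)
\end{align*}
```

With no interior knots ($K = 0$) the spline reduces to $\gamma_0 + \gamma_1 \ln t$, which is a Weibull baseline; each additional interior knot adds one degree of freedom.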

Flexible parametric survival models are a robust alternative to standard parametric survival models when the shape of the hazard function is complex; with a sufficient number of degrees of freedom, e.g. 2 or more, the spline-based approach is able to capture the underlying shape of the hazard function with minimal bias. AIC and BIC can guide the choice of the best-fitting model, but they tend to agree to within 1 or 2 degrees of freedom in practice (Rutherford, Crowther, and Lambert 2015). Analogously, the impact of the choice of a particular parametric frailty distribution on the regression coefficients is minimal (Pickles and Crouchley 1995). By contrast, little is known about the impact of misspecifying the baseline hazard in survival models with frailty terms.

My aim with this work is to assess the impact of misspecifying the baseline hazard or the frailty distribution on the estimated regression coefficients, frailty variance, and absolute, marginal risk measures such as the integrated difference of survival curves and the survival difference at given time points. I will simulate data under a variety of data-generating mechanisms, and then compare a set of models that include the Cox model with frailties, fully parametric survival models with frailty, models with flexible baseline hazard, and models with flexible baseline hazard and a penalty for the complexity of the spline term.
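To make the idea of a data-generating mechanism concrete, the following is a minimal sketch of simulating one clustered dataset. Everything specific in it is an assumption made for illustration and not the design used in this chapter: a Weibull baseline hazard, a Gamma frailty with mean 1 and variance `theta`, a single binary covariate, administrative censoring at `max_time`, and the function name `simulate_clustered_weibull`. Survival times are drawn by inverting the conditional survival function.

```python
import numpy as np


def simulate_clustered_weibull(n_clusters, n_per_cluster, lam, p, beta,
                               theta, max_time, seed=None):
    """Simulate clustered survival data (illustrative assumptions only).

    Conditional hazard: h(t | x, a) = a * lam * p * t**(p - 1) * exp(beta * x),
    with cluster frailty a ~ Gamma(shape=1/theta, scale=theta).
    """
    rng = np.random.default_rng(seed)

    # One frailty per cluster: mean 1, variance theta.
    frailty = rng.gamma(shape=1.0 / theta, scale=theta, size=n_clusters)

    # Cluster label and a binary covariate (e.g. treatment arm) per subject.
    cluster = np.repeat(np.arange(n_clusters), n_per_cluster)
    x = rng.binomial(1, 0.5, size=cluster.size)

    # Inversion: S(t) = exp(-rate * t**p), so rate * T**p ~ Exponential(1).
    rate = lam * frailty[cluster] * np.exp(beta * x)
    event_time = (rng.standard_exponential(cluster.size) / rate) ** (1.0 / p)

    # Administrative censoring at max_time.
    status = (event_time <= max_time).astype(int)
    time = np.minimum(event_time, max_time)

    return {"cluster": cluster, "x": x, "time": time,
            "status": status, "frailty": frailty}


# Hypothetical parameter values, chosen only to show the call.
data = simulate_clustered_weibull(n_clusters=50, n_per_cluster=20, lam=0.5,
                                  p=1.2, beta=-0.5, theta=0.5, max_time=5.0,
                                  seed=1)
```

Swapping the Gamma draw for another mean-one distribution, or the Weibull inversion for a more complex baseline, would give the alternative mechanisms against which misspecified models can be compared.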

References

Royston, Patrick, and Mahesh KB Parmar. 2002. “Flexible Parametric Proportional-Hazards and Proportional-Odds Models for Censored Survival Data, with Application to Prognostic Modelling and Estimation of Treatment Effects.” Statistics in Medicine 21 (15): 2175–97. doi:10.1002/sim.1203.

Rutherford, Mark J, Michael J Crowther, and Paul C Lambert. 2015. “The Use of Restricted Cubic Splines to Approximate Complex Hazard Functions in the Analysis of Time-to-Event Data: A Simulation Study.” Journal of Statistical Computation and Simulation 85 (4): 777–93. doi:10.1080/00949655.2013.845890.

Pickles, Andrew, and Robert Crouchley. 1995. “A Comparison of Frailty Models for Multivariate Survival Data.” Statistics in Medicine 14 (13): 1447–61. doi:10.1002/sim.4780141305.