6.7 Conclusions

I showed that estimates of regression coefficients, frailty variance, and difference in expectation of life are relatively insensitive to misspecification of the frailty distribution. Conversely, misspecifying the baseline hazard has serious consequences, as it affects both relative and absolute measures of risk as well as estimates of heterogeneity. This is particularly important for the regression coefficients: a bias of up to 0.13 on the log-hazard ratio scale corresponds to a difference of approximately 14% on the relative risk scale (exp(0.13) ≈ 1.14), which is clinically meaningful. All models seemed to produce biased estimates of the frailty variance, which may be due to the small number of clusters examined here; exploring additional scenarios would provide greater insight into this issue. The bias in the difference in 5-year expectation of life seems to be less clinically relevant (up to 1.5 months), but it is worth bearing in mind nonetheless. The fully parametric models performed well, as expected, when correctly specified, but relatively simple hazard forms may be too restrictive and unrealistic in practice; conversely, flexible parametric models were robust to all the baseline hazard shapes considered and generally performed best, even compared with the Cox model. Furthermore, this robustness seemed to be independent of both the number of knots used to model the baseline hazard and the estimation method (full or penalised likelihood).
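
As a quick check of how a bias on the log-hazard ratio scale translates to the relative scale, the minimal sketch below exponentiates the bias. The helper name `relative_change_from_log_bias` is hypothetical and introduced only for illustration; the value 0.13 is the maximum bias quoted in this section.

```python
import math


def relative_change_from_log_bias(bias):
    """Relative change in a hazard ratio implied by an additive bias on the log scale.

    If the estimated log-hazard ratio is off by `bias`, the hazard ratio itself
    is multiplied by exp(bias), i.e. changed by exp(bias) - 1 in relative terms.
    """
    return math.exp(bias) - 1.0


bias = 0.13  # maximum bias on the log-hazard ratio scale reported above
factor = math.exp(bias)

print(f"multiplicative factor: {factor:.3f}")  # 1.139
print(f"relative change: {100 * relative_change_from_log_bias(bias):.1f}%")  # 13.9%
```

Because the conversion is multiplicative, the same additive bias implies the same relative change whatever the true hazard ratio.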