3.2 Estimation process

Estimation of a joint model for longitudinal and survival data is a non-trivial task. The complexity of jointly modelling the longitudinal component and the survival component motivated the use of two-stage procedures, as mentioned in Section 3. With that approach, the longitudinal component is modelled and estimated separately; subject-specific predictions from the longitudinal model are then plugged into the survival model as time-varying covariates. Despite its simplicity, this approach has been shown to produce substantial bias and poor coverage (Tsiatis and Davidian 2001; Sweeting and Thompson 2011). Therefore, an approach that models both processes jointly is required. In particular, two approaches are predominant: a full likelihood approach and a Bayesian approach. Both have appealing characteristics, but both are computationally intensive.

Focusing on the full likelihood approach, it is possible to formulate the joint likelihood (Rizopoulos 2012) for the overall parameter vector \(\theta = \{\theta_t, \theta_y, \theta_b\}\), formed by the parameters of the survival component, the parameters of the longitudinal component, and the elements of the variance-covariance matrix of the random effects, respectively. The joint distribution of the survival time \(T_i\), the event indicator \(d_i\), and the longitudinal response \(y_i\), conditional on the random effects \(b_i\), can be expressed as:
\[ f(T_i, d_i, y_i | b_i, \theta) = f(T_i, d_i | b_i, \theta) f(y_i | b_i, \theta), \]
with
\[ f(y_i | b_i, \theta) = \prod_{j = 1} ^ {n_i} f(y_i(t_{ij}) | b_i, \theta). \]
It is important to note that the survival process and the longitudinal process are assumed to be independent, conditionally on the random effects \(b_i\). It follows that the contribution to the log-likelihood for the \(i\)th patient is
\[ \begin{aligned} \log L_i(\theta) &= \log \int_{-\infty} ^ {+\infty} f(T_i, d_i, y_i, b_i; \theta) \ db_i \\ &= \log \int_{-\infty} ^ {+\infty} f(T_i, d_i | b_i, \theta_t) \left[ \prod_{j = 1} ^ {n_i} f(y_i(t_{ij}) | b_i, \theta_y) \right] f(b_i | \theta_b) \ db_i \end{aligned} \]
with \(f(T_i, d_i | b_i, \theta_t)\) the contribution to the likelihood of the survival component of the model:
\[ \begin{aligned} f(T_i, d_i | b_i, \theta_t) &= h_i(T_i | M_i(T_i), \theta_t) ^ {d_i} S_i(T_i | M_i(T_i), \theta_t) \\ &= \left[ h_0(T_i) \exp(W \psi + \alpha m_i(T_i)) \right] ^ {d_i} \exp \left[ -\int_0^{T_i} h_0(u) \exp(W \psi + \alpha m_i(u)) \ du \right], \end{aligned} \]
\(f(y_i(t_{ij}) | b_i, \theta_y)\) the contribution to the likelihood of the longitudinal process at time \(t_{ij}\):
\[ f(y_i(t_{ij}) | b_i, \theta_y) = (2 \pi \sigma ^ 2) ^ {-1/2} \exp \left[ -\frac{(y_i(t_{ij}) - m_i(t_{ij})) ^ 2}{2 \sigma ^ 2} \right], \]
and \(f(b_i | \theta_b)\) the density of the random effects:
\[ f(b_i | \theta_b) = (2 \pi) ^ {-q_b / 2} | \Sigma | ^ {-1 / 2} \exp \left[- \frac{b_i^T \Sigma ^ {-1} b_i}{2}\right], \]
with \(q_b\) being the dimension of the random effects.
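To make the structure of this log-likelihood contribution concrete, the following is a minimal Python sketch for a deliberately simple special case. Everything specific in it is an assumption made for illustration and does not come from the text: a single random intercept (\(q_b = 1\)), a linear trajectory \(m_i(t) = \beta_0 + \beta_1 t + b_i\), a constant baseline hazard \(h_0(t) = \lambda\), a scalar linear predictor `w_psi` standing in for \(W \psi\), and a plain trapezoid rule on a truncated grid in place of the Gauss-Hermite quadrature that is more commonly used for the integral over \(b_i\). The function name and its arguments are hypothetical.

```python
import math


def log_lik_contribution(T, d, times, y, beta0, beta1, sigma, sigma_b,
                         lam, alpha, w_psi=0.0, n_grid=2001, width=8.0):
    """Log-likelihood contribution of one subject (random-intercept sketch).

    Illustrative assumptions: m_i(t) = beta0 + beta1 * t + b_i,
    constant baseline hazard h_0(t) = lam, and b_i ~ N(0, sigma_b^2).
    """
    def log_integrand(b):
        # Survival part: d_i * log h_i(T_i) minus the cumulative hazard.
        log_hazard = math.log(lam) + w_psi + alpha * (beta0 + beta1 * T + b)
        k = alpha * beta1
        # Closed form of int_0^T exp(k * u) du, valid under the assumptions above.
        growth = T if abs(k) < 1e-12 else math.expm1(k * T) / k
        cum_hazard = lam * math.exp(w_psi + alpha * (beta0 + b)) * growth
        log_surv = d * log_hazard - cum_hazard
        # Longitudinal part: product of normal densities, on the log scale.
        log_long = sum(
            -0.5 * math.log(2 * math.pi * sigma ** 2)
            - (yj - (beta0 + beta1 * tj + b)) ** 2 / (2 * sigma ** 2)
            for tj, yj in zip(times, y)
        )
        # Density of the scalar random effect.
        log_re = (-0.5 * math.log(2 * math.pi * sigma_b ** 2)
                  - b ** 2 / (2 * sigma_b ** 2))
        return log_surv + log_long + log_re

    # Trapezoid rule over [-width * sigma_b, +width * sigma_b], computed with
    # a log-sum-exp shift so that very small integrand values do not underflow.
    lower = -width * sigma_b
    step = 2 * width * sigma_b / (n_grid - 1)
    logs = [log_integrand(lower + i * step) for i in range(n_grid)]
    shift = max(logs)
    total = sum(
        (0.5 if i in (0, n_grid - 1) else 1.0) * math.exp(v - shift)
        for i, v in enumerate(logs)
    )
    return shift + math.log(total * step)


# Example call with made-up values: three measurements, event observed at T = 2.5.
print(log_lik_contribution(T=2.5, d=1, times=[0.0, 1.0, 2.0], y=[1.1, 1.4, 1.5],
                           beta0=1.0, beta1=0.2, sigma=0.5, sigma_b=1.0,
                           lam=0.1, alpha=0.3))
```

With a vector of random effects or a time-varying baseline hazard, neither the inner integral in the survival function nor the outer integral over \(b_i\) simplifies in this way, and both would need numerical approximation.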

Historically, the predominant method for maximising the full joint likelihood has been the Expectation-Maximisation algorithm (Dempster, Laird, and Rubin 1977); alternatively, it is possible to maximise the full joint likelihood with general purpose optimisers such as the Newton-Raphson algorithm. Nevertheless, significant computational challenges persist: the integral over the random effects and, in general, the integral in the survival function have no closed-form solution and must be approximated numerically at every evaluation of the likelihood.
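As a rough illustration of the general purpose optimiser route, the sketch below applies Newton-Raphson to a toy problem. It is not the joint likelihood itself: the objective is a one-parameter exponential survival log-likelihood with censoring, chosen only because its maximum is known in closed form, and the score and Hessian are approximated by central finite differences. The data values and the names `newton_raphson` and `toy_loglik` are hypothetical.

```python
import math


def newton_raphson(loglik, theta0, h=1e-4, tol=1e-8, max_iter=50):
    """Maximise a scalar log-likelihood with Newton-Raphson updates.

    The first and second derivatives are approximated by central
    finite differences with step size h.
    """
    theta = theta0
    for _ in range(max_iter):
        f_plus, f_mid, f_minus = loglik(theta + h), loglik(theta), loglik(theta - h)
        score = (f_plus - f_minus) / (2 * h)
        hessian = (f_plus - 2 * f_mid + f_minus) / h ** 2
        step = score / hessian
        theta -= step
        if abs(step) < tol:
            break
    return theta


# Toy data (made up): follow-up times and event indicators (1 = event, 0 = censored).
times = [2.0, 3.5, 1.2, 4.0, 0.7]
events = [1, 0, 1, 1, 0]


def toy_loglik(theta):
    # Exponential model with hazard exp(theta):
    # log L = (number of events) * theta - exp(theta) * (total follow-up time).
    return sum(events) * theta - math.exp(theta) * sum(times)


theta_hat = newton_raphson(toy_loglik, theta0=0.0)
# For this toy model the maximiser is log(events / total follow-up time).
print(theta_hat, math.log(sum(events) / sum(times)))
```

For a joint model, each evaluation of the objective would itself involve the numerical integrals described above, which is where most of the computational cost arises.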

References

Tsiatis, Anastasios A, and Marie Davidian. 2001. “A Semiparametric Estimator for the Proportional Hazards Model with Longitudinal Covariates Measured with Error.” Biometrika 88 (2): 447–58. http://www.jstor.org/stable/2673492.

Sweeting, Michael J, and Simon G Thompson. 2011. “Joint Modelling of Longitudinal and Time-to-Event Data with Application to Predicting Abdominal Aortic Aneurysm Growth and Rupture.” Biometrical Journal 53 (5): 750–63. doi:10.1002/bimj.201100052.

Rizopoulos, Dimitris. 2012. Joint Models for Longitudinal and Time-to-Event Data: With Applications in R. Biostatistics. Chapman & Hall / CRC.

Dempster, Arthur P, Nan M Laird, and Donald B Rubin. 1977. “Maximum Likelihood from Incomplete Data via the EM Algorithm.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 39 (1): 1–38. http://www.jstor.org/stable/2984875.