9 Future research developments

In this Chapter I will present my plans for research during the second year of my PhD. I also include a Gantt chart laying out the projects planned for years 2 and 3 of my PhD Figure 9.1.

First of all, I plan to conclude the simulation study on the impact of model misspecification in survival models with frailty terms. Simulations for some scenarios are currently still running on the University high-performance computing (HPC) facilities at the time of writing, and should finish soon. Consequently, I will summarise relevant results and I will finish writing up the project into a paper that will be submitted to a journal for publication by the end of the year. Current potential target journals are the Biometrical Journal and Statistics in Medicine.

At the same time, I will be planning the next project on modelling the visiting process and investigating the impact of an informative visiting process on inference using longitudinal data originating from healthcare records. This project aims to shed light on how an informative visiting process affects the analysis of longitudinal data that are intermittently and irregularly measured and recorded, in order to provide practical advice to applied researchers. In practice, I will be simulating complex survival data along with one or more longitudinal biomarkers under a variety of biologically plausible data-generating mechanisms; for instance, I will vary:

  1. the underlying risk of event, in terms of magnitude and shape of the baseline hazard;

  2. the number and frequency of longitudinal measurements;

  3. features of the longitudinal process such as functional form over time;

  4. the strength of the informative observation process, i.e. the magnitude of the association between the underlying disease process and the observation process;

  5. the shape of the association between the observation process and the disease process, that is, the parametrisation that links the two processes (e.g. current value parametrisation, intercept and slope, cumulative effect, etc.).

Then, I will compare different analytical approaches proposed in the literature to tackle the problem, starting with simple approaches such as including the number of preceding measurements in the model as a proxy of disease severity moving onto more complex methods such as those introduced in Section 8. Some methods will be directly applicable using existing software, while others will require developing ad-hoc software if an existing implementation is not available in standard statistical software. Throughout this project we will be collaborating with Dr. Jessica Barrett from the MRC Biostatistic Unit in Cambridge to discuss factors that may affect the results of this study and what methods to include and compare. Once the planning phase is completed, I will code the simulation study and run it using the HPC facilities of the University. Finally, I plan to write up the project into a manuscript approximately at the end of my second year.

During my second year I will also work on two additional projects. First, an applied project in the area of cardiovascular epidemiology using joint models for multiple longitudinal biomarkers and survival data using CALIBER data (Denaxas et al. 2012). CALIBER includes a wide variety of biomarkers such as systolic and diastolic blood pressure, body-mass index, high- and low-density cholesterol, and so on. These biomarkers are likely to be correlated as they change over time, and they may improve cardiovascular risk prediction and clear up the complex relationship between changes in the biomarkers and the risk of adverse events. Specifically, I will select a cohort of individuals with stable angina and type 2 diabetes (for whom regular monitoring of blood pressure is recommended) and evaluate the association between multiple, longitudinally measured biomarkers and the risk of adverse coronary events, fatal and non-fatal. Second, I will be collaborating Dr. Michael Harhay from the Department of Epidemiology and Biostatistics at the University of Pennsylvania to develop a tutorial on using joint models for longitudinal and survival data in applied settings; specifically, we will focus on the setting of intensive care medicine and we will include practical examples using real world data from clinical trials.

The applied project using CALIBER data will inform further projects if time allows it. A possible project could consist in studying and developing discrimination and calibration tools to use with multiple longitudinal biomarkers. Such tools are really important, as it is fundamental to be able to discern whether the addition of longitudinal biomarkers improves predictions or not. This project would have wide reaching consequences, as it would provide guidance in the use of joint models for longitudinal and survival data with multiple longitudinal biomarkers when the aim is prediction.

Finally, an ongoing task will be the continuous development and expansion of the interactive tool for exploring results from simulation studies. Potential developments will include:

  • polishing the underlying engine used to computed summary statistics;

  • including more plots;

  • allowing custom faceted plots and tables comparing multiple factors at once.

I aim to produce a polished version of SiReX to publish on-line to a wider audience as soon as possible, and write a manuscript presenting the tool for submission to a journal such as the Journal of Statistical Software.

Gantt chart for current and future projects.

Figure 9.1: Gantt chart for current and future projects.

References

Denaxas, Spiros C, Julie George, Emily Herrett, Anoop D Shah, Dipak Kalra, Aroon D Hingorani, Mika Kivimaki, Adam D Timmis, Liam Smeeth, and Harry Hemingway. 2012. “Data Resource Profile: Cardiovascular Disease Research Using Linked Bespoke Studies and Electronic Health Records (CALIBER).” International Journal of Epidemiology 41: 1625–38. doi:10.1093/ije/dys188.