Updating to {comorbidity} 1.0.0
Alessandro Gasparini
2024-07-16
Source: vignettes/C-changes.Rmd
The 1.0.0 release of the {comorbidity} package is a substantial rewrite of the package API and internals. Unfortunately, it introduces several breaking changes to the behaviour of the user-facing comorbidity() function. This article describes some of these changes, with examples using simulated data and comparisons with the previous API.
The simulated dataset (with 10 subjects and a total of 250 ICD codes) that we will be using throughout this article can be generated as follows:
library(comorbidity)
set.seed(2837465)
sim_data <- data.frame(
id = sample(x = seq(10), size = 250, replace = TRUE),
code = sample_diag(n = 250)
)
sim_data <- sim_data[order(sim_data$id), ]
str(sim_data)
#> 'data.frame': 250 obs. of 2 variables:
#> $ id : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ code: chr "S667" "V596" "R874" "S26" ...
We use ICD-10 only (to simplify this document), but everything applies to ICD-9 as well (with the obvious adjustments).
Comorbidity Mapping and Scoring are Now Distinct Functions
With the previous release of {comorbidity}, the comorbidity mapping and scoring algorithms were applied jointly with a single function call:
comorbidity(x = sim_data, id = "id", code = "code", score = "charlson", icd = "icd10", assign0 = FALSE)
#> id ami chf pvd cevd dementia copd rheumd pud mld diab diabwc hp rend canc
#> 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1
#> 2 2 0 0 0 0 0 0 0 0 0 1 1 0 0 1
#> 3 3 0 0 0 0 0 0 0 1 0 0 0 0 0 0
#> 4 4 0 0 0 1 0 0 0 0 0 0 0 0 0 1
#> 5 5 0 0 0 0 0 0 0 0 0 1 0 0 0 0
#> 6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#> 7 7 0 0 0 0 0 0 0 0 0 0 0 0 1 1
#> 8 8 0 0 0 1 0 0 1 0 0 1 1 0 0 1
#> 9 9 0 0 0 0 0 0 0 0 0 0 0 0 0 1
#> 10 10 0 0 0 0 0 0 0 0 0 0 0 0 0 1
#> msld metacanc aids score index wscore windex
#> 1 0 0 0 2 1-2 4 3-4
#> 2 0 0 0 3 3-4 5 >=5
#> 3 0 0 0 1 1-2 1 1-2
#> 4 0 0 0 2 1-2 3 3-4
#> 5 0 0 0 1 1-2 1 1-2
#> 6 0 0 0 0 0 0 0
#> 7 0 0 0 2 1-2 4 3-4
#> 8 0 0 0 5 >=5 7 >=5
#> 9 0 0 0 1 1-2 2 1-2
#> 10 0 0 0 1 1-2 2 1-2
Note that, as of {comorbidity} version 1.0.4, the ami condition has been renamed to mi. See #53 on GitHub for more details.
With the new release, we first need to apply the mapping algorithm:
com <- comorbidity(x = sim_data, id = "id", code = "code", map = "charlson_icd10_quan", assign0 = FALSE)
com
#> id mi chf pvd cevd dementia cpd rheumd pud mld diab diabwc hp rend canc msld
#> 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
#> 2 2 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0
#> 3 3 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
#> 4 4 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0
#> 5 5 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
#> 6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#> 7 7 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0
#> 8 8 0 0 0 1 0 0 1 0 0 1 1 0 0 1 0
#> 9 9 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
#> 10 10 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
#> metacanc aids
#> 1 0 0
#> 2 0 0
#> 3 0 0
#> 4 0 0
#> 5 0 0
#> 6 0 0
#> 7 0 0
#> 8 0 0
#> 9 0 0
#> 10 0 0
…which yields the same comorbidity indicators shown above (apart from the column names, where ami and copd now appear as mi and cpd).
Then, if we need to calculate comorbidity scores, we can use the score() function:
score(com, assign0 = FALSE, weights = NULL)
#> [1] 2 3 1 2 1 0 2 5 1 1
#> attr(,"map")
#> [1] "charlson_icd10_quan"
This yields unweighted scores (i.e. equivalent to the score column above). If we need weighted scores (e.g. the wscore column above, which uses the original Charlson score weights from 1987), we can pass the name of a supported scoring algorithm:
score(com, assign0 = FALSE, weights = "charlson")
#> [1] 4 5 1 3 1 0 4 7 2 2
#> attr(,"map")
#> [1] "charlson_icd10_quan"
#> attr(,"weights")
#> [1] "charlson"
Once again, all the results from the score() function are equivalent to what we obtained using the previously released version.
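The previous release also returned the categorised index and windex columns, which score() does not produce. A minimal sketch of how they could be rebuilt in base R is given below; charlson_index() is a hypothetical helper (not part of {comorbidity}), the cut points are an assumption inferred from the category labels printed above, and the score vectors are copied from the output shown earlier rather than recomputed.

```r
# Hypothetical helper (not part of {comorbidity}): bin Charlson scores into
# the categories used by the old `index` and `windex` columns.
charlson_index <- function(x) {
  cut(
    as.numeric(x), # drop the attributes attached by score()
    breaks = c(0, 1, 2.5, 4.5, Inf),
    labels = c("0", "1-2", "3-4", ">=5"),
    right = FALSE # intervals are [0, 1), [1, 2.5), [2.5, 4.5), [4.5, Inf)
  )
}

# Scores copied from the score() output printed above
unweighted <- c(2, 3, 1, 2, 1, 0, 2, 5, 1, 1)
weighted <- c(4, 5, 1, 3, 1, 0, 4, 7, 2, 2)

# Assemble a data frame resembling the score columns of the old output
old_style <- data.frame(
  id = seq(10),
  score = unweighted,
  index = charlson_index(unweighted),
  wscore = weighted,
  windex = charlson_index(weighted)
)
old_style
```

In practice, the literal vectors would be replaced by the values returned by score(com, ...), for instance by assigning them to new columns of a copy of com.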
Supported Mapping and Scoring Algorithms
The new version includes updated comorbidity mapping and scoring algorithms. Furthermore, it is designed to simplify the addition of new scores in the future. The currently supported comorbidity mapping algorithms are described in the following vignette:
vignette("02-comorbidityscores", package = "comorbidity")
Alternatively, a new function is provided to display supported algorithms in the R console:
available_algorithms()
#> Supported comorbidity mapping algorithms:
#> * charlson_icd9_quan
#> * charlson_icd10_quan
#> * charlson_icd10_se
#> * charlson_icd10_am
#> * charlson_icd10_am_ucodes
#> * elixhauser_icd9_quan
#> * elixhauser_icd10_quan
#>
#> Supported scoring weights algorithms:
#> * For charlson_icd9_quan: charlson, quan
#> * For charlson_icd10_quan: charlson, quan
#> * For charlson_icd10_se: charlson, quan
#> * For charlson_icd10_am: charlson, quan
#> * For charlson_icd10_am_ucodes: charlson, quan
#> * For elixhauser_icd9_quan: vw, swiss
#> * For elixhauser_icd10_quan: vw, swiss
This list is generated automatically from the internal data structures, so it should always be up to date.
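As an illustration of how the names listed above fit together, the following sketch pairs two of the mapping algorithms with weights that the listing reports as supported for them. It regenerates the simulated data from the start of this article so that it is self-contained; the object names are arbitrary, and the final data frame assumes that both mappings return one row per subject in the same order.

```r
library(comorbidity)

# Same simulated data as at the start of this article
set.seed(2837465)
sim_data <- data.frame(
  id = sample(x = seq(10), size = 250, replace = TRUE),
  code = sample_diag(n = 250)
)

# Charlson mapping, scored with the "quan" weights
charlson <- comorbidity(
  x = sim_data, id = "id", code = "code",
  map = "charlson_icd10_quan", assign0 = FALSE
)
charlson_quan <- score(charlson, weights = "quan", assign0 = FALSE)

# Elixhauser mapping, scored with the "vw" weights
elix <- comorbidity(
  x = sim_data, id = "id", code = "code",
  map = "elixhauser_icd10_quan", assign0 = FALSE
)
elix_vw <- score(elix, weights = "vw", assign0 = FALSE)

# Collect both scores side by side (assumes identical row order by subject)
data.frame(
  id = charlson$id,
  charlson_quan = as.numeric(charlson_quan),
  elix_vw = as.numeric(elix_vw)
)
```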
Computational Speed
Rewriting the package internals made it possible to optimise the code for speed and efficiency. Using simulated data, we estimated that the main comorbidity mapping function (i.e. comorbidity()) should be approximately twice as fast as in version 0.5.3, across a variety of sample sizes.
The mapping step is the main computational bottleneck, as applying the scoring algorithm should be very fast in general.
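A rough way to check this on your own machine is to time the two steps separately with system.time(). The sketch below is not the benchmark used for the comparison with version 0.5.3; the dataset size and seed are arbitrary assumptions, and timings will vary between systems.

```r
library(comorbidity)

# Larger simulated dataset: 100,000 codes spread over 10,000 subjects
# (sizes chosen arbitrarily for illustration)
set.seed(1)
n <- 1e5
big_data <- data.frame(
  id = sample(x = seq(1e4), size = n, replace = TRUE),
  code = sample_diag(n = n)
)

# Time the mapping step
timing_map <- system.time(
  com_big <- comorbidity(
    x = big_data, id = "id", code = "code",
    map = "charlson_icd10_quan", assign0 = FALSE
  )
)

# Time the scoring step
timing_score <- system.time(
  s <- score(com_big, weights = "charlson", assign0 = FALSE)
)

# Elapsed time (in seconds) for each step
timing_map["elapsed"]
timing_score["elapsed"]
```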
Reverting to the Previous Release
I understand that this new release might break some workflows, and I apologise for that. If you have any feedback, please open an issue on GitHub (strongly preferred) or e-mail the maintainer of the package.
Finally, if required, you can revert to the previous release by installing from GitHub:
library(remotes)
remotes::install_github("ellessenne/comorbidity@0.5.3")
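If a script has to run under either release, one option is to branch on the installed version. The following is only a sketch: charlson_wscore() is a hypothetical wrapper (not part of {comorbidity}), and it assumes the input has columns named id and code, as in the simulated data used in this article.

```r
library(comorbidity)

# Hypothetical wrapper (not part of {comorbidity}): weighted Charlson score
# for ICD-10 codes, using whichever API the installed version provides.
charlson_wscore <- function(data) {
  if (utils::packageVersion("comorbidity") >= "1.0.0") {
    # New API: mapping and scoring are separate steps
    com <- comorbidity(
      x = data, id = "id", code = "code",
      map = "charlson_icd10_quan", assign0 = FALSE
    )
    as.numeric(score(com, weights = "charlson", assign0 = FALSE))
  } else {
    # Old API: a single call returns the weighted score as a column
    old <- comorbidity(
      x = data, id = "id", code = "code",
      score = "charlson", icd = "icd10", assign0 = FALSE
    )
    old$wscore
  }
}

# Usage with the simulated data from the start of this article
set.seed(2837465)
sim_data <- data.frame(
  id = sample(x = seq(10), size = 250, replace = TRUE),
  code = sample_diag(n = 250)
)
ws <- charlson_wscore(sim_data)
ws
```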