Updating to {comorbidity} 1.0.0

The 1.0.0 release of the {comorbidity} package consists of a vast rewrite of the package API and internals, which unfortunately introduces a variety of breaking changes by modifying the behaviour of the user-facing comorbidity() function. This article describes some of the changes, including some examples using simulated data and comparisons with the previous API.

The simulated dataset (with 10 subjects and a total of 250 ICD codes) that we will be using throughout this article can be generated as follows:

library(comorbidity)

set.seed(2837465)

sim_data <- data.frame(
  id = sample(x = seq(10), size = 250, replace = TRUE),
  code = sample_diag(n = 250)
)
sim_data <- sim_data[order(sim_data$id), ]
str(sim_data)
#> 'data.frame':    250 obs. of  2 variables:
#>  $ id  : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ code: chr  "S667" "V596" "R874" "S26" ...

We use ICD-10 only (to simplify this document), but everything applies to ICD-9 as well (with the obvious adjustments).

Comorbidity Mapping and Scoring are Now Distinct Functions

With the previous release of {comorbidity}, the comorbidity mapping and scoring algorithms were applied jointly with a single function call:

comorbidity(x = sim_data, id = "id", code = "code", score = "charlson", icd = "icd10", assign0 = FALSE)
#>    id ami chf pvd cevd dementia copd rheumd pud mld diab diabwc hp rend canc
#> 1   1   0   0   0    0        0    0      0   0   0    0      0  1    0    1
#> 2   2   0   0   0    0        0    0      0   0   0    1      1  0    0    1
#> 3   3   0   0   0    0        0    0      0   1   0    0      0  0    0    0
#> 4   4   0   0   0    1        0    0      0   0   0    0      0  0    0    1
#> 5   5   0   0   0    0        0    0      0   0   0    1      0  0    0    0
#> 6   6   0   0   0    0        0    0      0   0   0    0      0  0    0    0
#> 7   7   0   0   0    0        0    0      0   0   0    0      0  0    1    1
#> 8   8   0   0   0    1        0    0      1   0   0    1      1  0    0    1
#> 9   9   0   0   0    0        0    0      0   0   0    0      0  0    0    1
#> 10 10   0   0   0    0        0    0      0   0   0    0      0  0    0    1
#>    msld metacanc aids score index wscore windex
#> 1     0        0    0     2   1-2      4    3-4
#> 2     0        0    0     3   3-4      5    >=5
#> 3     0        0    0     1   1-2      1    1-2
#> 4     0        0    0     2   1-2      3    3-4
#> 5     0        0    0     1   1-2      1    1-2
#> 6     0        0    0     0     0      0      0
#> 7     0        0    0     2   1-2      4    3-4
#> 8     0        0    0     5   >=5      7    >=5
#> 9     0        0    0     1   1-2      2    1-2
#> 10    0        0    0     1   1-2      2    1-2

Note that, as of {comorbidity} version 1.0.4, the ami condition has been renamed to mi. See #53 on GitHub for more details.

Now, we first need to apply the mapping algorithm:

com <- comorbidity(x = sim_data, id = "id", code = "code", map = "charlson_icd10_quan", assign0 = FALSE)
com
#>    id mi chf pvd cevd dementia cpd rheumd pud mld diab diabwc hp rend canc msld
#> 1   1  0   0   0    0        0   0      0   0   0    0      0  1    0    1    0
#> 2   2  0   0   0    0        0   0      0   0   0    1      1  0    0    1    0
#> 3   3  0   0   0    0        0   0      0   1   0    0      0  0    0    0    0
#> 4   4  0   0   0    1        0   0      0   0   0    0      0  0    0    1    0
#> 5   5  0   0   0    0        0   0      0   0   0    1      0  0    0    0    0
#> 6   6  0   0   0    0        0   0      0   0   0    0      0  0    0    0    0
#> 7   7  0   0   0    0        0   0      0   0   0    0      0  0    1    1    0
#> 8   8  0   0   0    1        0   0      1   0   0    1      1  0    0    1    0
#> 9   9  0   0   0    0        0   0      0   0   0    0      0  0    0    1    0
#> 10 10  0   0   0    0        0   0      0   0   0    0      0  0    0    1    0
#>    metacanc aids
#> 1         0    0
#> 2         0    0
#> 3         0    0
#> 4         0    0
#> 5         0    0
#> 6         0    0
#> 7         0    0
#> 8         0    0
#> 9         0    0
#> 10        0    0

…which yields the same results shown above.

Then, if we need to calculate comorbidity scores, we can use the score() function:

score(com, assign0 = FALSE, weights = NULL)
#>  [1] 2 3 1 2 1 0 2 5 1 1
#> attr(,"map")
#> [1] "charlson_icd10_quan"

This yields unweighted scores (e.g. equivalent to the score column above). If we need weighted scores (e.g. the wscore column above, which assumes the old Charlson score weights from 1987), we can pass the name of a supported scoring algorithm:

score(com, assign0 = FALSE, weights = "charlson")
#>  [1] 4 5 1 3 1 0 4 7 2 2
#> attr(,"map")
#> [1] "charlson_icd10_quan"
#> attr(,"weights")
#> [1] "charlson"

Once again, all the results from the score() function are equivalent to what we obtained using the previously-released version.

Supported Mapping and Scoring Algorithms

The new version includes updated comorbidity mapping and scoring algorithms. Furthermore, it is designed in such a way that should simplify the addition of new scores in the future. The currently supported comorbidity mapping algorithms are described in the following vignette:

vignette("02-comorbidityscores", package = "comorbidity")

Alternatively, a new function is provided to display supported algorithms in the R console:

available_algorithms()
#> Supported comorbidity mapping algorithms:
#>  * charlson_icd9_quan 
#>  * charlson_icd10_quan 
#>  * charlson_icd10_se 
#>  * charlson_icd10_am 
#>  * charlson_icd10_am_ucodes 
#>  * elixhauser_icd9_quan 
#>  * elixhauser_icd10_quan 
#> 
#> Supported scoring weights algorithms:
#>  * For charlson_icd9_quan: charlson, quan 
#>  * For charlson_icd10_quan: charlson, quan 
#>  * For charlson_icd10_se: charlson, quan 
#>  * For charlson_icd10_am: charlson, quan 
#>  * For charlson_icd10_am_ucodes: charlson, quan 
#>  * For elixhauser_icd9_quan: vw, swiss 
#>  * For elixhauser_icd10_quan: vw, swiss

This is picked up auto-magically from the internal data structures, so it should always be up-to-date.

Computational Speed

The internal re-writing of the package API allowed optimising code for speed and efficiency. We managed to estimate, using simulated data, that the main comorbidity mapping function (e.g. comorbidity()) should be approximately twice as fast as version 0.5.3, across a variety of sample sizes:

Plot highlighting the fold improvement in computational speed when running the version 1.0.0 of {comorbidity} compared to version 0.5.3.

This is the main computational bottleneck, as applying the scoring algorithm should be very fast in general.

Reverting to previous release

I understand that this new release might break some workflows, so apologies for that. If you have some feedback, please feel free to e-mail the maintainer of the package or to open an issue on GitHub, the latter being strongly suggested.

Finally, if required, you can revert to the previous release by installing from GitHub:

library(remotes)
remotes::install_github("ellessenne/comorbidity@0.5.3")

Alessandro Gasparini

2024-07-16

Comorbidity Mapping and Scoring are Now Distinct Functions

Supported Mapping and Scoring Algorithms

Computational Speed

Reverting to previous release