Doubly robust estimation of the survival curves in two exposure groups and their difference using the method of Sjolander and Vansteelandt (2017) .
Usage
sjovan_survdiff(
oformula = NULL,
ofunc = "survreg",
oarg = list(),
cformula = NULL,
cfunc = "survreg",
carg = list(),
eformula = NULL,
earg = list(),
method = "DR",
times = NULL,
rel.tol = .Machine$double.eps^0.1,
jacobian.method = "simple",
se.type = "none",
data = NULL,
weights = NULL,
subset = NULL,
R = 50,
parallel = "no",
cl = NULL,
ncpus = NULL
)
Arguments
- oformula
The outcome formula, the left side should be a call to Surv
- ofunc
The model type for the outcome, one of "coxph" or "survreg"
- oarg
Arguments passed to
ofunc
- cformula
The censoring model formula, the left side should be a call to Surv
- cfunc
The model type for the censoring model, either "coxph" or "survreg"
- carg
Arguments passed to
cfunc
- eformula
The exposure model formula, which will be fit by logistic regression in a call to glm
- earg
Arguments passed to the glm for the exposure model
- method
Estimation method, either "IPW" or "DR"
- times
Vector of times at which to estimate the survival curves
- rel.tol
Convergence tolerance
- jacobian.method
Method of computing the jacobian, passed to jacobian
- se.type
Method of calculating standard errors, either "none" for no standard errors, "sandwich", or "boot"
- data
Data frame in which to find the variables in the model formulas
- weights
Vector of case weights
- subset
Logical vector
- R
Number of bootstrap replicates, used if
se.type = "boot"
- parallel
Parallel processing for bootstrap, see boot
- cl
Optional cluster if parallel processing, see boot
- ncpus
Number of cpus to use for parallel processing, see boot
Value
A list with the estimates survival probabilities in each group, their difference, and standard errors, if requested
Details
Following Bai et al. 2013 and Sjölander and Vansteelandt 2017, consider the following estimating equation for \(S_x(t) = p\{T(x) > t\}\): $$\sum_{i = 1}^n \left[S_x(t) - \frac{I_{X_i = x} I_{\tilde{T}_i > t}}{\bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})G_c(t, X_i, \boldsymbol{Z}_i)} - \frac{I_{X_i = x} - \bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})}{\bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})} H(t, \boldsymbol{Z}_i, X = x) - \right. \\ \left. \frac{I_{X_i = x} - \bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})}{\bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})} H(t, \boldsymbol{Z}_i, X = x) \int_0^t\frac{d\, M_c(u, \boldsymbol{Z}_i, X_i, \tilde{T}_i, \Delta_i)}{G_c(u, X_i, \boldsymbol{Z}_i)H(u, \boldsymbol{Z}_i, X_i)}\right] = 0, $$ where \(\bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha}) = g(\boldsymbol{Z}_i, \boldsymbol{\alpha})^x (1 - g(\boldsymbol{Z}_i, \boldsymbol{\alpha}))^{(1-x)}\), \(H(t, \boldsymbol{Z}, X)\) is a model for \(p\{T > t | \boldsymbol{Z}, X\}\), and \(M_c(t, \boldsymbol{Z}, X, \tilde{T}, \Delta)\) is the martingale increment for the censoring distribution. The above is an unbiased estimating equation for \(S_x(t)\) if either \(H(t, \boldsymbol{Z}, X)\) is correctly specified for both censoring and confounding, i.e., is a correctly specified model for \(p\{T(x) > t | \boldsymbol{Z}, X\}\) or both \(G_c(u, X, \boldsymbol{Z})\) and \(g(\boldsymbol{Z}, \boldsymbol{\alpha})\) are correctly specified for censoring and confounding, respectively. To obtain estimates of the difference in survival probabilities, one must specify models for the unknown functions \(g\), \(G_c\), and \(H\), get estimates of those, plug them into the estimating equations, and solve for \(S_x(t)\) under \(x \in \{0, 1\}\). In this package, one can use semiparametric Cox models or parametric survival models for the outcome and the censoring distributions, and logistic regression for the propensity score model. Bai et al. 2013 provide an expression for a variance estimator that accounts for the uncertainty due to the estimation of the propensity score \(g\) and the censoring distribution \(G_c\).
References
Xiaofei Bai, Anastasios A Tsiatis, and Sean~M O'Brien. "Doubly-robust estimators of treatment-specific survival distributions in observational studies with stratified sampling." Biometrics, 69(4):830--839, 2013.
Sjölander, Arvid, and Stijn Vansteelandt. "Doubly robust estimation of attributable fractions in survival analysis." Statistical methods in medical research 26(2):948-969, 2017.
Examples
df <- rotterdam
df$time <- pmin(df$rtime, df$dtime) / 365.25
df$status <- ifelse(df$recur == 1 | df$death == 1, 1, 0)
df$censor <- 1 - df$status
drFit <-
sjovan_survdiff(
oformula = Surv(time, status) ~ chemo + year + age + meno +
size + factor(grade) + nodes + pgr + er + hormon,
ofunc = "survreg",
cformula = Surv(time, censor) ~ chemo + year + age,
cfunc = "survreg",
eformula = chemo ~ year + age + meno + size +
factor(grade) + nodes + pgr + er + hormon,
method = "DR",
times = c(2.5, 5, 7.5),
se.type = "sandwich",
data = df
)
drFit
#> $est.S1
#> [1] 0.7661741 0.6032550 0.5224925
#>
#> $se.S1
#> [1] 0.03440273 0.02906199 0.14520126
#>
#> $est.S0
#> [1] 0.7190130 0.5515392 0.4670844
#>
#> $se.S0
#> [1] 0.07008758 0.07427899 0.10500988
#>
#> $est.diff
#> [1] 0.04716109 0.05171581 0.05540801
#>
#> $se.diff
#> [1] 0.06863352 0.07611256 0.14153032
#>