Skip to contents

Doubly robust estimation of the survival curves in two exposure groups and their difference using the method of Sjolander and Vansteelandt (2017) .

Usage

sjovan_survdiff(
  oformula = NULL,
  ofunc = "survreg",
  oarg = list(),
  cformula = NULL,
  cfunc = "survreg",
  carg = list(),
  eformula = NULL,
  earg = list(),
  method = "DR",
  times = NULL,
  rel.tol = .Machine$double.eps^0.1,
  jacobian.method = "simple",
  se.type = "none",
  data = NULL,
  weights = NULL,
  subset = NULL,
  R = 50,
  parallel = "no",
  cl = NULL,
  ncpus = NULL
)

Arguments

oformula

The outcome formula, the left side should be a call to Surv

ofunc

The model type for the outcome, one of "coxph" or "survreg"

oarg

Arguments passed to ofunc

cformula

The censoring model formula, the left side should be a call to Surv

cfunc

The model type for the censoring model, either "coxph" or "survreg"

carg

Arguments passed to cfunc

eformula

The exposure model formula, which will be fit by logistic regression in a call to glm

earg

Arguments passed to the glm for the exposure model

method

Estimation method, either "IPW" or "DR"

times

Vector of times at which to estimate the survival curves

rel.tol

Convergence tolerance

jacobian.method

Method of computing the jacobian, passed to jacobian

se.type

Method of calculating standard errors, either "none" for no standard errors, "sandwich", or "boot"

data

Data frame in which to find the variables in the model formulas

weights

Vector of case weights

subset

Logical vector

R

Number of bootstrap replicates, used if se.type = "boot"

parallel

Parallel processing for bootstrap, see boot

cl

Optional cluster if parallel processing, see boot

ncpus

Number of cpus to use for parallel processing, see boot

Value

A list with the estimates survival probabilities in each group, their difference, and standard errors, if requested

Details

Following Bai et al. 2013 and Sjölander and Vansteelandt 2017, consider the following estimating equation for \(S_x(t) = p\{T(x) > t\}\): $$\sum_{i = 1}^n \left[S_x(t) - \frac{I_{X_i = x} I_{\tilde{T}_i > t}}{\bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})G_c(t, X_i, \boldsymbol{Z}_i)} - \frac{I_{X_i = x} - \bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})}{\bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})} H(t, \boldsymbol{Z}_i, X = x) - \right. \\ \left. \frac{I_{X_i = x} - \bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})}{\bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha})} H(t, \boldsymbol{Z}_i, X = x) \int_0^t\frac{d\, M_c(u, \boldsymbol{Z}_i, X_i, \tilde{T}_i, \Delta_i)}{G_c(u, X_i, \boldsymbol{Z}_i)H(u, \boldsymbol{Z}_i, X_i)}\right] = 0, $$ where \(\bar{g}(\boldsymbol{Z}_i, \boldsymbol{\alpha}) = g(\boldsymbol{Z}_i, \boldsymbol{\alpha})^x (1 - g(\boldsymbol{Z}_i, \boldsymbol{\alpha}))^{(1-x)}\), \(H(t, \boldsymbol{Z}, X)\) is a model for \(p\{T > t | \boldsymbol{Z}, X\}\), and \(M_c(t, \boldsymbol{Z}, X, \tilde{T}, \Delta)\) is the martingale increment for the censoring distribution. The above is an unbiased estimating equation for \(S_x(t)\) if either \(H(t, \boldsymbol{Z}, X)\) is correctly specified for both censoring and confounding, i.e., is a correctly specified model for \(p\{T(x) > t | \boldsymbol{Z}, X\}\) or both \(G_c(u, X, \boldsymbol{Z})\) and \(g(\boldsymbol{Z}, \boldsymbol{\alpha})\) are correctly specified for censoring and confounding, respectively. To obtain estimates of the difference in survival probabilities, one must specify models for the unknown functions \(g\), \(G_c\), and \(H\), get estimates of those, plug them into the estimating equations, and solve for \(S_x(t)\) under \(x \in \{0, 1\}\). In this package, one can use semiparametric Cox models or parametric survival models for the outcome and the censoring distributions, and logistic regression for the propensity score model. Bai et al. 2013 provide an expression for a variance estimator that accounts for the uncertainty due to the estimation of the propensity score \(g\) and the censoring distribution \(G_c\).

References

Xiaofei Bai, Anastasios A Tsiatis, and Sean~M O'Brien. "Doubly-robust estimators of treatment-specific survival distributions in observational studies with stratified sampling." Biometrics, 69(4):830--839, 2013.

Sjölander, Arvid, and Stijn Vansteelandt. "Doubly robust estimation of attributable fractions in survival analysis." Statistical methods in medical research 26(2):948-969, 2017.

Examples

df <- rotterdam
df$time <- pmin(df$rtime, df$dtime) / 365.25
df$status <- ifelse(df$recur == 1 | df$death == 1, 1, 0)
df$censor <- 1 - df$status
drFit <-
  sjovan_survdiff(
    oformula = Surv(time, status) ~ chemo + year + age + meno +
      size + factor(grade) + nodes + pgr + er + hormon,
    ofunc = "survreg",
    cformula = Surv(time, censor) ~ chemo + year + age,
    cfunc = "survreg",
    eformula = chemo ~ year + age + meno + size +
      factor(grade) + nodes + pgr + er + hormon,
    method = "DR",
    times = c(2.5, 5, 7.5),
    se.type = "sandwich",
    data = df
  )
drFit
#> $est.S1
#> [1] 0.7661741 0.6032550 0.5224925
#> 
#> $se.S1
#> [1] 0.03440273 0.02906199 0.14520126
#> 
#> $est.S0
#> [1] 0.7190130 0.5515392 0.4670844
#> 
#> $se.S0
#> [1] 0.07008758 0.07427899 0.10500988
#> 
#> $est.diff
#> [1] 0.04716109 0.05171581 0.05540801
#> 
#> $se.diff
#> [1] 0.06863352 0.07611256 0.14153032
#>