Michael Sachs
Under the same setup as before, the additive hazard model is \[ \alpha(t | X_i) = \beta_0(t) + \beta_1(t) x_{i1}(t) + \cdots + \beta_p(t) x_{ip}(t), \] where the \(\beta_q(t)\) are regression functions and we define the cumulative regression functions \[ B_q(t) = \int_0^t\beta_q(s) \, ds. \] Using the fact that \(\lambda_i(t) = Y_i(t)\alpha_i(t)\) and the decomposition \[ dN_i(t) = \lambda_i(t)dt + dM_i(t), \] we now have \[ dN_i(t) = Y_i(t)B_0(t) + \sum_{j=1}^pY_i(t)x_{ij}(t)dB_{j}(t) + dM_i(t). \] This resembles an ordinary linear regression model.
Letting \(\mathbf{N}(t), \mathbf{B}(t)\) and \(\mathbf{M}(t)\) denote the column vectors you get by stacking the \(n\) observations, and the \(n \times (p+1)\) design matrix \[ \mathbf{X}(t) = \left(\begin{array}{c} Y_1(t) & Y_1(t) x_{11}(t) & \cdots & Y_1(t) x_{1p}(t) \\ \vdots & & & \\ Y_n(t) & Y_n(t) x_{n1}(t) & \cdots & Y_n(t) x_{np}(t) \end{array}\right). \] Then the model can be written \[ d\mathbf{N}(t) = \mathbf{X}(t) \, d\mathbf{B}(t) + d\mathbf{M}(t). \] If \(\mathbf{X}(t)\) is invertable, then \[ d\hat{\mathbf{B}}(t) = (\mathbf{X}(t)^\top\mathbf{X}(t))^{-1}\mathbf{X}(t)^\top \, d\mathbf{N}(t) = \mathbf{X}^{-}(t)\, d\mathbf{N}(t) \] and if we let \(J(s)\) be the indicator that \(\mathbf{X}(s)\) is invertable, then the cumulative coefficient estimate is \[ \hat{\mathbf{B}}(t) = \int_0^t J(u)\mathbf{X}^-(t)\, d\mathbf{N}(u). \]
Let \(\mathbf{B}^*(t) = \int_0^tJ(u)\,d\mathbf{B}(u)\), then we can study the bias term \[ \hat{\mathbf{B}}(t) - \mathbf{B}^*(t) = \int_0^t J(u)\mathbf{X}^-(u)\, d\mathbf{M}(u), \] arguing as before, that asymptotically, \(\mathbf{B}^*(t)\) is close enough to \(\mathbf{B}(t)\).
This is a stochastic integral with respect to a vector valued martingale. All of the properties we studied for univariate martingales have vector valued versions, including a multvariate martingale central limit theorem. One key distinction is the form of the predictable and optional variation processes:
\[ \langle \hat{\mathbf{B}} - \mathbf{B}^* \rangle (t) = \int_0^t J(u) \mathbf{X}^-(u) \mbox{diag}(\mathbf{\lambda}(u)\,du)\mathbf{X}^-(u)^\top, \] and \[ [\hat{\mathbf{B}} - \mathbf{B}^*] (t) = \int_0^t J(u) \mathbf{X}^-(u) \mbox{diag}(d\mathbf{N}(u))\mathbf{X}^-(u)^\top, \] where \(\mathbf{\lambda}(u)\) is the vector of intensity processes, and \(\mbox{diag}(v)\) is the diagonal matrix with \(v\) as the diagonal.
If \[ n \langle \hat{\mathbf{B}} - \mathbf{B}^* \rangle (t) \rightarrow \Sigma(t) \] then the martingale central limit theorem says that \(\sqrt{n}(\hat{\mathbf{B}} - \mathbf{B}^*)\) converges to a multivariate Gaussian martingale with covariance function \(\Sigma(t)\). \(\Sigma(t)\) can be estimated by
\[ \hat{\Sigma}(t) = [\hat{\mathbf{B}} - \mathbf{B}^*] (t) = \sum_{T_j \leq t}J(T_j) \mathbf{X}^-(T_j) \mbox{diag}(\Delta \mathbf{N}(T_j))\mathbf{X}^-(T_j)^\top, \] or, plugging in the estimates to the expression for the intensity function \[ \tilde{\Sigma}(t) = \widehat{\langle \hat{\mathbf{B}} - \mathbf{B}^*\rangle (t)} = \sum_{T_j \leq t}J(T_j) \mathbf{X}^-(T_j) \mbox{diag}(\mathbf{X}(T_j)\Delta \hat{\mathbf{B}}(T_j))\mathbf{X}^-(T_j)^\top. \]
The cumulative regression functions are usually plotted for each covariate, along with confidence intervals \[ \hat{B}_q(t) \pm z_{1 - \alpha/2}\sqrt{\Sigma_{(q,q)}(t)}. \] > One should focus on the slope of the estimated curve and judge how this changes over time.
Alternatively, one can use kernel smoothing to get a smoothed estimate of the derivative of the cumulative functions, i.e., a plot of \(\hat{\beta}_q(t)\).
Relative survival is interpreted in the context of an “excess hazard” model: \[ \alpha(t | X) = \alpha^*(t | X) + v(t, Z), \] where \(\alpha^*\) is called the “expected” hazard and \(v\) is the excess hazard due to the presence of some disease (usually cancer). \(\alpha^*\) is typically estimated using external data (e.g., life tables for a population reported by a national register).
In terms of survival, we have \[ S(t | X) = \exp(-\alpha(t|X)) = S^*(t | X) \exp(-v(t, Z)). \]
In an Aalen model, if the \(j\)th variable is dichotomous then \[ \exp(-\hat{B}_j(t)) \] can be interpreted as the adjusted relative survival function of covariate \(j\) “comparing the survival of individuals that are similar except for differences in one covariate”.
We may want to test the null hypothesis: \[ H_0: B_q(t) = 0, \mbox{ all } t \in [0, t_0]. \] Under the null, the test statistic \[ Z_q(t_0) = \int_0^{t_0}L_q(t) \, d\hat{B}_q(t) = \sum_{T_j \leq t_0} L_q(T_j)\Delta \hat{B}_q(T_j) \] is a stochastic integral for a predictable \(L_q(t)\) which is a weight process that equals 0 whenever \(J(t) = 0\). Hence the predictable variation \[ \langle Z_q \rangle(t_0) = \int_0^{t_0}L^2_q(t) \, d\langle \hat{B}_q \rangle(t). \] Replacing \(\langle \hat{B}_q \rangle(t)\) with the estimated version, we can standardize the test statistic and compare it to a standard normal.
The weight process \(L\) can be based on the diagonal of the matrix \[ K(t) = \left\{\mbox{diag}((X(t)^\top X(t))^{-1})\right\}^{-1}. \]
We have the Cox model, \[ \lambda_i(t) = Y_i(t)\lambda_0(t) \exp(X_i(t)^\top\beta(t)) \] and the Aalen additive model \[ \lambda_i(t) = Y_i(t)X_i^\top(t)\beta(t). \] Various attempts have been made to combine the two. Scheike and Zhang (2002) summarize those attempts and propose a new one \[ \lambda_i(t) = Y_i(t) X_i^\top(t)\alpha(t)\exp(Z_i(t)^\top\beta(t)), \] where the covariates of interest are in \(X_i\), while the \(Z_i\) are more flexibly modeled.
They derive the likelihood: \[ \prod_i \prod_{t \leq \tau} (Y_i(t)(X_i^\top(t)\,dA(t))\exp(Z_i(t)^\top\beta(t))^{dN_it(t)})\times \] \[ \exp\left(-\int_0^\tau Y_i(t)(X_i(t)^\top \alpha(t))\exp(Z_i(t)^\top\beta)\, ds\right) \] where \(A(t) = \int_0^t \alpha(s) ds\).
It is of interest to estimate both \(dA(t)\) (nonparametrically) and \(\beta\).
Using the timereg
package
library(timereg)
data(sTRACE)
# Fits stratified Cox model
out<-cox.aalen(Surv(time,status==9)~-1+factor(vf)+ prop(age)+prop(sex)+
prop(chf)+prop(diabetes),data=sTRACE,max.time=7,n.sim=100)
summary(out)
## Cox-Aalen Model
##
## Test for Aalen terms
## Test not computed, sim=0
##
## Proportional Cox terms :
## Coef. SE Robust SE D2log(L)^-1 z P-val lower2.5%
## prop(age) 0.0667 0.0075 0.00702 0.00724 9.50 0.00e+00 0.0520
## prop(sex) 0.3760 0.1400 0.13800 0.13700 2.74 6.22e-03 0.1020
## prop(chf) 0.7310 0.1420 0.13900 0.14200 5.25 1.53e-07 0.4530
## prop(diabetes) 0.4000 0.1960 0.18500 0.19400 2.17 3.01e-02 0.0158
## upper97.5%
## prop(age) 0.0814
## prop(sex) 0.6500
## prop(chf) 1.0100
## prop(diabetes) 0.7840
## Test of Proportionality
## sup| hat U(t) | p-value H_0
## prop(age) 147.00 0.26
## prop(sex) 4.42 0.80
## prop(chf) 7.21 0.20
## prop(diabetes) 3.76 0.56
## Cox-Aalen model
out<-cox.aalen(Surv(time,status==9)~prop(age)+prop(sex)+
vf+chf+prop(diabetes),data=sTRACE,max.time=7,n.sim=100)
summary(out)
## Cox-Aalen Model
##
## Test for Aalen terms
## Test not computed, sim=0
##
## Proportional Cox terms :
## Coef. SE Robust SE D2log(L)^-1 z P-val lower2.5%
## prop(age) 0.0653 0.00742 0.00693 0.00722 9.42 0.00000 0.0508
## prop(sex) 0.3620 0.13900 0.13600 0.13700 2.66 0.00774 0.0896
## prop(diabetes) 0.3950 0.19600 0.18500 0.19400 2.14 0.03240 0.0108
## upper97.5%
## prop(age) 0.0798
## prop(sex) 0.6340
## prop(diabetes) 0.7790
## Test of Proportionality
## sup| hat U(t) | p-value H_0
## prop(age) 123.00 0.41
## prop(sex) 4.16 0.82
## prop(diabetes) 3.80 0.55
Scheike, Zhang, and Gerds (2008) propose a model
\[
P(T_i \leq t | X_i) = h\{X_i^\top\eta(t)\}
\] where \(h\) is an inverse link function, and the coefficients may be time varying.
Under the assumption \[ E(\delta_i | X_i = x, T_i = t) = P(C_i > t | X_i = x) := G(t | x), \] they propose
This is very similar to inverse probability weighting for missing data.
The model is consistent no matter what the weight function is, but they are now working on deriving the efficient score for this model.
Also related is the pseudo-observation approach, which we will discuss in the special topics week, but that approach involves using all observations in the estimating equations, not just the uncensored ones.
From the R package mets
## Warning: package 'mets' was built under R version 4.1.1
## Loading required package: lava
## mets version 1.2.9
data(bmt)
# logistic regresion with IPCW binomial regression
out <- binreg(Event(time,cause)~tcell+platelet,bmt,time=50)
summary(out)
##
## n events
## 408 160
##
## 408 clusters
## coeffients:
## Estimate Std.Err 2.5% 97.5% P-value
## (Intercept) -0.178043 0.127210 -0.427370 0.071285 0.1616
## tcell -0.426984 0.348033 -1.109116 0.255147 0.2199
## platelet -0.441698 0.242422 -0.916837 0.033440 0.0685
##
## exp(coeffients):
## Estimate 2.5% 97.5%
## (Intercept) 0.83691 0.65222 1.0739
## tcell 0.65247 0.32985 1.2907
## platelet 0.64294 0.39978 1.0340
## pred se lower upper
## 1 0.3498404 0.04859196 0.2546001 0.4450806
## 2 0.2598545 0.07039433 0.1218816 0.3978273
For more of this sort of research, consider the book
Dynamic Regression Models for Survival Data (2006) by Torben Martinussen and Thomas H. Scheike.