Get regression standardized estimates from a glm
Usage
standardize_glm(
formula,
data,
values,
clusterid,
matched_density_cases,
matched_density_controls,
matching_variable,
p_population,
case_control = FALSE,
ci_level = 0.95,
ci_type = "plain",
contrasts = NULL,
family = "gaussian",
reference = NULL,
transforms = NULL
)
Arguments
- formula
The formula which is used to fit the model for the outcome.
- data
The data.
- values
A named list or data.frame specifying the variables and values at which marginal means of the outcome will be estimated.
- clusterid
An optional string containing the name of a cluster identification variable when data are clustered.
- matched_density_cases
A function of the matching variable. The probability (or density) of the matched variable among the cases.
- matched_density_controls
A function of the matching variable. The probability (or density) of the matched variable among the controls.
- matching_variable
The matching variable extracted from the data set.
- p_population
Specifies the incidence in the population when
case_control=TRUE
.- case_control
Whether the data comes from a case-control study.
- ci_level
Coverage probability of confidence intervals.
- ci_type
A string, indicating the type of confidence intervals. Either "plain", which gives untransformed intervals, or "log", which gives log-transformed intervals.
- contrasts
A vector of contrasts in the following format: If set to
"difference"
or"ratio"
, then \(\psi(x)-\psi(x_0)\) or \(\psi(x) / \psi(x_0)\) are constructed, where \(x_0\) is a reference level specified by thereference
argument. Has to beNULL
if no references are specified.- family
The family argument which is used to fit the glm model for the outcome.
- reference
A vector of reference levels in the following format: If
contrasts
is notNULL
, the desired reference level(s). This must be a vector or list the same length ascontrasts
, and if not named, it is assumed that the order is as specified in contrasts.- transforms
A vector of transforms in the following format: If set to
"log"
,"logit"
, or"odds"
, the standardized mean \(\theta(x)\) is transformed into \(\psi(x)=\log\{\theta(x)\}\), \(\psi(x)=\log[\theta(x)/\{1-\theta(x)\}]\), or \(\psi(x)=\theta(x)/\{1-\theta(x)\}\), respectively. If the vector isNULL
, then \(\psi(x)=\theta(x)\).
Value
An object of class std_glm
.
This is basically a list with components estimates and covariance estimates in res
.
Results for transformations, contrasts, references are stored in res_contrasts
.
Obtain numeric results in a data frame with the tidy function.
Details
standardize_glm
performs regression standardization
in generalized linear models,
at specified values of the exposure, over the sample covariate distribution.
Let \(Y\), \(X\), and \(Z\) be the outcome, the exposure, and a
vector of covariates, respectively.
standardize_glm
uses a fitted generalized linear
model to estimate the standardized
mean \(\theta(x)=E\{E(Y|X=x,Z)\}\),
where \(x\) is a specific value of \(X\),
and the outer expectation is over the marginal distribution of \(Z\).
References
Rothman K.J., Greenland S., Lash T.L. (2008). Modern Epidemiology, 3rd edition. Lippincott, Williams & Wilkins.
Sjölander A. (2016). Regression standardization with the R-package stdReg. European Journal of Epidemiology 31(6), 563-574.
Sjölander A. (2016). Estimation of causal effect measures with the R-package stdReg. European Journal of Epidemiology 33(9), 847-858.
Examples
# basic example
# needs to correctly specify the outcome model and no unmeasered confounders
# (+ standard causal assunmptions)
set.seed(6)
n <- 100
Z <- rnorm(n)
X <- rnorm(n, mean = Z)
Y <- rbinom(n, 1, prob = (1 + exp(X + Z))^(-1))
dd <- data.frame(Z, X, Y)
x <- standardize_glm(
formula = Y ~ X * Z,
family = "binomial",
data = dd,
values = list(X = 0:1),
contrasts = c("difference", "ratio"),
reference = 0
)
x
#> Outcome formula: Y ~ X * Z
#> <environment: 0x63e9486c66c0>
#> Outcome family: quasibinomial
#> Outcome link function: logit
#> Exposure: X
#>
#> Tables:
#> X Estimate Std.Error lower.0.95 upper.0.95
#> 1 0 0.519 0.0615 0.399 0.640
#> 2 1 0.391 0.0882 0.218 0.563
#>
#> Reference level: X = 0
#> Contrast: difference
#> X Estimate Std.Error lower.0.95 upper.0.95
#> 1 0 0.000 0.0000 0.000 0.00000
#> 2 1 -0.129 0.0638 -0.254 -0.00353
#>
#> Reference level: X = 0
#> Contrast: ratio
#> X Estimate Std.Error lower.0.95 upper.0.95
#> 1 0 1.000 0.000 1.000 1.000
#> 2 1 0.752 0.126 0.505 0.999
#>
# different transformations of causal effects
# example from Sjölander (2016) with case-control data
# here the matching variable needs to be passed as an argument
singapore <- AF::singapore
#> Registered S3 methods overwritten by 'stdReg':
#> method from
#> summary.parfrailty stdReg2
#> print.summary.parfrailty stdReg2
Mi <- singapore$Age
m <- mean(Mi)
s <- sd(Mi)
d <- 5
standardize_glm(
formula = Oesophagealcancer ~ (Everhotbev + Age + Dial + Samsu + Cigs)^2,
family = binomial, data = singapore,
values = list(Everhotbev = 0:1), clusterid = "Set",
case_control = TRUE,
matched_density_cases = function(x) dnorm(x, m, s),
matched_density_controls = function(x) dnorm(x, m - d, s),
matching_variable = Mi,
p_population = 19.3 / 100000
)
#> Warning: case_control = TRUE may not give reasonable results for the variance with clustering
#> Outcome formula: Oesophagealcancer ~ (Everhotbev + Age + Dial + Samsu + Cigs)^2
#> <environment: 0x63e9486c66c0>
#> Outcome family: quasibinomial
#> Outcome link function: logit
#> Exposure: Everhotbev
#>
#> Tables:
#> Everhotbev Estimate Std.Error lower.0.95 upper.0.95
#> 1 0 0.000128 1.69e-05 9.48e-05 0.000161
#> 2 1 0.000570 2.19e-04 1.41e-04 0.000999
#>
# multiple exposures
set.seed(7)
n <- 100
Z <- rnorm(n)
X1 <- rnorm(n, mean = Z)
X2 <- rnorm(n)
Y <- rbinom(n, 1, prob = (1 + exp(X1 + X2 + Z))^(-1))
dd <- data.frame(Z, X1, X2, Y)
x <- standardize_glm(
formula = Y ~ X1 + X2 + Z,
family = "binomial",
data = dd, values = list(X1 = 0:1, X2 = 0:1),
contrasts = c("difference", "ratio"),
reference = c(X1 = 0, X2 = 0)
)
x
#> Outcome formula: Y ~ X1 + X2 + Z
#> <environment: 0x63e9486c66c0>
#> Outcome family: quasibinomial
#> Outcome link function: logit
#> Exposure: X1, X2
#>
#> Tables:
#> X1 X2 Estimate Std.Error lower.0.95 upper.0.95
#> 1 0 0 0.419 0.0576 0.3058 0.532
#> 2 1 0 0.252 0.0740 0.1074 0.397
#> 3 0 1 0.273 0.0690 0.1382 0.409
#> 4 1 1 0.146 0.0631 0.0221 0.269
#>
#> Reference level: X1 = 0; X2 = 0
#> Contrast: difference
#> X1 X2 Estimate Std.Error lower.0.95 upper.0.95
#> 1 0 0 0.000 0.0000 0.000 0.0000
#> 2 1 0 -0.166 0.0590 -0.282 -0.0509
#> 3 0 1 -0.145 0.0458 -0.235 -0.0557
#> 4 1 1 -0.273 0.0581 -0.387 -0.1593
#>
#> Reference level: X1 = 0; X2 = 0
#> Contrast: ratio
#> X1 X2 Estimate Std.Error lower.0.95 upper.0.95
#> 1 0 0 1.000 0.000 1.000 1.000
#> 2 1 0 0.603 0.141 0.327 0.878
#> 3 0 1 0.653 0.114 0.430 0.876
#> 4 1 1 0.348 0.131 0.091 0.605
#>
tidy(x)
#> X1 X2 Estimate Std.Error lower.0.95 upper.0.95 contrast transform
#> 1 0 0 0.4187976 0.05763610 0.30583295 0.53176230 none identity
#> 2 1 0 0.2523918 0.07395458 0.10744346 0.39734008 none identity
#> 3 0 1 0.2733894 0.06899362 0.13816443 0.40861445 none identity
#> 4 1 1 0.1456809 0.06306346 0.02207877 0.26928300 none identity
#> 5 0 0 0.0000000 0.00000000 0.00000000 0.00000000 difference identity
#> 6 1 0 -0.1664059 0.05895745 -0.28196033 -0.05085138 difference identity
#> 7 0 1 -0.1454082 0.04579033 -0.23515558 -0.05566080 difference identity
#> 8 1 1 -0.2731167 0.05806770 -0.38692735 -0.15930614 difference identity
#> 9 0 0 1.0000000 0.00000000 1.00000000 1.00000000 ratio identity
#> 10 1 0 0.6026581 0.14070801 0.32687543 0.87844071 ratio identity
#> 11 0 1 0.6527961 0.11372894 0.42989144 0.87570068 ratio identity
#> 12 1 1 0.3478551 0.13106574 0.09097094 0.60473922 ratio identity
# continuous exposure
set.seed(2)
n <- 100
Z <- rnorm(n)
X <- rnorm(n, mean = Z)
Y <- rnorm(n, mean = X + Z + 0.1 * X^2)
dd <- data.frame(Z, X, Y)
x <- standardize_glm(
formula = Y ~ X * Z,
family = "gaussian",
data = dd,
values = list(X = seq(-1, 1, 0.1))
)
# plot standardized mean as a function of x
plot(x)
# plot standardized mean - standardized mean at x = 0 as a function of x
plot(x, contrast = "difference", reference = 0)