Quantile g-computation (qgcomp) is a special case of g-computation used for estimating joint exposure-response curves for a set of continuous exposures. The base package qgcomp allows one to estimate conditional or marginal joint exposure-response curves. Because this approach was developed within the field of “exposure mixtures,” the set of exposures of interest is referred to here as “the mixture.” qgcompint builds on qgcomp by incorporating statistical interaction (product terms) between binary, categorical, or continuous covariates and the mixture.
Say we have an outcome Y, some exposures 𝕏, a “modifier” (a covariate for which we wish to assess statistical interaction with 𝕏) denoted by M, and possibly some other covariates (e.g. potential confounders) denoted by ℤ.
The basic model of quantile g-computation is a joint marginal structural model given by
𝔼(Y^Xq | M, Z, ψ, η) = g(ψ0 + ψ1Sq + ψ2M + ψ3M × Sq + ηZ)
where g(⋅) is a link function in a generalized linear model (e.g. the inverse logit function in the case of a logistic model for the probability that Y = 1), ψ0 is the model intercept, η is a set of model coefficients for the covariates, and Sq is an “index” that represents a joint value of the exposures. The joint exposure has a “main effect” at the referent value of M given by ψ1, ψ2 represents the association (or set of associations for categorical M) between the modifier and the outcome, and ψ3 is a product term (or set of product terms for categorical M) that represents the deviation of the exposure response for Sq from the main effect for each one-unit increase in M. The magnitude of ψ3 can be used to estimate the extent of statistical interaction on the model scale, sometimes referred to as effect measure modification.
Quantile g-computation (by default) transforms all exposures X into Xq, which are “scores” taking on discrete values 0, 1, 2, etc. representing a categorical “bin” of exposure. By default, there are four bins with evenly spaced quantile cutpoints for each exposure, so Xq = 0 means that X was below the observed 25th percentile for that exposure. The index Sq represents all exposures being set to the same value (again, by default, discrete values 0, 1, 2, 3). Thus, the parameter ψ1 quantifies the expected change in the outcome, given a one quantile increase in all exposures simultaneously, possibly adjusted for Z.
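As a concrete illustration of this quantization step, here is a rough sketch using base R (not the package’s internal code) of how a four-level quantile score could be constructed for a single simulated exposure:
# sketch: manual construction of a 4-level quantile score for one exposure
set.seed(1)
x <- rnorm(100)
cutpts <- quantile(x, probs = seq(0, 1, length.out = 5))  # quartile cutpoints
xq <- cut(x, breaks = cutpts, labels = FALSE, include.lowest = TRUE) - 1
table(xq)  # scores 0, 1, 2, 3 in roughly equal proportions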
There are nuances to this particular model form that are available in
the qgcompint
package which will be explored below. There
exists one special case of quantile g-computation that leads to fast
fitting: linear/additive exposure effects. Here we simulate
“pre-quantized” data where the exposures X1, X2, ..., X7
can only take on values of 0, 1, 2, 3 in equal proportions. The model
underlying the outcomes is given by the linear regression:
𝔼(Y | X, M, β, ψ, η) = β0 + β1X1 + ... + β7X7 + ψ2M + η1X1 × M + ... + η7X7 × M
with the true values of the coefficients given by:
## value
## psi_0 0.0
## beta_1 0.8
## beta_2 0.6
## beta_3 0.3
## beta_4 -0.3
## beta_5 -0.3
## beta_6 -0.3
## beta_7 0.0
## psi_2 0.0
## eta_1 1.0
## eta_2 0.0
## eta_3 0.0
## eta_4 0.0
## eta_5 0.2
## eta_6 0.2
## eta_7 0.2
In this example X1 is positively correlated with X2–X4 (ρ = 0.8, 0.6, 0.3) and negatively correlated with X5–X7 (ρ = −0.3, −0.3, −0.3). In this setting, the parameter ψ1 will equal the sum of the β1–β7 coefficients (0.8), ψ2 is given directly by the data generation model (0.0), and ψ3 will equal the sum of the η coefficients (1.6).
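A quick arithmetic check of these sums, using the coefficient values listed above:
# psi_1 equals the sum of the beta coefficients; psi_3 equals the sum of the etas
sum(c(0.8, 0.6, 0.3, -0.3, -0.3, -0.3, 0.0))  # psi_1 = 0.8
sum(c(1.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.2))     # psi_3 = 1.6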
Data to fit this model can be simulated with the simdata_quantized_emm function in the qgcompint package. Here, we simulate data using a binary modifier and inspect the correlation matrix to see that the estimated correlation matrix is approximately the same as that of the data generation mechanism. These will converge in large sample sizes, but for a sample size of 300, the estimated coefficients will differ from those we simulate under due to random variation (set the sample size to 10000 in this example to confirm).
library(qgcompint)
set.seed(42)
dat1 <- simdata_quantized_emm(
outcometype="continuous",
# sample size
n = 300,
# correlation between x1 and x2, x3, ...
corr=c(0.8, 0.6, 0.3, -0.3, -0.3, -0.3),
# model intercept
b0=0,
# linear model coefficients for x1, x2, ... at referent level of interacting variable
mainterms=c(0.3, -0.1, 0.1, 0.0, 0.3, 0.1, 0.1),
# linear model coefficients for product terms between x1, x2, ... and interacting variable
prodterms = c(1.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.2),
# type of interacting variable
ztype = "binary",
# number of levels of exposure
q = 4,
# residual variance of y
yscale = 2.0
)
names(dat1)[which(names(dat1)=="z")] = "M"
## data
head(dat1)
## M x1 x2 x3 x4 x5 x6 x7 y
## 1 1 0 1 0 0 2 3 3 3.231033
## 2 1 0 3 0 0 3 2 1 1.685041
## 3 0 1 1 1 0 2 1 2 2.459104
## 4 1 2 2 0 2 3 1 1 4.750329
## 5 1 1 1 3 1 3 1 3 2.756988
## 6 1 1 0 1 1 3 0 2 4.711521
##
## 0 1
## 151 149
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -4.9302 0.7028 2.4380 2.5628 4.2134 10.0813
## x1 x2 x3 x4 x5 x6
## x1 1.0000000 0.8240000 0.6213333 0.33866667 -0.34666667 -0.28266667
## x2 0.8240000 1.0000000 0.4800000 0.33600000 -0.32533333 -0.23200000
## x3 0.6213333 0.4800000 1.0000000 0.26133333 -0.20800000 -0.23733333
## x4 0.3386667 0.3360000 0.2613333 1.00000000 -0.12800000 -0.03200000
## x5 -0.3466667 -0.3253333 -0.2080000 -0.12800000 1.00000000 0.09066667
## x6 -0.2826667 -0.2320000 -0.2373333 -0.03200000 0.09066667 1.00000000
## x7 -0.3653333 -0.3626667 -0.2346667 -0.04533333 0.16800000 0.12000000
## x7
## x1 -0.36533333
## x2 -0.36266667
## x3 -0.23466667
## x4 -0.04533333
## x5 0.16800000
## x6 0.12000000
## x7 1.00000000
Here we see that qgcomp (via the function
qgcomp.emm.glm.noboot
) estimates a ψ1 fairly close to 0.8
(estimate = 0.7) (again, as we increase sample size, the estimated value
will be expected to become increasingly close to the true value). The
product term ‘M:mixture’ is the ψ3 parameter noted above,
which is also fairly close to the true value of 1.6 (estimate =
1.7).
For binary modifiers, qgcomp.emm.glm.noboot
will also
estimate the joint effect of the mixture in strata of the modifier.
Here, the effect of the mixture at M = 0 is given by ψ1, whereas the effect of
the mixture at M = 1 is
estimated below the coefficient table (and is here given by ψ1 + ψ3
= 2.4).
qfit1 <- qgcomp.emm.glm.noboot(y~x1+x2+x3+x4+x5+x6+x7,
data = dat1,
expnms = paste0("x", 1:7),
emmvar = "M",
q = 4)
qfit1
## ## Qgcomp weights/partial effects at M = 0
## Scaled effect size (positive direction, sum of positive effects = 1.05)
## x4 x1 x5 x7
## 0.393 0.237 0.219 0.151
## Scaled effect size (negative direction, sum of negative effects = -0.376)
## x6 x3 x2
## 0.449 0.435 0.116
##
##
## ## Qgcomp weights/partial effects at M = 1
## Scaled effect size (positive direction, sum of positive effects = 0)
## None
## Scaled effect size (negative direction, sum of negative effects = 0)
## None
##
##
## ## Mixture slope parameters (delta method CI):
##
## Estimate Std. Error Lower CI Upper CI t value Pr(>|t|)
## (Intercept) 0.14559 0.61921 -1.068034 1.3592 0.2351 0.81428
## psi1 0.67542 0.39146 -0.091832 1.4427 1.7254 0.08555
## M 0.36067 0.87113 -1.346720 2.0681 0.4140 0.67917
## M:mixture 1.68390 0.56052 0.585307 2.7825 3.0042 0.00290
##
## Estimate (CI), M=1:
## 0.67542 (-0.091832, 1.4427)
As in qgcomp
you can estimate pointwise comparisons
along the joint regression line. Here we estimate them at both values of
M (via the emmval
parameter).
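The output below was likely produced by calls of the following form (a sketch; pointwisebound is demonstrated explicitly later in this vignette):
# assumed calls (sketch): pointwise comparisons along the joint regression line
pointwisebound(qfit1, emmval = 0)  # comparisons at M = 0
pointwisebound(qfit1, emmval = 1)  # comparisons at M = 1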
## quantile quantile.midpoint hx mean.diff se.diff ll.diff ul.diff
## 1 0 0.125 0.1455920 0.0000000 0.0000000 0.00000000 0.000000
## 2 1 0.375 0.8210108 0.6754188 0.3914617 -0.09183203 1.442670
## 3 2 0.625 1.4964296 1.3508376 0.7829234 -0.18366407 2.885339
## 4 3 0.875 2.1718484 2.0262564 1.1743851 -0.27549610 4.328009
## emm_level
## 1 0
## 2 0
## 3 0
## 4 0
## quantile quantile.midpoint hx mean.diff se.diff ll.diff ul.diff
## 1 0 0.125 0.5062631 0.000000 0.0000000 0.000000 0.000000
## 2 1 0.375 2.8655840 2.359321 0.4011708 1.573041 3.145601
## 3 2 0.625 5.2249050 4.718642 0.8023416 3.146081 6.291203
## 4 3 0.875 7.5842259 7.077963 1.2035125 4.719122 9.436804
## emm_level
## 1 1
## 2 1
## 3 1
## 4 1
For qgcomp.emm.glm.noboot
fits, a set of “weights” will
be given that are interpreted as the proportion of a “partial” effect
for each variable. That is, ψ1 will represent the
joint effect of multiple exposures, some of which will have independent
effects that are positive, and some will have negative independent
effects. For example, the “negative partial effect” is simply the sum of
all of the negative independent effects (this is only given for a model
in which all exposures are included via linear terms and no interactions
among exposures occur). These weights are conditional on the fitted
model, and so are not “estimates” per se and will not have associated
confidence intervals. Nonetheless, the weights are useful for
interpretation of the joint effect.
Notably, with product terms in the model for the joint effect, a different set of weights will be generated at every value of the modifier. Here, we can plot the weights at M=0 and M=1.
Weights for M = 0
Weights for M = 1
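These weight plots can be generated with the plot method for the no-bootstrap fit, selecting the modifier level via emmval (a sketch, assuming the same plotting interface used for the bootstrapped fits below):
# sketch: weight plots at each level of the modifier for the noboot fit
plot(qfit1, emmval = 0)  # weights at M = 0
plot(qfit1, emmval = 1)  # weights at M = 1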
For non-linear qgcomp fits, or to get marginal estimates of the
exposure-response curve (i.e. conditional only on the modifier), we can
use the qgcomp.emm.boot
function. Here we just repeat the
original fit, which yields similar evidence that there is substantial
statistical interaction on the additive scale, so that we would expect
the joint exposure effect estimate to be greater for M=1 than for M=0,
which corroborates the fit above (the point estimates are identical, as
expected in this case due to no non-modifier covariates, so this is not
a surprise).
qfit1b <- qgcomp.emm.boot(y~x1+x2+x3+x4+x5+x6+x7,
data=dat1,
expnms = paste0("x", 1:7),
emmvar = "M",
q = 4)
qfit1b
## ## Mixture slope parameters (bootstrap CI):
##
## Estimate Std. Error Lower CI Upper CI t value Pr(>|t|)
## (Intercept) 0.14559 0.60441 -1.039024 1.3302 0.2409 0.809819
## psi1 0.67542 0.38573 -0.080604 1.4314 1.7510 0.081029
## M 0.36067 0.81015 -1.227190 1.9485 0.4452 0.656522
## M:mixture 1.68390 0.52371 0.657445 2.7104 3.2153 0.001454
Pointwise comparisons for M = 0
Pointwise comparisons for M = 1
Visually, it’s easier to compare two plots with the same y-axis. Here
we can see that the joint regression curve is steeper at M=1 than it is
at M=0. You will get some notes from ggplot2 about replacing scales,
which can be ignored or suppressed using
suppressMessages(print( <ggplot statements > ))
.
Visually, you can see substantial evidence for interaction due to the
diverging slopes and non-overlapping confidence intervals.
library(ggplot2) # need to explicitly call ggplot
pp0b = plot(qfit1b, emmval=0, suppressprint=TRUE, geom_only=TRUE,
pointwisebars=TRUE, modelfitline=FALSE, modelband=FALSE)
pp1b = plot(qfit1b, emmval=1, suppressprint=TRUE, geom_only=TRUE,
pointwisebars=TRUE, modelfitline=FALSE, modelband=FALSE)
# relabel the plots by directly modifying the ggplot2 geometries.
# "col" is a column that sets the color
# "point" and "errorbar" are names given to ggplot2::geom_point and
# ggplot2::geom_errorbar geometries
pp0b$point$data$col = "M=0"
pp1b$point$data$col = "M=1"
pp0b$errorbar$data$col = "M=0"
pp1b$errorbar$data$col = "M=1"
# jitter along the x axis so the points do not overlap
pp0b$point$data$x = pp0b$point$data$x - 0.01
pp1b$point$data$x = pp1b$point$data$x + 0.01
pp0b$errorbar$data$x = pp0b$errorbar$data$x - 0.01
pp1b$errorbar$data$x = pp1b$errorbar$data$x + 0.01
suppressMessages(print(
ggplot() + pp0b + pp1b + scale_colour_viridis_d(name="") +
labs(title="Bootstrap approach")
))
Pointwise comparisons (same scale) for M = 0 and M = 1 (bootstrap approach)
Another approach for non-linear qgcomp fits, or for obtaining marginal estimates of the exposure-response curve (i.e. conditional only on the modifier), is the qgcomp.emm.glm.ee function. This
approach uses estimating equation methodology for fitting. Estimating
equation methodology also underlies GEE (generalized estimating
equation) methods, which are frequently employed in longitudinal data
analysis because they can yield appropriate standard errors for
clustered data. In qgcomp.emm.glm.ee
, the standard errors
are also cluster robust (as long as the id variable is specified), but
it can also be used to obtain non-linear fits in the same way that the
bootstrapped estimator is above. Relative to the bootstrapped estimator,
the estimating equations estimator will be less computationally
intensive and will yield standard errors that are not subject to
simulation error (and thus will be more repeatable). In general, the
estimating equation method should be preferable for many of the cases in
which one might be considering the bootstrapped estimator. Because
standard errors are estimated differently, you will see differences in
confidence intervals between qgcomp.emm.glm.ee
and
qgcomp.emm.glm.noboot
even when the models are otherwise
identical.
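For illustration, a cluster-robust fit might look like the following. This is only a sketch: clusterid is a made-up grouping variable, and it assumes the id argument takes the name of the clustering column as a character string, as in the base qgcomp package.
# sketch: add a hypothetical cluster identifier and request cluster-robust SEs
dat1$clusterid <- rep(1:30, length.out = nrow(dat1))   # made-up clusters
qfit1ee_cl <- qgcomp.emm.glm.ee(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7,
                                data = dat1,
                                expnms = paste0("x", 1:7),
                                emmvar = "M",
                                id = "clusterid",  # assumed argument form, as in qgcomp
                                q = 4)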
Here we repeat the analysis from the bootstrapped fit using the estimating equation estimator, which yields similar evidence that there is substantial statistical interaction on the additive scale, so that we would expect the joint exposure effect estimate to be greater for M=1 than for M=0, corroborating the fits above (the point estimates are identical, as expected in this case with no non-modifier covariates).
qfit1ee <- qgcomp.emm.glm.ee(y~x1+x2+x3+x4+x5+x6+x7,
data=dat1,
expnms = paste0("x", 1:7),
emmvar = "M",
q = 4)
qfit1ee
## ## Mixture slope parameters (robust CI):
##
## Estimate Std. Error Lower CI Upper CI t value Pr(>|t|)
## (Intercept) 0.14559 0.56468 -0.961162 1.2523 0.2578 0.796725
## psi1 0.67542 0.36103 -0.032196 1.3830 1.8708 0.062407
## M 0.36067 0.77844 -1.165035 1.8864 0.4633 0.643485
## mixture:M 1.68390 0.50114 0.701696 2.6661 3.3602 0.000886
##
## Estimate (CI), M=1:
## 0.67542 (-0.032196, 1.383)
Here we skip to overlaying the two plots.
pp0ee = plot(qfit1ee, emmval=0, suppressprint=TRUE, geom_only=TRUE,
pointwisebars=TRUE, modelfitline=FALSE, modelband=FALSE)
pp1ee = plot(qfit1ee, emmval=1, suppressprint=TRUE, geom_only=TRUE,
pointwisebars=TRUE, modelfitline=FALSE, modelband=FALSE)
# relabel the plots by directly modifying the ggplot2 geometries.
# "col" is a column that sets the color
# "point" and "errorbar" are names given to ggplot2::geom_point and
# ggplot2::geom_error bar geometries
pp0ee$point$data$col = "M=0"
pp1ee$point$data$col = "M=1"
pp0ee$errorbar$data$col = "M=0"
pp1ee$errorbar$data$col = "M=1"
# jitter along the x axis so the points do not overlap
pp0ee$point$data$x = pp0ee$point$data$x - 0.01
pp1ee$point$data$x = pp1ee$point$data$x + 0.01
pp0ee$errorbar$data$x = pp0ee$errorbar$data$x - 0.01
pp1ee$errorbar$data$x = pp1ee$errorbar$data$x + 0.01
suppressMessages(print(
ggplot() + pp0ee + pp1ee + scale_colour_viridis_d(name="") +
labs(title="Estimating equations approach")
))
Pointwise comparisons (same scale) for M = 0 and M = 1 (estimating equations approach)
Now we can simulate data under a categorical modifier, and the simdata_quantized_emm function will pick some convenient defaults. We will also use a binary outcome with N = 300; note that a categorical modifier with a rare binary outcome will be subject to low power, and bootstrapping can fail due to empty strata in some bootstrap samples.
Here is a good place to note that simdata_quantized_emm is provided mainly as a learning tool. While it could potentially be used for simulations in a scientific publication, many of its defaults are not user controllable, and one may want to change them. Some of the underlying code, which may serve as inspiration for more comprehensive simulations, can be explored via the “hidden” (non-exported) functions qgcompint:::.quantized_design_emm, qgcompint:::.dgm_quantized_linear_emm, qgcompint:::.dgm_quantized_logistic_emm, and qgcompint:::.dgm_quantized_survival_emm.
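For example, the argument lists of these non-exported helpers can be inspected directly:
# inspect the signatures of the non-exported data-generating functions
args(qgcompint:::.dgm_quantized_linear_emm)
args(qgcompint:::.dgm_quantized_logistic_emm)
args(qgcompint:::.dgm_quantized_survival_emm)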
set.seed(23)
dat2 <- simdata_quantized_emm(
outcometype="logistic",
# sample size
n = 300,
# correlation between x1 and x2, x3, ...
corr=c(0.6, 0.5, 0.3, -0.3, -0.3, 0.0),
# model intercept
b0=-2,
# linear model coefficients for x1, x2, ... at referent level of interacting variable
mainterms=c(0.1, -0.1, 0.1, 0.0, 0.1, 0.1, 0.1),
# linear model coefficients for product terms between x1, x2, ... and interacting variable
prodterms = c(0.2, 0.0, 0.0, 0.0, 0.2, -0.2, 0.2),
# type of interacting variable
ztype = "categorical",
# number of levels of exposure
q = 4,
# residual variance of y
yscale = 2.0
)
## data
head(dat2)
## z x1 x2 x3 x4 x5 x6 x7 y
## 1 0 2 2 2 2 3 0 3 0
## 2 2 0 2 0 3 1 3 3 0
## 3 0 3 0 3 1 0 3 0 0
## 4 2 2 2 2 3 1 1 0 0
## 5 2 3 3 0 3 0 1 1 0
## 6 0 3 0 3 1 3 1 3 0
##
## 0 1 2
## 109 92 99
##
## 0 1
## 205 95
## x1 x2 x3 x4 x5 x6
## x1 1.00000000 0.53866667 0.5493333 0.210666667 -0.272000000 -0.285333333
## x2 0.53866667 1.00000000 0.2693333 0.186666667 -0.130666667 -0.205333333
## x3 0.54933333 0.26933333 1.0000000 0.109333333 -0.186666667 -0.184000000
## x4 0.21066667 0.18666667 0.1093333 1.000000000 -0.069333333 -0.005333333
## x5 -0.27200000 -0.13066667 -0.1866667 -0.069333333 1.000000000 0.005333333
## x6 -0.28533333 -0.20533333 -0.1840000 -0.005333333 0.005333333 1.000000000
## x7 -0.01066667 -0.05066667 -0.0480000 0.029333333 0.029333333 0.002666667
## x7
## x1 -0.010666667
## x2 -0.050666667
## x3 -0.048000000
## x4 0.029333333
## x5 0.029333333
## x6 0.002666667
## x7 1.000000000
Below is one way to fit qgcomp.emm.glm.noboot with a categorical modifier that exactly follows the previous code. This approach to categorical modifiers is incorrect in this case because of the format of these data: if you fit the model like this, where your categorical modifier is not the proper data type, qgcomp.emm.glm.noboot will assume you have a continuous modifier.
qfit.wrong <- qgcomp.emm.glm.noboot(y~x1+x2+x3+x4+x5+x6+x7,
data = dat2,
expnms = paste0("x", 1:7),
emmvar = "z",
q = 4, family=binomial())
qfit.wrong
## ## Qgcomp weights/partial effects at z = 0
## Scaled effect size (positive direction, sum of positive effects = 1.1)
## x5 x3 x1 x6
## 0.3881 0.3025 0.2401 0.0694
## Scaled effect size (negative direction, sum of negative effects = -0.214)
## x2 x7 x4
## 0.695 0.171 0.134
##
##
## ## Mixture log(OR) (delta method CI):
##
## Estimate Std. Error Lower CI Upper CI Z value Pr(>|z|)
## (Intercept) -2.72336 0.92407 -4.53451 -0.91221 -2.9471 0.003207
## psi1 0.88157 0.56080 -0.21757 1.98071 1.5720 0.115949
## z -0.43426 0.71890 -1.84329 0.97476 -0.6041 0.545799
## z:mixture 0.65881 0.44430 -0.21199 1.52961 1.4828 0.138123
Instead, you should convert each categorical modifier to a “factor” (e.g. via as.factor()) prior to fitting the model. Here you can see the output for both the non-bootstrapped fit and the fit with bootstrapped confidence intervals (since there are no other covariates in the model, these two approaches estimate the same marginal effect, and the log-odds ratio point estimates will be identical). Here we see that we get a unique interaction term and main effect for each level of the modifier z.
dat2$zfactor = as.factor(dat2$z)
# using asymptotic-based confidence intervals
qfit2 <- qgcomp.emm.glm.noboot(y~x1+x2+x3+x4+x5+x6+x7,
data = dat2,
expnms = paste0("x", 1:7),
emmvar = "zfactor",
q = 4, family=binomial())
# using bootstrap based confidence intervals (estimate b)
set.seed(12312)
qfit2b <- qgcomp.emm.boot(y~x1+x2+x3+x4+x5+x6+x7,
data = dat2,
expnms = paste0("x", 1:7),
emmvar = "zfactor",
B = 10, # set higher when using in real data
q = 4, family=binomial(), rr = FALSE)
# using estimating equation based confidence intervals (estimate ee)
set.seed(12312)
qfit2ee <- qgcomp.emm.ee(y~x1+x2+x3+x4+x5+x6+x7,
data = dat2,
expnms = paste0("x", 1:7),
emmvar = "zfactor",
q = 4, family=binomial(), rr = FALSE)
qfit2
## ## Qgcomp weights/partial effects at zfactor = 0
## Scaled effect size (positive direction, sum of positive effects = 1.15)
## x5 x3 x1 x6
## 0.3914 0.3857 0.2014 0.0215
## Scaled effect size (negative direction, sum of negative effects = -0.382)
## x7 x4 x2
## 0.427 0.351 0.222
##
##
## ## Mixture log(OR) (delta method CI):
##
## Estimate Std. Error Lower CI Upper CI Z value Pr(>|z|)
## (Intercept) -2.80404 1.08650 -4.93355 -0.67454 -2.5808 0.009857
## psi1 0.76337 0.65463 -0.51969 2.04642 1.1661 0.243576
## zfactor1 -0.89742 1.55815 -3.95134 2.15651 -0.5759 0.564650
## zfactor1:mixture 1.46319 0.97182 -0.44155 3.36793 1.5056 0.132166
## zfactor2 -0.52448 1.48432 -3.43369 2.38474 -0.3533 0.723830
## zfactor2:mixture 1.09584 0.90958 -0.68691 2.87859 1.2048 0.228290
## ## Mixture log(OR) (bootstrap CI):
##
## Estimate Std. Error Lower CI Upper CI Z value Pr(>|z|)
## (Intercept) -2.80404 1.26019 -5.27396 -0.33413 -2.2251 0.02607
## psi1 0.76337 0.66973 -0.54927 2.07600 1.1398 0.25436
## zfactor1 -0.89742 1.94264 -4.70493 2.91009 -0.4620 0.64411
## zfactor2 -0.52448 1.86269 -4.17529 3.12634 -0.2816 0.77827
## zfactor1:mixture 1.46319 1.12314 -0.73812 3.66450 1.3028 0.19265
## zfactor2:mixture 1.09584 1.09279 -1.04599 3.23768 1.0028 0.31596
## ## Mixture log(OR) (robust CI):
##
## Estimate Std. Error Lower CI Upper CI Z value Pr(>|z|)
## (Intercept) -2.80404 1.04140 -4.84515 -0.76293 -2.6926 0.00709
## psi1 0.76337 0.62774 -0.46699 1.99372 1.2160 0.22397
## zfactor1 -0.89742 1.46829 -3.77522 1.98039 -0.6112 0.54107
## zfactor2 -0.52448 1.50004 -3.46451 2.41555 -0.3496 0.72661
## mixture:zfactor1 1.46319 0.90814 -0.31674 3.24312 1.6112 0.10714
## mixture:zfactor2 1.09584 0.91844 -0.70428 2.89596 1.1932 0.23281
Here are some miscellaneous functions for getting point estimates and bounds for various comparisons at specific values of the modifier.
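The first block of output below was likely produced by calls of this form (a sketch, mirroring the explicit calls that follow):
# assumed calls (sketch): weights and pointwise odds ratios at zfactor = 0
getstratweights(qfit2, emmval = 0)
print(pointwisebound(qfit2, emmval = 0))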
## ## Qgcomp weights/partial effects at zfactor = 0
## Scaled effect size (positive direction, sum of positive effects = 1.15)
## x5 x3 x1 x6
## 0.3914 0.3857 0.2014 0.0215
## Scaled effect size (negative direction, sum of negative effects = -0.382)
## x7 x4 x2
## 0.427 0.351 0.222
## quantile quantile.midpoint hx or se.lnor ll.or ul.or
## 1 0 0.125 -2.8040444 1.000000 0.0000000 1.0000000 1.000000
## 2 1 0.375 -2.0406788 2.145485 0.6546337 0.5947033 7.740173
## 3 2 0.625 -1.2773131 4.603106 1.3092673 0.3536720 59.910280
## 4 3 0.875 -0.5139474 9.875896 1.9639010 0.2103299 463.715942
## emm_level
## 1 0
## 2 0
## 3 0
## 4 0
## output stratum specific joint effect estimate for the mixture at Z=2
print(getstrateffects(qfit2, emmval=2))
## Joint effect at zfactor=2
## Estimate Std. Error Lower CI Upper CI z value Pr(>|z|)
## Mixture 1.8592 0.6315 0.62148 3.0969 2.9441 0.003239
## ## Qgcomp weights/partial effects at zfactor = 2
## Scaled effect size (positive direction, sum of positive effects = 1.87)
## x5 x1 x2 x3 x7 x6
## 0.41602 0.38949 0.09062 0.07037 0.02519 0.00831
## Scaled effect size (negative direction, sum of negative effects = -0.00714)
## x4
## 1
## quantile quantile.midpoint hx or se.lnor ll.or ul.or
## 1 0 0.125 -3.3285212 1.000000 0.000000 1.000000 1.0000
## 2 1 0.375 -1.4693122 6.418658 0.631504 1.861688 22.1300
## 3 2 0.625 0.3898968 41.199165 1.263008 3.465884 489.7369
## 4 3 0.875 2.2491058 264.443333 1.894512 6.452396 10837.8769
## emm_level
## 1 2
## 2 2
## 3 2
## 4 2
## output stratum specific joint effect estimate for the mixture at Z=2 from bootstrapped fit
print(getstrateffects(qfit2b, emmval=2))
## Joint effect at zfactor=2
## Estimate Std. Error Lower CI Upper CI z value Pr(>|z|)
## Mixture 1.8592 0.6315 0.62148 3.0969 2.9441 0.003239
## output pointwise comparisons at Z=2 from bootstrapped fit
print(pointwisebound(qfit2b, emmval=2))
## quantile quantile.midpoint hx or se.lnor ll.or
## 1 0 0.125 -3.3285212 1.000000 0.0000000 1.000000
## 2 1 0.375 -1.4693122 6.418658 0.8057764 1.323019
## 3 2 0.625 0.3898968 41.199165 1.6115527 1.750380
## 4 3 0.875 2.2491058 264.443333 2.4173291 2.315786
## ul.or ll.linpred ul.linpred emm_level
## 1 1.00000 -3.328521 -3.3285212 2
## 2 31.14026 -3.048605 0.1099805 2
## 3 969.71606 -2.768689 3.5484821 2
## 4 30197.21448 -2.488772 6.9869838 2
## output modelwise confidence bounds at Z=2 from bootstrapped fit
print(modelbound(qfit2b, emmval=2))
## quantile quantile.midpoint linpred o se.pw ll.pw
## 1 0 0.125 -3.3285212 0.03584608 1.3936703 0.00233425
## 2 1 0.375 -1.4693122 0.23008368 0.6366945 0.06605877
## 3 2 0.625 0.3898968 1.47682837 0.4086489 0.66296013
## 4 3 0.875 2.2491058 9.47925560 1.1077721 1.08102783
## ul.pw ll.simul ul.simul emm_level
## 1 0.5504729 0.003315715 0.2510315 2
## 2 0.8013849 0.083425466 0.4932845 2
## 3 3.2898238 0.871064034 2.9511398 2
## 4 83.1211592 1.904740357 52.8130752 2
Visually, it’s easier to compare predictions across all levels of the EMM variable on the same plot. This plot looks at bounds on the marginal structural regression model line.
library(ggplot2) # need to explicitly call ggplot
pp0ee = plot(qfit2ee, emmval=0, suppressprint=TRUE, geom_only=TRUE,
modelband=TRUE, pointwisebars=FALSE, modelfitline=TRUE)
pp1ee = plot(qfit2ee, emmval=1, suppressprint=TRUE, geom_only=TRUE,
modelband=TRUE, pointwisebars=FALSE, modelfitline=TRUE)
pp2ee = plot(qfit2ee, emmval=2, suppressprint=TRUE, geom_only=TRUE,
modelband=TRUE, pointwisebars=FALSE, modelfitline=TRUE)
# relabel the plots using the ggplot2 geometries directly
pp0ee$line$data$col = "Z=0"
pp1ee$line$data$col = "Z=1"
pp2ee$line$data$col = "Z=2"
pp0ee$ribbon$data$col = "Z=0"
pp1ee$ribbon$data$col = "Z=1"
pp2ee$ribbon$data$col = "Z=2"
# jitter along the x axis so the points do not overlap
pp0ee$line$data$x = pp0ee$line$data$x - 0.01
pp2ee$line$data$x = pp2ee$line$data$x + 0.01
pp0ee$ribbon$data$x = pp0ee$ribbon$data$x - 0.01
pp2ee$ribbon$data$x = pp2ee$ribbon$data$x + 0.01
ggplot() + pp0ee + pp1ee + pp2ee + scale_color_viridis_d(name="") + scale_fill_viridis_d(name="", alpha=0.25)
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
## Scale for x is already present.
## Adding another scale for x, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
## Scale for x is already present.
## Adding another scale for x, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
Marginal structural model fit with confidence bands (same scale) for Z = 0, 1, and 2
It is possible to obtain a global test of (no) interaction for a multi-level modifier using the qgcomp package and the anova function. You need to fit a model without interaction (via qgcomp) and then a model with interaction. The test is done on the “underlying” conditional model corresponding to the original data, rather than on the marginal structural model. The methods for qgcomp.glm.[no]boot and qgcomp.emm.glm.[no]boot utilize the anova.glm function from the stats package (note the slightly different specification), while the methods for qgcomp.glm.ee and qgcomp.emm.glm.ee use the anova.geeglm method from the geepack R package. Both tests return similar (but not identical) results.
library("qgcomp")
dat2$zfactor = as.factor(dat2$z)
catfitoverall <- qgcomp.glm.noboot(y~x1+x2+x3+x4+x5+x6+x7,
data = dat2,
expnms = paste0("x", 1:7),
q = 4, family=binomial())
catfitemm <- qgcomp.emm.glm.noboot(y~x1+x2+x3+x4+x5+x6+x7,
data = dat2,
expnms = paste0("x", 1:7),
emmvar = "zfactor",
q = 4, family=binomial())
catfitoverallee <- qgcomp.glm.ee(y~x1+x2+x3+x4+x5+x6+x7,
data = dat2,
expnms = paste0("x", 1:7),
q = 4, family=binomial())
catfitemmee <- qgcomp.emm.glm.ee(y~x1+x2+x3+x4+x5+x6+x7,
data = dat2,
expnms = paste0("x", 1:7),
emmvar = "zfactor",
q = 4, family=binomial(), rr=FALSE)
anova(catfitoverall$fit, catfitemm$fit, test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7
## Model 2: y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x1 * zfactor1 + x2 * zfactor1 +
## x3 * zfactor1 + x4 * zfactor1 + x5 * zfactor1 + x6 * zfactor1 +
## x7 * zfactor1 + x1 * zfactor2 + x2 * zfactor2 + x3 * zfactor2 +
## x4 * zfactor2 + x5 * zfactor2 + x6 * zfactor2 + x7 * zfactor2
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 292 338.93
## 2 276 306.69 16 32.243 0.009295 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
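The Wald-statistic table below, for the estimating equation fits, was presumably produced by a call of this form (an assumption; the exact call is not shown in the output above):
# assumed call (sketch): global Wald-type test of interaction for the EE fits
anova(catfitemmee, catfitoverallee)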
## Analysis of 'Wald statistic' Table
##
## Model 1 y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7, expnms: x1, x2, x3, x4, x5, x6, x7, EMM: zfactor
## Model 2 y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7, expnms: x1, x2, x3, x4, x5, x6, x7
## Df X2 P(>|Chi|)
## 1 16 29.764 0.01927 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
As above, non-numerical modifiers must be coded as factors. The fits are straightforward, and this example follows without further commentary.
set.seed(23)
dat3 <- simdata_quantized_emm(
outcometype="continuous",
n = 300,
corr=c(0.6, 0.5, 0.3, -0.3, -0.3, 0.0),
b0=-2,
mainterms=c(0.1, -0.1, 0.1, 0.0, 0.1, 0.1, 0.1),
prodterms = c(0.2, 0.0, 0.0, 0.0, 0.2, -0.2, 0.2),
ztype = "categorical",
q = 4,
yscale = 2.0
)
dat3$zletter = as.factor(with(dat3, ifelse(z==0, "a", ifelse(z==1, "b", "c"))))
with(dat3, table(z, zletter))
## zletter
## z a b c
## 0 109 0 0
## 1 0 92 0
## 2 0 0 99
qfit_letter <- qgcomp.emm.glm.noboot(y~x1+x2+x3+x4+x5+x6+x7,
data = dat3,
expnms = paste0("x", 1:7),
emmvar = "zletter",
q = 4, family=gaussian())
qfit_letteree <- qgcomp.emm.glm.ee(y~x1+x2+x3+x4+x5+x6+x7,
data = dat3,
expnms = paste0("x", 1:7),
emmvar = "zletter",
q = 4, family=gaussian())
print(qfit_letter)
## ## Qgcomp weights/partial effects at zletter = a
## Scaled effect size (positive direction, sum of positive effects = 1.09)
## x3 x7 x2 x4 x6
## 0.4099 0.2452 0.1330 0.1122 0.0997
## Scaled effect size (negative direction, sum of negative effects = -0.823)
## x1 x5
## 0.713 0.287
##
##
## ## Mixture slope parameters (delta method CI):
##
## Estimate Std. Error Lower CI Upper CI t value Pr(>|t|)
## (Intercept) -2.19374 0.73751 -3.63924 -0.74824 -2.9745 0.003193
## psi1 0.26804 0.46378 -0.64095 1.17704 0.5780 0.563763
## zletterb -0.48345 1.09771 -2.63492 1.66801 -0.4404 0.659977
## zletterb:mixture 1.08966 0.70934 -0.30062 2.47993 1.5362 0.125639
## zletterc 0.76945 1.02824 -1.24587 2.78477 0.7483 0.454905
## zletterc:mixture 0.59209 0.65586 -0.69337 1.87755 0.9028 0.367432
## ## Mixture slope parameters (robust CI):
##
## Estimate Std. Error Lower CI Upper CI t value Pr(>|t|)
## (Intercept) -2.19374 0.70559 -3.576665 -0.81082 -3.1091 0.002074
## psi1 0.26804 0.43462 -0.583790 1.11988 0.6167 0.537918
## zletterb -0.48345 0.92428 -2.295010 1.32811 -0.5231 0.601355
## zletterc 0.76945 0.94934 -1.091227 2.63013 0.8105 0.418349
## mixture:zletterb 1.08966 0.60262 -0.091446 2.27076 1.8082 0.071665
## mixture:zletterc 0.59209 0.60277 -0.589316 1.77349 0.9823 0.326824
## output stratum specific joint effect estimate for the mixture at Zletter="a"
print(getstrateffects(qfit_letter, emmval="a"))
## Joint effect at zletter=a
## Estimate Std. Error Lower CI Upper CI z value Pr(>|z|)
## Mixture 0.26804 0.46378 -0.64095 1.177 0.578 0.5633
## Joint effect at zletter=a
## Estimate Std. Error Lower CI Upper CI z value Pr(>|z|)
## Mixture 0.26804 0.43462 -0.58379 1.1199 0.6167 0.5374
## ## Qgcomp weights/partial effects at zletter = a
## Scaled effect size (positive direction, sum of positive effects = 1.09)
## x3 x7 x2 x4 x6
## 0.4099 0.2452 0.1330 0.1122 0.0997
## Scaled effect size (negative direction, sum of negative effects = -0.823)
## x1 x5
## 0.713 0.287
## output predictions at Zletter="b"
print(pointwisebound(qfit_letter, emmval="b", pointwiseref = 1, alpha=0.05))
## quantile quantile.midpoint hx mean.diff se.diff ll.diff ul.diff
## 1 0 0.125 -2.6771928 0.000000 0.0000000 0.0000000 0.000000
## 2 1 0.375 -1.3194888 1.357704 0.5367183 0.3057553 2.409653
## 3 2 0.625 0.0382151 2.715408 1.0734367 0.6115107 4.819305
## 4 3 0.875 1.3959191 4.073112 1.6101550 0.9172660 7.228958
## emm_level
## 1 b
## 2 b
## 3 b
## 4 b
## quantile quantile.midpoint hx mean.diff se.diff ll.diff ul.diff
## 1 0 0.125 -2.6771928 0.000000 0.0000000 0.0000000 0.000000
## 2 1 0.375 -1.3194888 1.357704 0.4174364 0.5395436 2.175864
## 3 2 0.625 0.0382151 2.715408 0.8348728 1.0790873 4.351729
## 4 3 0.875 1.3959191 4.073112 1.2523092 1.6186309 6.527593
## ll.linpred ul.linpred emm_level
## 1 -2.677193 -2.6771928 b
## 2 -2.137649 -0.5013285 b
## 3 -1.598106 1.6745357 b
## 4 -1.058562 3.8504000 b
Here we simulate some data, similar to prior datasets, where we use a continuous modifier.
set.seed(23)
dat3 <- simdata_quantized_emm(
outcometype="continuous",
# sample size
n = 100,
# correlation between x1 and x2, x3, ...
corr=c(0.8, 0.6, 0.3, -0.3, -0.3, -0.3),
# model intercept
b0=-2,
# linear model coefficients for x1, x2, ... at referent level of interacting variable
mainterms=c(0.3, -0.1, 0.1, 0.0, 0.3, 0.1, 0.1),
# linear model coefficients for product terms between x1, x2, ... and interacting variable
prodterms = c(1.0, 0.0, 0.0, 0.0, 0.2, 0.2, 0.2),
# type of interacting variable
ztype = "continuous",
# number of levels of exposure
q = 4,
# residual variance of y
yscale = 2.0
)
names(dat3)[which(names(dat3)=="z")] = "CoM"
head(dat3)
## CoM x1 x2 x3 x4 x5 x6 x7 y
## 1 0.1932123 3 3 3 3 0 0 3 -1.0597465
## 2 -0.4346821 2 2 3 2 0 1 1 -4.9330438
## 3 0.9132671 1 1 0 1 2 2 2 -1.8409107
## 4 1.7933881 1 1 2 1 3 2 0 1.6776239
## 5 0.9966051 2 2 0 2 0 1 1 -1.0562066
## 6 1.1074905 1 1 3 0 2 1 0 -0.9831373
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.41826 -0.53582 0.04267 0.07475 0.84181 1.97606
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -8.6878 -2.7252 -0.8718 -0.7253 1.2805 8.0743
## x1 x2 x3 x4 x5 x6 x7
## x1 1.000 0.920 0.632 0.320 -0.400 -0.520 -0.200
## x2 0.920 1.000 0.616 0.320 -0.400 -0.472 -0.136
## x3 0.632 0.616 1.000 0.232 -0.360 -0.288 -0.240
## x4 0.320 0.320 0.232 1.000 -0.040 -0.280 0.032
## x5 -0.400 -0.400 -0.360 -0.040 1.000 0.192 0.072
## x6 -0.520 -0.472 -0.288 -0.280 0.192 1.000 0.024
## x7 -0.200 -0.136 -0.240 0.032 0.072 0.024 1.000
Fits are done as in the previous example (here they are shown without any further commentary).
qfit3 <- qgcomp.emm.glm.noboot(y~x1+x2+x3+x4+x5+x6+x7,
data = dat3,
expnms = paste0("x", 1:7),
emmvar = "CoM",
q = 4)
qfit3
## ## Qgcomp weights/partial effects at CoM = 0
## Scaled effect size (positive direction, sum of positive effects = 0.93)
## x3 x1 x7 x6 x4 x5
## 0.29883 0.24796 0.20602 0.20568 0.03698 0.00453
## Scaled effect size (negative direction, sum of negative effects = -0.482)
## x2
## 1
##
##
## ## Mixture slope parameters (delta method CI):
##
## Estimate Std. Error Lower CI Upper CI t value Pr(>|t|)
## (Intercept) -1.40872 0.82889 -3.03332 0.21587 -1.6995 0.09292
## psi1 0.44806 0.54005 -0.61041 1.50654 0.8297 0.40907
## CoM -1.70779 0.95928 -3.58795 0.17236 -1.7803 0.07864
## CoM:mixture 2.78425 0.62382 1.56160 4.00691 4.4633 2.49e-05
qfit3b <- qgcomp.emm.boot(y~x1+x2+x3+x4+x5+x6+x7,
data = dat3,
expnms = paste0("x", 1:7),
emmvar = "CoM",
B=10,
q = 4)
qfit3b
## ## Mixture slope parameters (bootstrap CI):
##
## Estimate Std. Error Lower CI Upper CI t value Pr(>|t|)
## (Intercept) -1.40872 0.80304 -2.98265 0.16521 -1.7542 0.08308
## psi1 0.44806 0.50720 -0.54603 1.44215 0.8834 0.37957
## CoM -1.70779 0.75007 -3.17790 -0.23769 -2.2768 0.02537
## CoM:mixture 2.78425 0.52286 1.75946 3.80905 5.3250 8.461e-07
qfit3ee <- qgcomp.emm.ee(y~x1+x2+x3+x4+x5+x6+x7,
data = dat3,
expnms = paste0("x", 1:7),
emmvar = "CoM",
q = 4)
qfit3ee
## ## Mixture slope parameters (robust CI):
##
## Estimate Std. Error Lower CI Upper CI t value Pr(>|t|)
## (Intercept) -1.40872 0.78107 -2.93959 0.122146 -1.8036 0.07493
## psi1 0.44806 0.49066 -0.51361 1.409735 0.9132 0.36379
## CoM -1.70779 0.83509 -3.34454 -0.071046 -2.0450 0.04402
## mixture:CoM 2.78425 0.54885 1.70853 3.859975 5.0729 2.351e-06
## ## Qgcomp weights/partial effects at CoM = 0
## Scaled effect size (positive direction, sum of positive effects = 0.93)
## x3 x1 x7 x6 x4 x5
## 0.29883 0.24796 0.20602 0.20568 0.03698 0.00453
## Scaled effect size (negative direction, sum of negative effects = -0.482)
## x2
## 1
## output stratum specific joint effect estimate for the mixture at CoM=0
print(getstrateffects(qfit3, emmval=0))
## Joint effect at CoM=0
## Estimate Std. Error Lower CI Upper CI z value Pr(>|z|)
## Mixture 0.44806 0.54005 -0.61041 1.5065 0.8297 0.4067
## quantile quantile.midpoint hx mean.diff se.diff ll.diff
## 1 0 0.125 -1.40872323 0.0000000 0.0000000 0.0000000
## 2 1 0.375 -0.96065965 0.4480636 0.5400468 -0.6104086
## 3 2 0.625 -0.51259608 0.8961271 1.0800935 -1.2208173
## 4 3 0.875 -0.06453251 1.3441907 1.6201403 -1.8312259
## ul.diff emm_level
## 1 0.000000 0
## 2 1.506536 0
## 3 3.013072 0
## 4 4.519607 0
## output/plot the weights at the 80%ile of CoM
getstratweights(qfit3, emmval=quantile(dat3$CoM, .8))
## ## Qgcomp weights/partial effects at CoM = 0.975896138278307
## Scaled effect size (positive direction, sum of positive effects = 3.18)
## x1 x7 x6 x5 x2 x3
## 0.2402 0.2321 0.2212 0.1308 0.0934 0.0822
## Scaled effect size (negative direction, sum of negative effects = -0.0196)
## x4
## 1
## output stratum specific joint effect estimate for the mixture at the 80%ile of CoM
print(getstrateffects(qfit3, emmval=quantile(dat3$CoM, .8)))
## Joint effect at CoM=0.975896138278307
## Estimate Std. Error Lower CI Upper CI z value Pr(>|z|)
## Mixture 3.16521 0.80046 1.5963 4.7341 3.9542 7.678e-05
## output pointwise comparisons at at the 80%ile of CoM
print(pointwisebound(qfit3, emmval=quantile(dat3$CoM, .8), pointwiseref = 2))
## quantile quantile.midpoint hx mean.diff se.diff ll.diff ul.diff
## 1 0 0.125 -3.0753528 -3.165207 0.8004573 -4.734075 -1.596340
## 2 1 0.375 0.0898543 0.000000 0.0000000 0.000000 0.000000
## 3 2 0.625 3.2550614 3.165207 0.8004573 1.596340 4.734075
## 4 3 0.875 6.4202685 6.330414 1.6009147 3.192679 9.468149
## emm_level
## 1 0.9758961
## 2 0.9758961
## 3 0.9758961
## 4 0.9758961
Stratified effects and pointwise comparisons can also be made with bootstrapped/estimating equation fits.
## output stratum specific joint effect estimate for the mixture at CoM=0
#print(getstrateffects(qfit3b, emmval=0)) # commented out to reduce output
print(getstrateffects(qfit3ee, emmval=0))
## Joint effect at CoM=0
## Estimate Std. Error Lower CI Upper CI z value Pr(>|z|)
## Mixture 0.44806 0.49066 -0.51361 1.4097 0.9132 0.3611
## output pointwise comparisons at CoM=0
#print(pointwisebound(qfit3b, emmval=0)) # commented out to reduce output
print(pointwisebound(qfit3ee, emmval=0))
## quantile quantile.midpoint hx mean.diff se.diff ll.diff ul.diff
## 1 0 0.125 -1.40872323 0.0000000 0.0000000 0.000000 0.000000
## 2 1 0.375 -0.96065965 0.4480636 0.4906578 -0.513608 1.409735
## 3 2 0.625 -0.51259608 0.8961271 0.9813156 -1.027216 2.819470
## 4 3 0.875 -0.06453251 1.3441907 1.4719734 -1.540824 4.229206
## ll.linpred ul.linpred emm_level
## 1 -1.408723 -1.408723226 0
## 2 -1.922331 0.001011944 0
## 3 -2.435939 1.410747113 0
## 4 -2.949547 2.820482283 0
## output stratum specific joint effect estimate for the mixture at the 80%ile of CoM
#print(getstrateffects(qfit3b, emmval=quantile(dat3$CoM, .8))) # commented out to reduce output
print(getstrateffects(qfit3ee, emmval=quantile(dat3$CoM, .8)))
## Joint effect at CoM=0.975896138278307
## Estimate Std. Error Lower CI Upper CI z value Pr(>|z|)
## Mixture 3.16521 0.58465 2.0193 4.3111 5.4138 6.169e-08
## output pointwise comparisons at at the 80%ile of CoM
#print(pointwisebound(qfit3b, emmval=quantile(dat3$CoM, .8))) # commented out to reduce output
print(pointwisebound(qfit3ee, emmval=quantile(dat3$CoM, .8)))
## quantile quantile.midpoint hx mean.diff se.diff ll.diff ul.diff
## 1 0 0.125 -3.0753528 0.000000 0.000000 0.000000 0.000000
## 2 1 0.375 0.0898543 3.165207 0.584652 2.019310 4.311104
## 3 2 0.625 3.2550614 6.330414 1.169304 4.038620 8.622208
## 4 3 0.875 6.4202685 9.495621 1.753956 6.057931 12.933312
## ll.linpred ul.linpred emm_level
## 1 -3.0753528 -3.075353 0.9758961
## 2 -1.0560426 1.235751 0.9758961
## 3 0.9632676 5.546855 0.9758961
## 4 2.9825777 9.857959 0.9758961
## plot the pointwise effects at CoM=0 (compare bootstrap with estimating equations approach)
## note that the bootstrapped version is not valid because of only using 10 bootstrap iterations
ppb <- plot(qfit3b, emmval=0, suppressprint = TRUE, geom_only = TRUE)
ppee <- plot(qfit3ee, emmval=0, suppressprint = TRUE, geom_only = TRUE)
ppb$point$data$col = "Bootstrap"
ppb$errorbar$data$col = "Bootstrap"
ppee$point$data$col = "EE"
ppee$errorbar$data$col = "EE"
ppee$point$data$x = ppee$point$data$x + 0.01
ppee$errorbar$data$x = ppee$errorbar$data$x + 0.01
ggplot() + ppb + ppee + scale_color_grey(name="Pointwise estimates, 95% CI")
## Scale for x is already present.
## Adding another scale for x, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
Non-linear joint effects of a mixture will tend to occur when
independent effects of individual exposures are non-linear or when there
is interaction on the model scale between exposures. Here is a toy
example of allowing a quadratic overall joint effect with an underlying
set of interaction terms between x1 and all other exposures. q is set to
8 for this example for reasons described in the qgcomp
package vignette. This approach adds an interaction term for both the
“main effect” of the mixture and the squared term of the mixture. The
coefficients can be interpreted as any other quadratic/interaction
terms (which is to say, with some difficulty). Informally, the
CoM:mixture^2 term can be interpreted as the magnitude of the change in
the non-linearity of the slope due to a one unit increase in the
continuous modifier. Both the bootstrapped and estimating equations
formulations can be used to fit non-linear models, with the estimating
equations approaches being preferred for computational efficiency.
qfit3bnl <- qgcomp.emm.glm.boot(y~x1+x2+x3+x4+x5+x6+x7 + x1*(x2 + x3 + x4 + x5 + x6 +x7),
data = dat3,
expnms = paste0("x", 1:7),
emmvar = "CoM",
B = 10, # set higher in practice
q = 8, degree= 2)
qfit3bnlee <- qgcomp.emm.glm.ee(y~x1+x2+x3+x4+x5+x6+x7 + x1*(x2 + x3 + x4 + x5 + x6 +x7),
data = dat3,
expnms = paste0("x", 1:7),
emmvar = "CoM",
q = 8, degree= 2)
qfit3bnl
## ## Mixture slope parameters (bootstrap CI):
##
## Estimate Std. Error Lower CI Upper CI t value Pr(>|t|)
## (Intercept) -1.8359e+00 9.2572e-01 -3.6503e+00 -2.1515e-02 -1.9832 0.05096
## psi1 4.5330e-01 7.8571e-01 -1.0867e+00 1.9933e+00 0.5769 0.56569
## psi2 -6.6350e-02 1.9034e-01 -4.3940e-01 3.0670e-01 -0.3486 0.72836
## CoM -1.1514e+00 6.2803e-01 -2.3823e+00 7.9540e-02 -1.8333 0.07067
## CoM:mixture 1.6314e+00 3.2228e-01 9.9971e-01 2.2630e+00 5.0620 2.813e-06
## CoM:mixture^2 -3.5123e-17 3.8621e-17 -1.1082e-16 4.0572e-17 -0.9094 0.36599
## ## Mixture slope parameters (robust CI):
##
## Estimate Std. Error Lower CI Upper CI t value
## (Intercept) -1.8359e+00 8.3109e-01 -3.4648e+00 -2.0699e-01 -2.2090
## psi1 4.5330e-01 6.5266e-01 -8.2589e-01 1.7325e+00 0.6945
## I(mixture^2) -6.6350e-02 1.4202e-01 -3.4469e-01 2.1199e-01 -0.4672
## CoM -1.1514e+00 6.1548e-01 -2.3577e+00 5.4937e-02 -1.8707
## mixture:CoM 1.6314e+00 2.6526e-01 1.1115e+00 2.1513e+00 6.1502
## I(mixture^2):CoM 1.5458e-15 2.8724e-12 -5.6282e-12 5.6313e-12 0.0005
## Pr(>|t|)
## (Intercept) 0.03019
## psi1 0.48946
## I(mixture^2) 0.64169
## CoM 0.06524
## mixture:CoM 3.332e-08
## I(mixture^2):CoM 0.99957
Not all effect estimation tools are enabled for non-linear fits, and some will produce an error. However, as with previous fits, we can estimate pointwise differences along the quantiles and plot the marginal structural model regression line at various values of the joint quantized exposures. Here you can see that the regression line is expected to be steeper at higher levels of the modifier, as in previous fits. We can explicitly ask for model confidence bands (which are given by the modelbound function), which pull in confidence limits for the regression line based on the bootstrap distribution of estimates.
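The output below appears to come from calls of this form (assumed; emmval values of -1 and 1 correspond to low and high values of the continuous modifier CoM):
# assumed calls (sketch): pointwise comparisons from the non-linear EE fit
print(pointwisebound(qfit3bnlee, emmval = -1))
print(pointwisebound(qfit3bnlee, emmval = 1))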
## quantile quantile.midpoint hx mean.diff se.diff ll.diff
## 1 0 0.0625 -0.6845197 0.000000 0.0000000 0.000000
## 2 1 0.1875 -1.9289318 -1.244412 0.6167308 -2.453182
## 3 2 0.3125 -3.3060446 -2.621525 1.0442651 -4.668247
## 4 3 0.4375 -4.8158581 -4.131338 1.3590510 -6.795029
## 5 4 0.5625 -6.4583724 -5.773853 1.6883742 -9.083005
## 6 5 0.6875 -8.2335875 -7.549068 2.1876819 -11.836845
## 7 6 0.8125 -10.1415032 -9.456983 2.9701500 -15.278370
## 8 7 0.9375 -12.1821197 -11.497600 4.0756472 -19.485722
## ul.diff ll.linpred ul.linpred emm_level
## 1 0.00000000 -0.6845197 -0.6845197 -1
## 2 -0.03564181 -3.1377020 -0.7201616 -1
## 3 -0.57480287 -5.3527666 -1.2593226 -1
## 4 -1.46764731 -7.4795492 -2.1521671 -1
## 5 -2.46469996 -9.7675251 -3.1492197 -1
## 6 -3.26128994 -12.5213652 -3.9458097 -1
## 7 -3.63559648 -15.9628902 -4.3201162 -1
## 8 -3.50947829 -20.1702414 -4.1939980 -1
## quantile quantile.midpoint hx mean.diff se.diff ll.diff ul.diff
## 1 0 0.0625 -2.9872820 0.000000 0.0000000 0.0000000 0.000000
## 2 1 0.1875 -0.9689631 2.018319 0.5689187 0.9032588 3.133379
## 3 2 0.3125 0.9166552 3.903937 0.9454155 2.0509569 5.756918
## 4 3 0.4375 2.6695727 5.656855 1.2133258 3.2787798 8.034930
## 5 4 0.5625 4.2897894 7.277071 1.5180540 4.3017402 10.252403
## 6 5 0.6875 5.7773054 8.764587 2.0299278 4.7860021 12.743173
## 7 6 0.8125 7.1321206 10.119403 2.8520808 4.5294270 15.709378
## 8 7 0.9375 8.3542352 11.341517 4.0054526 3.4909742 19.192060
## ll.linpred ul.linpred emm_level
## 1 -2.9872820 -2.9872820 1
## 2 -2.0840232 0.1460971 1
## 3 -0.9363252 2.7696355 1
## 4 0.2914977 5.0476476 1
## 5 1.3144582 7.2651206 1
## 6 1.7987201 9.7558907 1
## 7 1.5421450 12.7220963 1
## 8 0.5036922 16.2047781 1
We can look at this fit another way, too, by plotting predictions
from the marginal structural model at all observed values of the
modifier, which gives a more complete picture of the model than the
plots at single values of the modifier. We can create smoothed scatter
plot lines (LOESS) at binned values of the modifier to informally look
at how the regression line might change over values of the modifier. We
can approximate the effect (point estimate only) at a specific value of z by plotting a smooth fit limited to a narrow range of the modifier (< -1 or > 1). Here we see that the joint effect of exposures does
appear to differ across values of the modifier, but there is little
suggestion of non-linearity at either low or high values of the
modifier. This informal assessment agrees with intuition based on the
estimated coefficients and the standard plots from the
qgcompint
package.
library(ggplot2)
plotdata = data.frame(q=qfit3bnlee$index, ey=qfit3bnlee$y.expected, modifier=qfit3bnlee$emmvar.msm)
ggplot() +
geom_point(aes(x=q, y=ey, color=modifier), data=plotdata) +
geom_point(aes(x=q, y=ey), color="purple", data=plotdata[plotdata$modifier>1, ], pch=1, cex=3) +
geom_smooth(aes(x=q, y=ey), se=FALSE, color="purple", data=plotdata[plotdata$modifier>1, ], method = 'loess', formula='y ~ x') +
geom_smooth(aes(x=q, y=ey), se=FALSE, color="red", data=plotdata[plotdata$modifier < -1, ], method = 'loess', formula='y ~ x') +
geom_point(aes(x=q, y=ey), color="red", data=plotdata[plotdata$modifier < -1, ], pch=1, cex=3) +
theme_classic() +
labs(y="Expected outcome", x="Quantile score value (0 to q-1)") +
scale_color_continuous(name="Value\nof\nmodifier")
As with standard qgcomp
, the qgcompint
package allows assessment of effect measure modification for a Cox
proportional hazards model. The simdata_quantized_emm
function allows simulation of right censored survival data.
set.seed(23)
dat4 <- simdata_quantized_emm(
outcometype="survival",
# sample size
n = 200,
# correlation between x1 and x2, x3, ...
corr=c(0.8, 0.6, 0.3, -0.3, -0.3, -0.3),
# model intercept
b0=-2,
# linear model coefficients for x1, x2, ... at referent level of interacting variable
mainterms=c(0.0, -0.1, 0.1, 0.0, 0.3, 0.1, 0.1),
# linear model coefficients for product terms between x1, x2, ... and interacting variable
prodterms = c(0.1, 0.0, 0.0, 0.0, -0.2, -0.2, -0.2),
# type of interacting variable
ztype = "categorical",
# number of levels of exposure
q = 4,
# residual variance of y
yscale = 2.0
)
dat4$zfactor = as.factor(dat4$z)
head(dat4)
## z x1 x2 x3 x4 x5 x6 x7 time d zfactor
## 1 0 3 3 3 2 0 0 0 2.544399 1 0
## 2 2 3 3 3 3 3 2 0 3.419209 1 2
## 3 0 2 2 1 2 3 1 1 1.775714 1 0
## 4 2 3 3 3 3 3 3 3 4.000000 0 2
## 5 2 0 0 0 0 3 1 3 3.329749 1 2
## 6 0 0 0 0 0 1 2 3 1.557125 1 0
## 0 1 2
## 64 63 73
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.4161 1.9309 2.6441 2.6728 3.7167 4.0000
##
## 0 1
## 40 160
## x1 x2 x3 x4 x5 x6 x7
## x1 1.000 0.776 0.616 0.268 -0.296 -0.328 -0.268
## x2 0.776 1.000 0.468 0.192 -0.200 -0.316 -0.284
## x3 0.616 0.468 1.000 0.228 -0.272 -0.244 -0.284
## x4 0.268 0.192 0.228 1.000 -0.064 0.056 -0.176
## x5 -0.296 -0.200 -0.272 -0.064 1.000 0.032 0.220
## x6 -0.328 -0.316 -0.244 0.056 0.032 1.000 0.004
## x7 -0.268 -0.284 -0.284 -0.176 0.220 0.004 1.000
Fitting a Cox model with a fully linear/additive specification is
very similar to other qgcomp
“noboot” models, and the same
plots/weight estimation/effect estimation functions work on these
objects. For now, bootstrapped versions of this model are not
available.
qfit4 <- qgcomp.emm.cox.noboot(survival::Surv(time, d)~x1+x2+x3+x4+x5+x6+x7,
data = dat4,
expnms = paste0("x", 1:7),
emmvar = "zfactor",
q = 4)
## Adding main term for zfactor to the model
## ## Qgcomp weights/partial effects at zfactor = 0
## Scaled effect size (positive direction, sum of positive effects = 0.773)
## x7 x6 x5 x4 x3
## 0.3457 0.3062 0.1686 0.1063 0.0732
## Scaled effect size (negative direction, sum of negative effects = -0.142)
## x2 x1
## 0.9686 0.0314
##
##
## Mixture log(hazard ratio) (delta method CI):
##
## Estimate Std. Error Lower CI Upper CI Z value Pr(>|z|)
## psi1 0.630742 0.351180 -0.057559 1.31904 1.7961 0.07248
## zfactor1 -0.465127 0.809987 -2.052673 1.12242 -0.5742 0.56580
## zfactor1:mixture -0.060184 0.519398 -1.078185 0.95782 -0.1159 0.90775
## zfactor2 0.746395 0.797326 -0.816334 2.30913 0.9361 0.34921
## zfactor2:mixture -1.325227 0.525392 -2.354976 -0.29548 -2.5224 0.01166
## ## Qgcomp weights/partial effects at zfactor = 2
## Scaled effect size (positive direction, sum of positive effects = 0.421)
## x3 x2
## 0.83 0.17
## Scaled effect size (negative direction, sum of negative effects = -1.12)
## x6 x1 x7 x5 x4
## 0.3468 0.2427 0.2276 0.1250 0.0578
## Joint effect at zfactor=2
## Estimate Std. Error Lower CI Upper CI z value Pr(>|z|)
## Mixture -0.69448 0.38722 -1.4534 0.064445 -1.7935 0.07289
## quantile quantile.midpoint hx hr se.lnhr ll.hr ul.hr
## 1 0 0.125 -0.4651275 1.000000 0.0000000 1.0000000 1.00000
## 2 1 0.375 0.1054305 1.769254 0.3891288 0.8252075 3.79330
## 3 2 0.625 0.6759885 3.130260 0.7782575 0.6809675 14.38912
## 4 3 0.875 1.2465464 5.538224 1.1673863 0.5619395 54.58226
## emm_level
## 1 1
## 2 1
## 3 1
## 4 1
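The stratified output above can be obtained with the same helper functions used earlier; the calls were likely of the following form (a sketch):
# assumed calls (sketch) matching the Cox-model output shown above
qfit4                                  # print the fit (weights at zfactor = 0)
getstratweights(qfit4, emmval = 2)     # weights at zfactor = 2
getstrateffects(qfit4, emmval = 2)     # joint effect at zfactor = 2
pointwisebound(qfit4, emmval = 1)      # pointwise hazard ratios at zfactor = 1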
In general, it does so by asking a question that is typically not adversely impacted by high correlation or near/full collinearity. When possible, some methods allow the option “Bayes=TRUE,” which performs some very light regularization and can overcome perfect collinearity in a way that allows estimation of the overall effect of exposure, but the independent effects (scaled effects, or weights) will still be subject to issues like variance inflation.
It depends. Qgcomp/qgcompint is not generally suited to problems where the number of predictors exceeds the sample size. If you try the method, it works, and you obtain reasonably narrow confidence intervals, then the method has been able to handle that number of exposures. The package authors would advocate that, if you are working with a problem where the number of predictors exceeds the sample size and are wondering about this package, you seek out a statistician with experience in such problems who can point out the challenges of these analyses.
Anything in this package that is available via bootstrapping is also available for the (newer) estimating equation approaches. If your goal is just to estimate parameters, the estimating equation approaches may be preferable because they are less computationally burdensome and are not subject to issues like simulation error, which may be outside many researchers’ expertise to diagnose.
Why is a given method from the qgcomp package not available in this package? Developer time. If you see a method in the qgcomp package that you would like to see in this package and can make a good use-case for it, please contact the package developer/maintainer.
There is currently no simple way to implement multiple, simultaneous modifiers in the qgcompint package. For binary/categorical modifiers, it is straightforward to create a single modifier with a distinct value for every unique combination of the modifiers, as sketched below.
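A minimal sketch of this approach using base R (the modifier names here are hypothetical):
# sketch: combine two hypothetical categorical modifiers into a single factor
# with one level per unique combination
modA <- factor(c("low", "low", "high", "high"))
modB <- factor(c("f", "m", "f", "m"))
combined <- interaction(modA, modB, drop = TRUE)
table(combined)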
The FAQ in the qgcomp package vignettes.
Alexander P. Keil, Jessie P. Buckley, Katie M. O’Brien, Kelly K. Ferguson, Shanshan Zhao, Alexandra J. White. A quantile-based g-computation approach to addressing the effects of exposure mixtures. https://doi.org/10.1289/EHP5838
Stevens, D.R., Bommarito, P.A., Keil, A.P., McElrath, T.F., Trasande, L., Barrett, E.S., Bush, N.R., Nguyen, R.H., Sathyanarayana, S., Swan, S. and Ferguson, K.K., 2022. Urinary phthalate metabolite mixtures in pregnancy and fetal growth: findings from the infant development and the environment study. Environment international, 163, p.107235.