$$ \LARGE Y_{si} = \gamma_0 + S_{0s} + I_{0i} + (\gamma_1 + S_{0s}) * x_i + e_{si}$$
Linear mixed effects model - an introduction
Why do we need linear mixed effect models?
|
# App for visualizing the influence of random effects in linear mixed-effect models
# Claudia Glemser, University Tuebingen, Germany
# Maria-Bianca Leonte, Jacobs University Bremen, Germany
# Malte Dewies, KU Leuven, Belgium
# Dmitri Rozgonjuk, University of Tartu, Estonia
library(shiny)
library(lme4)
library(Matrix)
library(ggplot2)
library(simgen)
ui <- fluidPage(
includeScript("../../../Matomo-tquant.js"),
titlePanel("Linear mixed effects model - an introduction", "lme models"),
tabsetPanel(
tabPanel("Welcome",
fluidRow(
column(6, align = "center", br(),
h4("Why do we need linear mixed effect models?", br(),
"The conditional independence assumption"),
br(), align ="left",
"Two events A and B are conditionally independent given a third event Y precisely if the occurrence or non-occurrence of A and B are not dependent on events in their conditional probability distribution given Y. In other words, the performance ratings after drinking stimulants and depressants are conditionally independent if they not depend on the chosen stimuli, i.e. beer or wine and coffee or Red Bull).
Failing to meet conditional independence assumptions is one of the worst statistical errors you can make but it is also one of the most poorly understood aspects of statistical analysis.
Aggregating up to the highest sampling unit without making use of the variability in the data at lower levels permits inferences on the highest level only. On the contrary, estimating variability for sample units (participants and drinks) permits full generalisation and individual differences can be explored. This is where linear mixed-effect models come into play", br(), br(), br(),
h4("What's the purpose of this app?"), br(),
"This app was designed as a help to all those students
who have been struggling with understanding linear
mixed-effect models. You can discover how
fixed and random effects influence a model's predicted
data.", br(), br(),
"Our totally real abstract on the right describes the
definitely not simulated experiment we conducted for this
app.
In the Introduction tab, you can learn more about the
mixed-effect model we used. In the 'Try it' tab, you can
play around with the data and see how changes in variance
components and fixed effect parts can change the model's
predictions.", br(), br(),
"Tip: If you want to remove a random effect completely,
just set the corresponding slider to zero", br(), br(),
"We hope you enjoy our app! Have fun!", br(), br()
),
column(6, align = "center", br(),
h4("Our totally real paper ;-)"), br(),
img(src = "Our_Paper.JPG"))
)),
tabPanel("Introduction to our mixed-effect model",
fluidRow(
column(12, align = "center",
br(),
HTML("<div>Linear mixed-effects models derive from the usual
regression formula you can see here: </div>"),
withMathJax(),
'$$ \\Large Y_{si} = \\beta_0 + \\beta_1 * x_i + e_{si}$$',
HTML("<div> β<sub>0</sub> designates the intercept,
β<sub>1</sub> the slope and therefore the influence
of x<sub>i</sub> on Y<sub>si</sub> and e<sub>si</sub> is
the residual error.</div><br>")
)
),
fluidRow(
column(12, align = "center",
withMathJax(),
'$$ \\Large \\beta_0 = \\gamma_0 + S_{0s} + I_{0i}$$',
HTML("<div> In a mixed-effects model, the intercept actually
comprises a fixed-effect part γ<sub>0</sub>,
a random-effect intercept by subject S<sub>0s</sub>
and a random-effect intercept by item I<sub>0i</sub>
</div> <br>")
)
),
fluidRow(
column(12, align = "center",
withMathJax(),
'$$ \\Large \\beta_1 = \\gamma_1 + S_{1s} $$',
HTML("<div> The slope is comprised of a fixed-effect part
γ<sub>1</sub> and a random-effect slope by
subject S<sub>1s</sub>. It can also include a
random-effect slope by item I<sub>1i</sub>, although
we did not include this in our model </div> <br><br>
<div> Put together, this yields the complete formula
for our mixed-effects model: </div>")
)
),
fluidRow(
column(12, align = "center",
withMathJax(),
'$$ \\Large Y_{si} = \\gamma_0 + S_{0s} + I_{0i} + (\\gamma_1 + S_{1s}) * x_i + e_{si}$$'
)
)
),
tabPanel("Try it!",
sidebarLayout(
sidebarPanel(
actionButton("go.gen", "Generate new dataset"),
withMathJax(),
sliderInput("gm_int",
label = HTML("Grand mean for intercept γ<sub>0</sub>"),
min = 0, max = 3, value = 0, step=0.05),
sliderInput("gm_slope",
label = HTML("Grand mean for slope γ<sub>1</sub>"),
min = 0, max = 3, value = 1, step = 0.05),
sliderInput("bs_int",
label = HTML("By subject intercept S<sub>0s</sub>"),
min = 0, max = 3, value = 1, step=0.05),
sliderInput("bs_slope",
label = HTML("By subject random slope S<sub>1s</sub>"),
min = 0, max = 3, value = 1, step=0.05),
sliderInput("bi_int",
label = HTML("By item random intercept I<sub>0i</sub>"),
min = 0, max = 3, value = 1, step=0.05),
sliderInput("resid_err",
label = HTML("Residual error variance e<sub>si</sub>"),
min = 0, max = 6, value = 1, step=0.1),
actionButton("pop.up", "Danger! Don't click here!")
),
mainPanel(
fluidRow(
column(12, align = "center",
withMathJax(),
'$$ \\LARGE Y_{si} = \\gamma_0 + S_{0s} + I_{0i} + (\\gamma_1 + S_{0s}) * x_i + e_{si}$$'
)
),
fluidRow(
column(12,
tabsetPanel(
tabPanel("full data",
plotOutput("plot.all")
),
tabPanel("subject random effects",
plotOutput("plot.preds")
),
tabPanel("beverage random effects",
plotOutput("plot.preds2")
)
)
)
)
)
)
)
)
)
server <- function(input, output){
## (1) define the functions we need
# call this function after you have re-scaled variance components
# (i.e., subj_re, item_re, or err)
# This will recreate the individual trials with the new values
resimulate <- function(olddat, # result from mkDf(verbose = TRUE)$dat
vc, # list of variance components
params) { # the generative parameters
dat2 <- olddat[, c("SubjID", "ItemID", "Cond")]
dat2$ord <- seq_len(nrow(dat2)) # remember the original order
dat3 <- merge(dat2, vc$subj_re, "SubjID")
dat4 <- merge(dat3, vc$item_re, "ItemID")
dat5 <- dat4[order(dat4$ord), ] # re-sort into original order
rownames(dat5) <- NULL
dat5$Resp <- with(dat5,
params[1] + ri.subj + ri.item + # fe-intercept
(params[2] + rs.subj + rs.item) * Cond + # fe-slope
vc$err)
dat5[, c("SubjID", "ItemID", "Cond", "Resp")]
}
rescale_variance_components <- function(x, # the data, result from call to mkDf()
sri = 1, # scaling factor for by-subject random intercept
srs = 1, # by-subject random slope
iri = 1, # by-item random intercept
irs = 1, # by-item random slope
err = 1) { # residual error
x$subj_re$ri.subj <- x$subj_re$ri.subj * sri
x$subj_re$rs.subj <- x$subj_re$rs.subj * srs
x$item_re$ri.item <- x$item_re$ri.item * iri
x$item_re$ri.item <- x$item_re$ri.item * irs
x$err <- x$err * err
x[-1]
}
## calculate model predictions for each unit/condition combination
unit_predict <- function(x, params) {
## this is the design matrix of the study
design_mx <- cbind(rs.int = 1, rs.slope = c(-.5, .5))
rownames(design_mx) <- paste("condition", c("A", "B"), sep = ".")
## matrix crossproduct to calculate predictions for each condition
## for each unit
dat <- tcrossprod(as.matrix(x[, -1]), design_mx) +
tcrossprod(cbind(rep(params["int"], nrow(x)), rep(params["eff"], nrow(x))),
design_mx) ## add in the fixed effects
## recombine in long format
res <- data.frame(rep(x[, 1], 2),
cond = rep(c("A", "B"), each = nrow(dat)),
Y = c(dat))
names(res)[1] <- names(x)[1]
res
}
# save your input data
v <- reactive({
c(int = input$gm_int,
eff = input$gm_slope, # intercept + slope (fixed effects)
err = input$resid_err, # residual error
miss = 0, pMin = 0, pMax = 0, # no missing data pmin: minimum probability
# and maximum probability of a data point
# going missing
t00 = input$bs_int, # by-subj random intercept (unit)
t11 = input$bs_slope, # by-subj random slope (unit)
rsub = 0,
w00 = input$bi_int, # by-item random intercept (unit)
w11 = 0, ritm = 0) # w11 = by-item random slope (unit)
})
gen.newdata <- function(v){
data <- mkDf(nsubj = 8, nitem = 4, mcr.params = v,
wsbi = TRUE, verbose = TRUE)
}
values <- reactiveValues(data = gen.newdata(
setNames(c(0,0,1,0,0,0,1,1,0,1,0,0), c("int", "eff", "err", "miss", "pMin",
"pMax", "t00", "t11", "rsub", "w00",
"w11", "ritm")
)
)
)
observeEvent(input$go.gen, {
values$data <- gen.newdata(v())
})
new_v <- reactive({
c(input$gm_int, input$gm_slope)
})
dat.rest <- reactive({
rescale_variance_components(values$data,
sri = input$bs_int,
srs = input$bs_slope,
iri = input$bi_int,
irs = 1, err = input$resid_err)
})
# returns all but data frame
dat.fram <- reactive({
resimulate(values$data$dat, dat.rest(), new_v())
# returns data frame
})
plot.fun <- function(data){
data$SubjID <- factor(data$SubjID, levels = c(1:8),
labels = c("Sergio", "Juergen", "Florian",
"Andrea", "Cord", "Francis", "Kathi",
"Luca"))
ggplot(data, aes(x = as.numeric(cond), y = Y)) +
# fixed effects on the background
geom_point(aes(col = SubjID), size = 3, shape = 16) +
geom_line(aes(col = SubjID, group = SubjID), size = 1) +
labs(title = "\nPerformance rating per subject\n",
x = "\nCondition", y = "Performance\n") +
theme_classic() +
theme(text = element_text(size=20),
plot.title = element_text(hjust = 0.5),
axis.text = element_text(colour = "black")) +
scale_colour_brewer(type = "seq", palette = "Set1", name ="Subjects\n") +
scale_x_continuous(limits=c(NA,NA),breaks=c(1,2),
labels=c("stimulants", "depressants"), expand=c(0.1,0.1))
}
data.plot <- reactive({
unit_predict(dat.rest()$subj_re, v())
})
output$plot.preds <- renderPlot({
plot.fun(data.plot())
})
plot.fun2 <- function(data){
data$ItemID <- factor(data$ItemID, levels = c(1:4),
labels = c("coffee", "Red Bull", "beer", "wine"))
ggplot(data, aes(x = as.numeric(cond), y = Y)) +
# fixed effects on the background
geom_point(aes(col = ItemID), size = 3, shape = 16) +
geom_line(aes(col = ItemID, group = ItemID), size = 1) +
labs(title = "\nPerformance rating by beverage\n",
x = "\nCondition", y = "Performance\n") +
theme_classic() +
theme(text = element_text(size=20),
plot.title = element_text(hjust = 0.5),
axis.text = element_text(colour = "black")) +
scale_colour_brewer(type = "seq", palette = "Set1", name ="Beverage\n") +
scale_x_continuous(limits=c(NA,NA),breaks=c(1,2),
labels=c("stimulants", "depressants"),
expand=c(0.1,0.1))
}
data.plot2 <- reactive({
unit_predict(dat.rest()$item_re, v())
})
output$plot.preds2 <- renderPlot({
plot.fun2(data.plot2())
})
plot.all <- function(data){
data$Cond <- factor(data$Cond, labels = c("stimulants", "depressants"))
ggplot(data, aes(Cond, Resp)) +
scale_colour_brewer(type = "seq", palette = "Set1") +
geom_jitter(aes(color = Cond)) +
geom_violin(aes(fill = Cond, alpha = .2, col = NA)) +
# scale_y_continuous(limits = c(-4, 4), breaks = seq(-4, 4, 0.50)) +
labs(title = "\nAll data points per condition\n",
x = "\nCondition", y = "Performance\n") +
theme_classic() +
theme(text = element_text(size=20),
plot.title = element_text(hjust = 0.5),
axis.text = element_text(colour = "black"),
legend.position="none")
}
output$plot.all <- renderPlot({
plot.all(dat.fram())
})
# Easter egg
observeEvent(input$pop.up, {
showModal(modalDialog(
title = "Some Dale love from the developer team - Cuteness overload!",
img(src = "Dale_Love.gif"),
footer = modalButton("Dismiss"),
size = "m", easyClose = FALSE, fade = TRUE, align = "center"
)
)
})
}
shinyApp(ui = ui, server = server)