PowerRangers simulation appStatistical power Power represents the ability of a statistical test to detect an effect that is real. It depends on four main elements: - the type I error (i.e., the criterion which determinates how unlikely a positive result must be for the null hypothesis to be rejected) - the effect size (i.e., the magnitude of the effect of interest in the population) - the sample size used to detect the effect (i.e. which determinates the amount of sampling error inherent in the test results) - the variability (i.e., the standard deviation of the sample) Why power matters Estimating the statistical power (i.e., power analysis) could be very useful before collecting the data. The power analysis allows you to decide: - how large a sample is needed to enable statistical judgments that are accurate and reliable - how likely your statistical test will be to detect effects of a given size in a particular situation. How to estimate your power Power could be estimated analytically, by setting all parameters that affect it. It could be estimated by simulating data, performing statistical tests on them and calculating the proportion of p-values that fall below the chosen significance criterion. What is power?While we are always focusing on Type I error (i.e., false positive), we frequently ignore that by decreasing alpha the power becomes lower. The power is determined by subtracting the Type II error (i.e., false negative) from 1. It is the probability (i.e., from 0 to 1) to correctly reject H0 when it is false. In other worlds, the power is the ability of a test to detect an effect that is real.Power estimation in one sample t-testThe statistical power could be estimated by simulating data. It requires the following steps:1. Specify randomization procedure (e.g. normally distributed random data), the test statistic that is used, and the significance level (i.e., alpha) that will be used. 2. Generate random samples from the distributions specified by the null hypothesis. 3. Calculate each test statistic from the simulated data and determine if the null hypothesis is accepted or rejected. 4. Tabulate the number of rejections and use this to calculate the test's significance level. 5. Repeat steps 2 and 3 several thousand times, tabulating the number of times the simulated data leads to a rejection of the null hypothesis. The significance level is the proportion of simulated samples in step 2 that lead to rejection. Here, the power is the proportion of simulated samples that lead to rejection of H0. By setting the mean and the standard deviation of the simulated samples, and the mean under the null hypotesis and the significance criterion, you can estimate the power for a one sample t-test. Note that as the number of simulations is increased, the precision of the simulation will be increased also. Power estimation in two samples t-testHere you want to estimate the power of your test statistical test in detecting true differences among the means of two samples. You need to set the size, the mean and the standard deviation of both groups.Power estimation in one-way Analysis of VarianceFinally, let's complicate a bit the situation! Now the groups are three and you are performing a one-way ANOVA, by testing the effect of a factor with three levels (e.g., treatment, placebo, nothing). |
library(shiny)
# setwd("C:/Users/Tquant/Desktop/PowerRangers")
ui <- fluidPage(
includeScript("../../../Matomo-tquant.js"),
titlePanel("PowerRangers simulation app"),
tabsetPanel(
##################################################################### INTRODUCTION
tabPanel("Introduction",
sidebarLayout(
sidebarPanel(
img(src='powerRangers.png', align = "center"),
HTML("<b><br>Let the power be in you!</b>
<br>
This shiny app aims to help students to understand what is statistical power and to estimate it though simulated data
The app allows the user to perform simulations with common used tests as one and two sample t-test and one way ANOVA.
")
),
mainPanel(
HTML("<br>
<b>Statistical power</b>
<br>Power represents the ability of a statistical test to detect an effect that is real.
It depends on four main elements:
<br><b>- the type I error</b> (i.e., the criterion which determinates how unlikely a positive result must be
for the null hypothesis to be rejected)
<br><b>- the effect size</b> (i.e., the magnitude of the effect of interest in the population)
<br><b>- the sample size used to detect the effect</b> (i.e. which determinates the amount of sampling error inherent in the test results)
<br><b>- the variability</b> (i.e., the standard deviation of the sample)
<br><br>
<b>Why power matters</b>
<br>Estimating the statistical power (i.e., power analysis) could be very useful <i>before</i> collecting the data.
<br>The power analysis allows you to decide:
<br>- how large a sample is needed to enable statistical judgments that are accurate and reliable
<br>- how likely your statistical test will be to detect effects of a given size in a particular situation.
<br><br>
<b>How to estimate your power</b>
<br>Power could be estimated analytically, by setting all parameters that affect it.
It could be estimated by simulating data, performing statistical tests on them and calculating the proportion
of p-values that fall below the chosen significance criterion.")
)
)
),
##################################################################### WHAT IS POWER
tabPanel("What is power?",
sidebarLayout(
sidebarPanel(
HTML("<br>
Change the value of Alpha and see how the Power changes!
<br><br>"),
sliderInput("Alpha", "Alpha", .001,.999,0.05),
verbatimTextOutput("indices"),
HTML("<br>
Let the red cross be your sample mean. Whithin the NHST approach, you want to know if its mean is different from a given value.
Thus, you perform a statistical test in order to make a decision on rejecting or not the null hypotesis (i.e., no difference)
<br><br>")
),
mainPanel(
h3("What is power?"),
HTML("While we are always focusing on <b>Type I error</b> (i.e., false positive), we frequently ignore
that by decreasing alpha the power becomes lower. The power is determined by subtracting the
<b>Type II error</b> (i.e., false negative) from 1.
It is the probability (i.e., from 0 to 1) to correctly reject H0 when it is false.
In other worlds, the power is the ability of a test to detect an effect that is real.
<br>"),
plotOutput("powerplot")
)
)
),
###################################################################### ONE SAMPLE T-TEST SIMULATION
tabPanel("One sample t-test",
sidebarLayout(
sidebarPanel(
h4("Sample size"),numericInput("n","", 20, min = 5, max = 200),
h4("Hypothesized mean"),sliderInput("mu1", "", 400, 420, c(405)),
h4("True mean"),sliderInput("mu2", "", 400, 420, c(410)),
h4("Standard deviation"),sliderInput("sd", "", 1, 20, 8),
h4("Alpha"),radioButtons("alpha", "", c(".01", ".05", ".10"),
selected = ".05", inline = TRUE),
h4("Number of replicates"),radioButtons("nrep", "", c("500", "1000", "5000"),
selected = "1000", inline = TRUE),
actionButton("simulate", "Simulate power")
),
mainPanel(
h3("Power estimation in one sample t-test"),
HTML("
The statistical power could be estimated by simulating data. It requires the following steps:
<br>1. Specify randomization procedure (e.g. normally distributed random data), the test statistic that is used,
and the significance level (i.e., alpha) that will be used.
<br>2. Generate random samples from the distributions specified by the null hypothesis.
<br>3. Calculate each test statistic from the simulated data and determine if the null hypothesis is accepted or rejected.
<br>4. Tabulate the number of rejections and use this to calculate the test's significance level.
<br>5. Repeat steps 2 and 3 several thousand times, tabulating the number of times the simulated data leads to a rejection
of the null hypothesis. The significance level is the proportion of simulated samples in step 2 that lead to rejection.
<br><br> Here, the power is <b>the proportion of simulated samples that lead to rejection of H0</b>.
<br> By setting the mean and the standard deviation of the simulated samples, and the mean under the null hypotesis and the
significance criterion, you can estimate the power for a one sample t-test.
<br> Note that as the number of simulations is increased, the precision of the simulation will be increased also."),
plotOutput("histplot"),
verbatimTextOutput("groupdata"),
tags$style(type='text/css',
'#groupdata {background-color: white; color: black;
font-size: 14px;
font-style: regular;}')
)
), value = "One sample t-test"
),
######################################################################## TWO SAMPLE T-TEST SIMULATION
tabPanel("Two samples t-test",
sidebarLayout(
sidebarPanel(
h4("Sample Size"),
numericInput("tn1", "Group 1", 20, min = 5, max = 200),
numericInput("tn2", "Group 2", 20, min = 5, max = 200),
h4("Means"),
sliderInput("tmu1", "Group 1", 400, 420, c(405)),
sliderInput("tmu2", "Group 2", 400, 420, c(405)),
h4("Standard Deviations"),
sliderInput("tsd1", "Group 1", 1, 20, 8),
sliderInput("tsd2", "Group 2", 1, 20, 8),
h4("Alpha"),
(radioButtons("talpha", "",
c(".01", ".05", ".10"),
selected = ".05", inline = TRUE)),
h4("Number of replicates"),
radioButtons("nrep","",
c("500", "1000", "5000"),
selected = "1000", inline = TRUE),
actionButton("tsimulate", "Simulate power")
),
mainPanel(
h3("Power estimation in two samples t-test"),
HTML("
Here you want to estimate the power of your test statistical test in detecting true differences among the means of two samples.
You need to set the size, the mean and the standard deviation of both groups.
<br><br>"),
plotOutput("thistplot"),
verbatimTextOutput("group1data"), # GROUP 1 DATA
tags$style(type='text/css',
'#group1data {background-color: white; color: black;
font-size: 14px;
font-style: regular;}'),
verbatimTextOutput("group2data"), # GROUP 2 DATA
tags$style(type='text/css',
'#group2data {background-color: white; color: black;
font-size: 14px;
font-style: regular;}')
)
)
),
######################################################################################### ONE WAY ANOVA SIMULATION
tabPanel("One-way ANOVA",
sidebarLayout(
sidebarPanel(
h4("Sample Size"),
numericInput("an1", "Group 1", 20, min = 5, max = 200),
numericInput("an2", "Group 2", 20, min = 5, max = 200),
numericInput("an3", "Group 3", 20, min = 5, max = 200),
h4("Means"),
sliderInput("amu1", "Group 1", 400, 420, c(405)),
sliderInput("amu2", "Group 2", 400, 420, c(405)),
sliderInput("amu3", "Group 3", 400, 420, c(405)),
h4("Standard Deviation"),
sliderInput("s1", "Group 1", 1, 20, 8),
sliderInput("s2", "Group 2", 1, 20, 8),
sliderInput("s3", "Group 3", 1, 20, 8),
h4("Alpha"),
radioButtons("aalpha", "",
c(".01", ".05", ".10"),
selected = ".05", inline = TRUE),
h4("Number of replicates"),
radioButtons("anrep","",
c("500", "1000", "5000"),
selected = "1000", inline = TRUE),
actionButton("asimulate", "Simulate power")
),
mainPanel(
h3("Power estimation in one-way Analysis of Variance"),
HTML("
Finally, let's complicate a bit the situation! Now the groups are three and you are performing a one-way ANOVA,
by testing the effect of a factor with three levels (e.g., treatment, placebo, nothing).
<br><br>"),
plotOutput("ahistplot"),
verbatimTextOutput("agroup1data"), # GROUP 1 DATA
tags$style(type='text/css',
'#agroup1data {background-color: white; color: black;
font-size: 14px;
font-style: regular;}'),
verbatimTextOutput("agroup2data"), # GROUP 2 DATA
tags$style(type='text/css',
'#agroup2data {background-color: white; color: black;
font-size: 14px;
font-style: regular;}'),
verbatimTextOutput("agroup3data"), # GROUP 3 DATA
tags$style(type='text/css',
'#agroup3data {background-color: white; color: black;
font-size: 14px;
font-style: regular;}')
)
)
)
) ################################ end of TabSetPanel
) ######################################### end of UI
server <- function(input, output){
################################################################################# INTRODUCTION
output$powerRangers_image <- renderImage({
outfile <- tempfile(fileext='powerRangers.png')
png(outfile, width=200, height=200)
dev.off()
list(src = outfile,
alt = "This is alternate text")
}, deleteFile = TRUE)
################################################################################### VISUALIZATION
crit <- reactive({
qnorm(1-as.numeric(input$Alpha)) # setting the critical value
})
output$powerplot <- renderPlot({
par(mai=c(1,0,.5,0),mgp=c(0.5,0,0))
# distribution under H0
curve(dnorm(x,0,1),xlim=c(-3,6),
bty="n",xaxt="n",yaxt="n",lwd=2,
ylab='',xlab='')
# distribution under H1
curve(dnorm(x,3,1),lwd=2,add=TRUE)
# type II error area
polygon(c(0,seq(0,6,0.01),6), #
c(0,dnorm(mean=3,seq(0,6,0.01)),0),
col="#004987")
# power area
polygon(c(crit(),seq(crit(),10,0.01),10),
c(0,dnorm(mean=3,seq(crit(),10,0.01)),0),
col="#0BAD4D")
# type I error area
polygon(c(crit(),seq(crit(),4,0.01),4),
c(0,dnorm(seq(crit(),4,0.01)),0),
col=rgb(1,0,0,alpha=0.3))
# rejecting line and sample point
abline(v=crit(),col="red",lwd=3)
text(x=crit()+1,y=0.1,labels="Rejecting H0",cex=1)
text(x=crit()-1.3,y=0.1,labels="Not rejecting H0",cex=1)
points(2,0,col="darkred",pch=4,cex=3,lwd=3)
# legend
legend(-3, 0.4,
legend=c(expression(paste("Type I error ( ",alpha," )")),
expression(paste("Type II error ( ",beta," )")),
expression(paste("Power ( 1-",beta," )"))),
fill=c("#FFB2B2","#004987","#0BAD4D"),
cex=1.2)
})
output$indices <- renderText(paste("Alpha =",input$Alpha,
"\nBeta =",round(pnorm(crit(),mean=3),2),
"\nPower =",round(1-pnorm(crit(),mean=3),2),
"\nConclusion:\n",
if(crit()<2){"Rejecting H0"}
else{"Not rejecting H0"}))
#################################################################################### ONE SAMPLE T-TEST
# calculates the pval only by pressing "simulate"
pval <- eventReactive(input$simulate, {
# determinates the number of repilcates
replicate(as.numeric(input$nrep),
# execute a t test with N=n,mean=mu1 and SD=sd for each replicate
t.test(rnorm(input$n, input$mu2, input$sd),
# returns the p-value
mu = input$mu1)$p.value)
})
power <- reactive({
mean(pval() < as.numeric(input$alpha))
})
output$histplot <- renderPlot({
par(mai = c(.6, .6, .3, .1), mgp = c(2, .7, 0))
hist(pval(), breaks=seq(min(pval()), max(pval()), length.out=10), col = "lightblue", border = "white", xlim = 0:1,
xlab = "P-value", main = paste("Power =", power()))
abline(v = input$alpha, col = "darkblue")
box(bty = "L")
})
groupdata <- eventReactive(input$simulate,
{
paste(
"Sample size:", input$n,
"\nHypothesized mean:", input$mu1,
"\nTrue mean:", input$mu2,
"\nStandard deviation:", input$sd)
}
)
output$groupdata <- renderText({groupdata()})
########################################################################### Two sample t-test
## Sample data ##
tmu1 <- reactive(input$tmu1)
tsd1 <- reactive(input$tsd1)
tn1 <- reactive(input$tn1)
tmu2 <- reactive(input$tmu2)
tsd2 <- reactive(input$tsd2)
tn2 <- reactive(input$tn2)
## p.value and power calculation ##
tpval <- eventReactive(input$tsimulate, {
replicate(as.numeric(input$nrep),
t.test(rnorm(tn1(),tmu1(),tsd1()),
rnorm(tn2(), tmu2(), tsd2()))$p.value)
})
tpower <- reactive({
mean(tpval() < as.numeric(input$talpha))
})
output$thistplot <- renderPlot({
par(mai = c(.6, .6, .3, .1), mgp = c(2, .7, 0))
hist(tpval(), col = "lightblue", border = "white", xlim = 0:1,
xlab = "P-value", main = paste("Power =", tpower()))
abline(v = input$talpha, col = "darkblue")
box(bty = "L")
})
group1 <- eventReactive(input$tsimulate, # Group 1 data input
{
paste(
"Group 1",
"\nSample size:", input$tn1,
"\nMean:", input$tmu1,
"\nStandard deviation:", input$tsd1)
}
)
output$group1data <- renderText({group1()})
group2 <- eventReactive(input$tsimulate, # Group 2 data input
{
paste(
"Group 2",
"\nSample size:", input$tn2,
"\nMean:", input$tmu2,
"\nStandard deviation:", input$tsd2)
}
)
output$group2data <- renderText({group2()})
apval <- eventReactive(input$asimulate, { ##################################### ONE WAY ANOVA
an1 <- input$an1; an2 <- input$an2; an3 <- input$an3
amu1 <- input$amu1; amu2 <- input$amu2; amu3 <- input$amu3
s1 <- input$s1; s2 <- input$s2; s3 <- input$s3
replicate(60, {
mdata <- data.frame(y = c(rnorm(an1, amu1, s1),
rnorm(an2, amu2, s2),
rnorm(an3, amu3, s3)),
group = rep(c("one","two","three"),
times=c(an1, an2, an3)))
summary(aov(y ~ group, data = mdata))[[1]]$"Pr(>F)"[1]
}
)
})
apower <- reactive({
mean(apval() < as.numeric(input$aalpha))
})
output$ahistplot <- renderPlot({
par(mai = c(.6, .6, .3, .1), mgp = c(2, .7, 0))
hist(apval(), col = "lightblue", border = "white", xlim = 0:1,
xlab = "P-value", main = paste("Power =", apower()))
abline(v = input$aalpha, col = "darkblue")
box(bty = "L")
})
agroup1 <- eventReactive(input$asimulate, # Group 1 data input
{
paste(
"Group 1",
"\nSample size:", input$an1,
"\nMean:", input$amu1,
"\nStandard deviation:", input$s1)
}
)
output$agroup1data <- renderText({agroup1()})
agroup2 <- eventReactive(input$asimulate, # group 2 data input
{
paste(
"Group 2",
"\nSample size:", input$an2,
"\nMean:", input$amu2,
"\nStandard deviation:", input$s2)
}
)
output$agroup2data <- renderText({agroup2()})
agroup3 <- eventReactive(input$asimulate, # Group 2 data input ###
{
paste(
"Group 2",
"\nSample size:", input$an3,
"\nMean:", input$amu3,
"\nStandard deviation:", input$s3)
}
)
output$agroup3data <- renderText({agroup3()})
}
shinyApp(ui, server)