WHAT IS THIS ALL ABOUT?

With this app you can explore basic statistics concepts, compare different models and see how they all boil down to the same recipe:

Data = model + error

The curious human being that's inside yourself asks you questions all the times about phenomena, and you really would like to satisfy your unstoppable thirst for knowledge, somehow. We have a tool that might help you out.

YOUR FIRST RESEARCH QUESTION

Imagine soon a marathon will take place close where you live, and all your friends are taking part in it. You would like to join them, but you don't want to suck. So you decide to look around for what is important in order to succeed in a loooong run. You have some ideas, maybe your BMI is important? Maybe your height? Or is it more about how much you have been training?

You are lucky, the organizers of last year marathon collected information, and you can access this dataset! But what can you do with all these numbers?

THE DATASET

So, let's first have a look at the information about the last year's marathon. Here is a challenge for you! Based on the dataset, try to answer these questions:

1. How many people took part in the marathon last year?

2. How much time did it take for participant number seven to complete the run?

3. How many hours a week did she train?

What have you learned so far?
The basic recipe for any model is:

Data = model + error

By now you should be familiar with the first part of the recipe (DATA):
►You gained some insights into your research question and how to go to answer it.
►You have seen the available data, and briefly explored it. But this is just the first step!

What are you going to learn next?
► You are going to get to know the second part of the recipe: what is a MODEL? What is ERROR?
► What is the null model?
► What are the parameters of the null model?

Research question:
You aim at understanding what you should do to prepare for the marathon. A couple of your friends participated last year, and you decide to get some inspiration by looking at what they did.

YOUR FIRST GUESS

You are curious about how your friends Tom and Jerry performed last year, but you can't find their name in the dataset in which only participants'numbers are displayed. You don't have many information about them apart from the fact that they were there. What is the best guess you can make about their time?
You can now use your recipe to answer this question. The 'i' in brackets indicates the individual partcipant.

Data(i) = model + error(i)

The best guess would probably be the most frequent time people spent to complete the marathon. In statistical language this is called 'the mode', and in special cases (e.g. when data are normally distributed), the mode equals the mean value. Let's assume that this is the case, you calculate the mean time and you find that it was about 4 hours and 15 minutes, so you go for that.

Time (Jerry) = mean time + error(Jerry)

Time (Tom) = mean time + error(Tom)

When you meet your friends, you decide to actually ask them for the real time. Jerry seems to be quite offended by your guess, he had performed way better! He made it in a little more than 2 hours and a half!! However, Tom was happily surprised by your guess, it took 4 hours and 9 minutes for him! Maybe you can improve your guess (model) by considering more information and adding it into your model.

WHAT IS AN ERROR?
Your prediction was better for Tom than for Jerry, so we can say that you made a greater error at predicting Jerry's time than Tom's. You can look at the next section to get a better grasp of what an error is. Look at the graph below. On the y-axis is represented last year participant's time. On the x-axis you find the participant number (id). Your model (the mean) is represented as a straight line, and it is the time that you would predict for each participant to finish the marathon. For a lot of participants, you made an error using this model. The red line represents the error you commited in your prediction for Jerry's time, whereas the green line is the error associated with the time you predicted for Tom.

As you can see the model is not that good. Maybe you can improve your guess (model) by considering more information to add into your model.

What have you learned so far?
By now you should be familiar with the null model and the error in the model.

The basic recipe for any model is:

Data(i) = model + error(i)

What are you going to learn next?
► What is the proposed model?
► What are the parameters of the proposed model?
► How good is the model?

Research question:
Don't forget our research question! You want to understand what is important in order to succeed in a marathon.

As you saw in the previous section, your model was not very good at predicting your friends'performance. How can you improve it?
In the dataset of last year's marathon, you found other information which might be useful, such as the hours a week each participant spent on training, BMI and their height.

You decide to start by looking if the hours spent on training might be relevant for the performance in the marathon. You can now build a new model with this information and see if your predictions improve:

Marathon Finish Time = Intercept + effect of training * training + Error

Yi = b0 + b1*x1 + Error

Yi= outcome variable,
B0 = intercept or b0 coefficient
B1 = slope or b1 coefficient
X1 = predictor variable

Intercept You assume that the relationship between the two variables (time and training) can be described with a line. A person who did not train at all, has a value of 0 hours on the x-axis, and according to this model she will have to run for 7265 minutes to finish the marathon. This value is known as the intercept (b0) in the model and is represented by the point in which the line crosses the y-axis.
Effect of training (b1) You would like to see if training is important for succeeding in the marathon. The answer to this question lies in the b1 coefficient. This coefficient (slope of the line) represents the strength and the direction of the relationship between the two variables, in particular it represents how much the predictor has to change in order to see a change of one unit in the outcome.

In this page you can visually compare the length of the lines. The model seems to better predict both the times of your friends (the error is reduced). But most of all, it seems that the predictor 'Hours of training a week' is significantly related to the outcome variable (performance at the marathon). To know if that's the case you will have to compute a significance test, which you will find in the inference section.
At this point you are interested to know if your predictions improved. What about comparing the null model with this new one? Go to the the 'Model comparison' section and have a look.

Ok, so seems like hours you've spent training is a pretty good predictor for your marathon finish time!
Those who spend more time training are faster and have a smaller finish time for the marathon.
Jerry was more dedicated to the training than Tom, so he finished earlier.

But what about the other variables?
Do height and body mass index tell you something about the time people need to finish the marathon?
You still have more data to look at!

Click on the tabs 'Height' and 'BMI' to look at the other variables.

When you use the participants height to estimate their marathon finishing time, the model looks like this:

Marathon Finish Time = Intercept + effect of height * height + Error

which translates to:

$$Y_{i} = \beta_{0.2} + \beta_{1.2} * x_{2i} + e_{i}$$

Oh no! This doesn't look good! Your estimate for Jerry is worse again! And the errors seem to also be bigger.
Let's look if BMI is a better estimator for marathon finishing time!

Click on the next tab on top called 'BMI'

When you use the participants BMI to estimate their marathon finishing time, the model looks like this:

Marathon Finish Time = Intercept + effect of BMI * BMI + Error

which translates to:

$$Y_{i} = \beta_{0.3} + \beta_{1.3} * x_{3i} + e_{i}$$

When you use the BMI for predicting the marathon performance, your estimation for Jerry is better than in the Null Model! However, the estimation for Tom got a little worse...

You should probably stick to using the hours put into training for your model!

What have you learned so far?
By now you should be familiar with the null model and various models with one variable as a predictor.

What are you going to learn next?
► Compare the null model with the proposed model
► Understand how to evaluate the model

Research question:
Don't forget our research question! You want to understand what is important in order to succeed in a marathon.

MODEL COMPARISON

In the graph you can now see the two models:
►red line: proposed model,
►blue line: null model.
Which one of the two models allows you to make better predictions?

You can answer this question more precisely if you are able to quantify which model has the least error. You can take each participant's time and subtract the predicted time based on your model. You do this for each participant, and sum up the results. What do you obtain?

The result is zero! This does not help, in fact to avoid cancelling out opposite values and in order to give each one a weight, we have to square each difference before summing them up. At this point, we have the so called 'sum of squared errors (SSE)', which we need to divide for the number of participants to obtain the 'mean sum of squared errors' (MSSE).
To compare how good each model is, we need to calculate the so called 'R-squared' which is the ratio between the explained variation that the model explains (SSM) and the total variation (sum of SSE and SSM).

What you have learned so far:

► The research question: How can I predict performance in a marathon?

► The basic equation for model analysis: Data = Model + Error;

► Having a look at the raw data.

► The proposed model, in which we try to use hours of training to predict marathon performance.

► The parameters of the proposed model: The intercept (b0), which represents the performance of a runner with zero hours of training, and the slope (b1), which represents the gain in performance for each increment of 1 hour of training.

► The error of the proposed model: We calculate the Sum of Squared Errors in order to have a measure the amount of error we make in our predictions using the proposed model.

► The null model: Calculating the simplest model possible, which is just using mean performance to guess performance.

► The paramter of the null model: The intercept (b0), which represents mean performance.

► The error of the null model: We again calculate the Sum of Squared Errors in order to have a measure of the amount of error we make in our predictions using the null model.

► The proportion of the reduction of the error (PRE): We then see the proportion of error of the null model that we're able to reduce by using the proposed model instead of the null model.

What you will learn here:

► Test statistics;

► p-value;

► Hypothesis testing;

► Data reporting in APA style.

Test statistic = Estimated effect / Error of the estimation

This is the basic equation underlying test statistics. Test statistics basically represent an effect in the unit if a specific theoretical distribution (here, T-Student and F distributions). When we compute a test statistics we can evaluate how likely that value (or a more extreme value) is to occur. And with this we can say if an effect is happening by chance or not.

Here's how R would print the results of our regression.

How would you report your results (APA style)? The important effect here is given by the b1 parameter. So the report of the results will be on this parameter. Two different ways to do this:

What about R2? Shouldn't we be reporting R2 as well?

REGRESSION vs ONE-WAY ANOVA

Now we wil explore two cases of General Linear Models: a simple regression and a one-way ANOVA

A very simple explanation is the following:
►Regression is the statistical model that you use to predict a continuous outcome on the basis of one or more continuous predictor variables
► ANOVA is the statistical model that you use to predict a continuous outcome on the basis of one or more categorical predictor variables.

Similarities

1. First, both models are applicable only when you have a continuous outcome variable. A categorical outcome variable would rule out the use of either a regression model or an ANOVA model.

2. Second, you can use the regression algorithm, which is based on the principle of least squares, to fit an ANOVA model. You don't have to use the least squares principle because there are other ways to produce the ANOVA model. But because least squares, the basis for regression models, also works for ANOVA models

3. Third, the concept of partitioning variation into sums of squares (SS) in an ANOVA model also provides a nice way to examine complex regression models.

In an ANOVA:
the categorical variable is effect coded, which means that each categosy's mean is compared to the grand mean

In a Regression:
the categorical variable is dummy coded, which means that each category's intercept is compared to the reference group's intercept. Since the intercept is defined as the mean value when all the other predictors are equal to zero, and there are no other predictors, the three intercepts are just means.

Look at the outputs below

The intercept in the regression model is simply the mean of the reference group (i.e. the American; see the ANOVA table). The coefficients for the other two groups are the differences in the mean between the reference group and the other groups (see the computations in the Regression table)

The ANOVA reports each mean and a p-value that shows whether at least two of these means are significantly different. Similarly, a regression reports only one mean (as an intercept) and the difference between that one mean and all other means, with p-values evaluating these specific comparisons.

Both the F- and p-values for the ANOVA and Regression model are identical (34.54 and <0.001 respecitvely)

$$R^2=\frac{SSm}{SSt}=\frac{39,279}{54,632}=0.719$$

Differences

ANOVA is a tool to check how much the residual variance is reduced by predictors in (nested regression) models, whereas the regression analysis aims to quantify effect sizes in terms of 'how much is the response expected to change when the predictor(s) change by a given amount?'
►For categorical predictors this reduces to the question to 'what is the expected difference in the response between different groups/categories?'
► For continuous predictors this is the questions for a slope

To clarify: ANOVA can be applied to any regression model (no matter if the model contains only continuous, only categorical, or both kinds of predictors). ANOVA allows to assess the impact of a predictor or a whole set of predictors on the residuals: who much of the variance in the data can be explained by these predictors? The regression analysis, on the other hand, is a complementary tool to asses the quantitative relation between a predictor and the response.

ANOVA Table

Regression Table

ANOVA Output from R

Regression Output from R

Coming Soon

#####PACKAGES########## library(shiny) library(ggplot2) library(calibrate) library(MASS) library(shinyBS) library(broom) library(DT) library(shinydashboard) #######SERVER############### server = function(input, output, session) { ######## GENERATE DATA ######## ## Set a (lot of) seed(s) to obtain the same data everytime set.seed(1234) y <- round(rnorm(30, 270, 48)) # marathon performance id <- 1:30 ## Set while loops to find out which seed to use in order to obtain the ## necessary data for our pre-defined correlation values # i <- 0 # number used for the seed # rho <- 0 # correlation # while(rho != -0.75){ # # i <- i+1 # set.seed(i) # rho <- round(cor(y, rnorm(30, 45, 5)),2) # # } # i # shows which seed achieved a correlation of -0.75 set.seed(220851) x1 <- round(rnorm(30, 45, 5),2) # hours of training # i <- 0 # number used for the seed # rho <- 0 # correlation # while( rho != 0.07){ # # i <- i+1 # set.seed(i) # rho <- round(cor(y, rnorm(30, 180, 10)), 2) # # } # i set.seed(5) x2 <- round(rnorm(30, 180, 10), 2) # body height #i <- 0 # number used for the seed #rho <- 0 # correlation #while( rho != .42){ # # i <- i+1 # set.seed(i) # rho <- round(cor(y, rnorm(30, 22, 2)),2) # #} #i set.seed(1589) x3 <- round(rnorm(30, 22, 2), 2) # bmi # Now, bind them in one dataframe df <- data.frame(cbind(id, y, x1, x2, x3)) # create a fourth variable to use in the ANOVA vs regression ethnicity <- rep(c("European", "American", "Asian"), each=10) # first create the vector df <- df [order (df$y),] df$x4 <- ethnicity # sort it by response and add the ethnicity vector so we can guarantee # an effect df <- df [order (df$id),] # for aesthetic purposes sort the dataframe back by id # Now we define some labels to our variables idlabel <- "Participant" ylabel <- "Marathon Finish Time [minutes]" x1label <- "Hours of Training" x2label <- "Height" x3label <- "Body Mass Index" x4label <- "Ethnicity" ######## GET USER CHOICES ######## rv <- reactiveValues(getmodel = -1) observeEvent(input$data, {rv$getmodel <- 0 }) observeEvent(input$null, {rv$getmodel <- 1 }) observeEvent(input$full, {rv$getmodel <- 2 }) observeEvent(input$testing, {rv$getmodel <- 3 }) getplot <- reactive(switch(input$plot, scatter = "0", hist = "1")) getplotS <-reactive(switch(input$plots, tstat = "0", fstat = "1")) ######## PREPARING ALL VARIABLES THAT WILL BE CALLED IN THE SERVER ######## #### NULL AND X MODELS #### ## x model xmod <- lm(df$y ~ df$x1) xmod.2 <- lm(df$y ~ df$x2) xmod.3 <- lm(df$y ~ df$x3) ##get the summary outputs into a data.frame xmodel <- tidy(xmod) xmodel.2 <- tidy(xmod.2) xmodel.3 <- tidy(xmod.3) colnames(xmodel) <- (c("Term", "Estimate", "Std. Error", "t value", "p-value")) rownames(xmodel) <- (c("Intercept (b0)", "Slope (b1)")) colnames(xmodel.2) <- (c("Term", "Estimate", "Std. Error", "t value", "p-value")) rownames(xmodel.2) <- (c("Intercept (b0)", "Slope (b1)")) colnames(xmodel.3) <- (c("Term", "Estimate", "Std. Error", "t value", "p-value")) rownames(xmodel.3) <- (c("Intercept (b0)", "Slope (b1)")) # null model - mean of y (our dv) ## error or residuals for the null model Ymodelerror <- round(df$y - mean(df$y), 2) #### PARAMETERS ESTIMATION #### #b1, which is also our slope b1 <- cor(df$x1, df$y) * (sd(df$y) / sd(df$x1)) b1.2 <- cor(df$x2, df$y) * (sd(df$y) / sd(df$x2)) b1.3 <- cor(df$x3, df$y) * (sd(df$y) / sd(df$x3)) #or we can just extract it from the data.frame that we created which contains the lm model output ## b1.1 <- xmodel.1[2,2] #and b0, our intercept b0 <- mean(df$y) - b1 * mean(df$x1) b0.2 <- mean(df$y) - b1.2 * mean(df$x2) b0.3 <- mean(df$y) - b1.3 * mean(df$x3) #similarly here we can just extract it b0.1 <- xmodel.1[1,2] #### PROPOSED MODEL ERROR #### ## so, now we can calculate both the prediction and, hence, the proposed model error: prediction <- round(b0 + df$x1*b1, 2) fullmodelerror <- round(residuals(xmod), 2) prediction.2 <- round(b0.2 + df$x2*b1.2, 2) fullmodelerror.2 <- round(residuals(xmod.2), 2) prediction.3 <- round(b0.3 + df$x3*b1.3, 2) fullmodelerror.3 <- round(residuals(xmod.3), 2) ## sum of squared errors for all models, null, x, and proposed: SSEnull <- sum(Ymodelerror^2) #ssex <- sum(Xmodelerror^2) SSEfull <- sum(fullmodelerror^2) SSEfull.2 <- sum(fullmodelerror.2^2) SSEfull.3 <- sum(fullmodelerror.3^2) ## sum of squared residuals: SSR <- sum(prediction - mean(df$y))^2 SSR.2 <- sum(prediction.2 - mean(df$y))^2 SSR.3 <- sum(prediction.3 - mean(df$y))^2 SS <- c(SSEnull, SSEfull, SSEnull - SSEfull) SS.2 <- c(SSEnull, SSEfull.2, SSEnull - SSEfull.2) SS.3 <- c(SSEnull, SSEfull.3, SSEnull - SSEfull.3) #### DATA ANALYSIS #### #here, we are calculating both the t tests for our parameters and the f statistics for the proposed model #these will be needed for the statistical inference tab ##B0 t-test statistics #we need the standard error of the full model se <- sigma(xmod) se.2 <- sigma(xmod.2) se.3 <- sigma(xmod.3) #the standard error of b0 seb0 <- xmodel[1, 3] seb0.2 <- xmodel.2[1, 3] seb0.3 <- xmodel.3[1, 3] ### to avoid any mistakes of using manual calculations, all of the values are obtained from the lm output #the t-value! b0_t_statistics <- xmodel[1, 4] b0_t_statistics.2 <- xmodel.2[1, 4] b0_t_statistics.3 <- xmodel.3[1, 4] #and p value! P_v0 <- xmodel[1, 5] P_v0.2 <- xmodel.2[1, 5] P_v0.3 <- xmodel.3[1, 5] ##B1 t-test statistics #we need the standard error of the x model seb1 <- xmodel[2, 3] seb1.2 <- xmodel.2[2, 3] seb1.3 <- xmodel.3[2, 3] #so we can calculate the t-value for b1: b1_t_statistics <- xmodel[2, 4] b1_t_statistics.2 <- xmodel.2[2, 4] b1_t_statistics.3 <- xmodel.3[2, 4] #and, now, its p-value: P_v1 <- xmodel[2, 5] P_v1.2 <- xmodel.2[2, 5] P_v1.3 <- xmodel.3[2, 5] ## F STATISTICS FOR THE PROPOSED MODEL # F STATISTICS F_sta <- anova(xmod)[[4]][1] F_sta.2 <- anova(xmod.2)[[4]][1] F_sta.3 <- anova(xmod.3)[[4]][1] # P VALUE P_v <- anova(xmod)[[5]][1] P_v.2 <- anova(xmod.2)[[5]][1] P_v.3 <- anova(xmod.3)[[5]][1] # DFs dfF1 <- anova(xmod)[[1]][1] dfF1 <- anova(xmod)[[1]][1] dfF1 <- anova(xmod)[[1]][1] dfF2 <- anova(xmod)[[1]][2] dfF2.2 <- anova(xmod.2)[[1]][2] dfF2.3 <- anova(xmod.3)[[1]][2] dft <- length(df$x1) - 1 dft.2 <- length(df$x2) - 1 dft.3 <- length(df$x3) - 1 ## covariance Covxy <- cov(df$x1, df$y) Covxy.2 <- cov(df$x2, df$y) Covxy.3 <- cov(df$x3, df$y) ### creating a data.frame that contains the mean of the data to plot error for the null my <- mean(df$y) meany <- rep(my, 30) #################################### ######### TEXT FOR BUTTONS ######### #################################### output$dataText <- renderUI({ if (rv$getmodel != 0) {HTML("<center>Press this button to see the raw data.</center>")} else {HTML(paste("In this view, we don't have a model. We're simply interested in looking at the raw data to get an intution of whether there is any interesting pattern in the data.", " Ask yourself: Does ", x1label, " seem to have a relationship with ", ylabel, "?"), sep = "")} }) output$nullText <- renderUI({ if (rv$getmodel != 1) {HTML("<center>Press this button to see the most basic model: the mean.</center>")} else {HTML('In this view, our model is Yi = b0. In other words, our null model is simply the mean of our dependent variable (', ylabel, '). So ask yourself: Does the mean allow me to make accurate predictions of each individual\'s ', ylabel, "or am I making a lot of errors?")} }) output$fullText <- renderUI({ if (rv$getmodel != 2) {HTML(paste("<center>Press this button to see the full model, where we use ", x1label, " to help us predict the value of ", ylabel, ".</center>"), sep = "")} else { HTML('In this view, our model is Yi = b0 + b1x1.', "In other words, we use the value of another variable (", x1label, ") to try to make better predictions of our dependent variable (", ylabel, "). Ask yourself: Am I making less errors by using ", x1label, "to predict", ylabel, "than in the null model?") } }) #################################### ###### MODEL ESTIMATION TABLE ###### #################################### ####################### ### Table captions #### ####################### output$tablecapt <- renderUI({ #### If user selected only data #### if (rv$getmodel==0){ HTML(paste("<center>This table shows the daily sugar intake (grams) and anxiety score for each participant (30 in total).</center>"), sep = "" ) #### If user selected the null model #### } else if (rv$getmodel==1){ HTML(paste("This table shows the daily sugar intake (grams) and anxiety score for each participant with the addition of the predicted sugar intake by the null model (Null Y'i) with the 'Null e'i' showing the error of that prediction for each data point. "), sep = "") #### If user selected the full model #### } else if (rv$getmodel==2) { HTML(paste("This table shows the daily sugar intake (grams) and anxiety score for each participant with the addition of the predicted sugar intake by the null model (Null Y'i) and the full model (Full Y'i) with the 'Null e'i' showing the error of that prediction for each data point for the NUll model and ' Full e'i' dispaying the error for each data point if the Full model is fitted."), sep = "") } else { HTML("<center>Please press a button to select what you wish to visualize: the raw data, the null model, or the full model.</center>") } }) #### Table headers #### nullmodlabel <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Null model predictions">Null Y\'i</a>') nullreslabel <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Null model errors">Null e\'i</a>') fullmodlabel <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Full model predictions">Full Y\'i</a>') fullreslabel <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Full model errors">Full e\'i</a>') ######################### ### Next page buttons ### ######################### observeEvent(input$Next1, { updateTabItems(session, "tabs", "researchQ") }) observeEvent(input$Next2, { updateTabItems(session, "tabs", "data") }) observeEvent(input$Next3, { updateTabItems(session, "tabs", "nullOverview") }) observeEvent(input$Next4, { updateTabItems(session, "tabs", "proposedEquations") }) observeEvent(input$Next5, { updateTabItems(session, "tabs", "otherproposedEstimation") }) observeEvent(input$Next6, { updateTabItems(session, "tabs", "comparisonEstimation") }) observeEvent(input$Next7, { updateTabItems(session, "tabs", "inferenceOverview") }) observeEvent(input$Next8, { updateTabItems(session, "tabs", "inferenceOverview") }) observeEvent(input$Next9, { updateTabItems(session, "tabs", "proposedEstimation") }) observeEvent(input$Next10, { updateTabItems(session, "tabs", "nullEquations") }) observeEvent(input$Next11, { updateTabItems(session, "tabs", "nullEstimation") }) observeEvent(input$Next12, { updateTabItems(session, "tabs", "proposedOverview") }) observeEvent(input$Next13, { updateTabItems(session, "tabs", "comparisonOverview") }) observeEvent(input$Next14, { updateTabItems(session, "tabs", "anovaMain") }) observeEvent(input$Next15, { updateTabItems(session, "tabs", "inferenceMain") }) observeEvent(input$Next16, { updateTabItems(session, "tabs", "reporting") }) observeEvent(input$Next17, { updateTabItems(session, "tabs", "anovaOverview") }) ########################### #### Data page outputs #### ########################### output$tableData <- shiny::renderDataTable({ cc <- data.frame(df$id, df$y, df$x1, df$x2, df$x3 ) colnames(cc) <- c(idlabel, ylabel, x1label, x2label, x3label) cc }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE, paging = FALSE ) , escape = FALSE) output$plotData <- renderPlot({ plot(data.frame(df$x1 , df$y), xlab = x1label, ylab = ylabel,pch= 16, ylim=c(150, 400), xlim=c(30, 60)) }) ################################# #### Null model page outputs #### ################################# output$tableNull <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, round(mean(df$y), 2), Ymodelerror) colnames(aa) <- c(idlabel, ylabel, nullmodlabel, nullreslabel) aa }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE, paging = FALSE ), escape = FALSE) output$plotNull <- renderPlot({ plot(data.frame(df$id , df$y), xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(150, 400), xlim=c(1, 30), # make Tom and Jerry's points bigger and colorful col = ifelse(df$y == 249, "green3" , ifelse(df$y == 157, "red", "black")), cex = ifelse(df$y == 249 | df$y < 170, 1.5 , 1) ) abline(h = mean(df$y), col="#337ab7") points(meany~df$id, bg = '#337ab7', pch= 19, col = ifelse(df$y == 249, "green3" , ifelse(df$y == 157, "red", "#337ab7")), cex = ifelse(df$y == 249 | df$y == 157, 1.5 , 1)) segments(df$id, df$y, df$id, meany, col = ifelse(df$y == 249, "green3" , ifelse(df$y == 157, "red", "black"))) text(23, 230, "Tom :)", col = "green3") text(6, 200, "Jerry :(", col = "red") }) output$table <- shiny::renderDataTable({ #### If user selected only data #### if (rv$getmodel==0){ cc <- data.frame(df$y, df$x1 ) colnames(cc) <- c(ylabel, x1label) cc #### If user selected the null model #### } else if (rv$getmodel==1){ cc <- data.frame(df$y,df$x1, round(mean(df$y), 2), Ymodelerror) colnames(cc) <- c(ylabel, x1label, nullmodlabel, nullreslabel) cc #### If user selected the full model #### } else if (rv$getmodel==2) { cc <-data.frame(df$y, df$x1,prediction,fullmodelerror) colnames(cc) <- c(ylabel, x1label, fullmodlabel, fullreslabel) cc } ######################### Table settings ######################### ## Check https://datatables.net/reference/option/ for reference ## }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE ) , escape = FALSE) ##################################### #### Proposed model + More proposed Models page outputs #### ##################################### output$tableProposed <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, df$x1,prediction,fullmodelerror) colnames(aa) <- c(idlabel, ylabel, x1label, fullmodlabel, fullreslabel) aa }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE ), escape = FALSE) output$tableProposed.2 <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, df$x2,prediction.2,fullmodelerror.2) colnames(aa) <- c(idlabel, ylabel, x2label, fullmodlabel, fullreslabel) aa }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE ), escape = FALSE) output$tableProposed.3 <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, df$x3,prediction.3,fullmodelerror.3) colnames(aa) <- c(idlabel, ylabel, x3label, fullmodlabel, fullreslabel) aa }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE ), escape = FALSE) output$plotProposed <- renderPlot({ plot(data.frame(df$x1 , df$y), xlab = x1label, ylab = ylabel, pch= 16, ylim=c(150, 400), xlim=c(30, 60), col = ifelse(df$y == 249, "green3" , ifelse(df$y == 157, "orange", "black")), cex = ifelse(df$y == 249 | df$y < 170, 1.5 , 1)) abline( a=b0, b=b1,h=,col="#FA8072") text(10, 46, paste('Intercept (b0) =', round(xmodel[1,2],2)), pos = 4) text(10, 41, paste('Slope (b1) =', round(xmodel[2,2],2)), pos = 4) points(prediction ~ df$x1, bg = '#FA8072', pch= 19, col = ifelse(df$y == 249, "green3" , ifelse(df$y == 157, "orange", "#FA8072")), cex = ifelse(df$y == 249 | df$y == 157, 1.5 , 1)) segments(df$x1, df$y, df$x1, prediction, col = ifelse(df$y == 249, "green3" , ifelse(df$y == 157, "orange", "black"))) text(46, 200, "Tom", col = "green3") text(56, 170, "Jerry", col = "orange") }) output$plotProposed.2 <- renderPlot({ plot(data.frame(df$x2 , df$y), xlab = x2label, ylab = ylabel, pch= 16, ylim=c(150, 400), xlim=c(150, 200), col = ifelse(df$y == 249, "green3" , ifelse(df$y == 157, "red", "black")), cex = ifelse(df$y == 249 | df$y < 170, 1.5 , 1)) abline( a=b0.2, b=b1.2,h=,col="lightgrey") text(10, 46, paste('Intercept (b0) =', round(xmodel.2[1,2],2)), pos = 4) text(10, 41, paste('Slope (b1) =', round(xmodel.2[2,2],2)), pos = 4) points(prediction.2 ~ df$x2, bg = '#FA8072', pch= 19, col = ifelse(df$y == 249, "green3" , ifelse(df$y == 157, "red", "lightgrey")), cex = ifelse(df$y == 249 | df$y == 157, 1.5 , 1)) segments(df$x2, df$y, df$x2, prediction.2, col = ifelse(df$y == 249, "green3" , ifelse(df$y == 157, "red", "black"))) text(195, 200, "Tom", col = "green3") text(177, 175, "Jerry", col = "red") }) output$plotProposed.3 <- renderPlot({ plot(data.frame(df$x3 , df$y), xlab = x3label, ylab = ylabel, pch= 16, ylim=c(150, 400), xlim=c(15, 28), col = ifelse(df$y == 249, "orange" , ifelse(df$y == 157, "darkorange2", "black")), cex = ifelse(df$y == 249 | df$y < 170, 1.5 , 1)) abline( a=b0.3, b=b1.3,h=,col="lightgrey") text(10, 46, paste('Intercept (b0) =', round(xmodel.3[1,2],2)), pos = 4) text(10, 41, paste('Slope (b1) =', round(xmodel.3[2,2],2)), pos = 4) points(prediction.3 ~ df$x3, bg = '#FA8072', pch= 19, col = ifelse(df$y == 249, "orange" , ifelse(df$y == 157, "darkorange2", "lightgrey")), cex = ifelse(df$y == 249 | df$y == 157, 1.5 , 1)) segments(df$x3, df$y, df$x3, prediction.3, col = ifelse(df$y == 249, "orange" , ifelse(df$y == 157, "darkorange2", "black"))) text(24.5, 225, "Tom", col = "orange") text(18.4, 180, "Jerry", col = "darkorange2") }) ################################ #### Inference page outputs #### ################################ #### Let's create a function to make some buttons shinyInput <- function(FUN, len, id, ...) { inputs <- character(len) for (i in seq_len(len)) { inputs[i] <- as.character(FUN(paste0(id, i), ...)) } inputs } #### Table headers #### Term <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Our parameters: b0 and b1"<center>Term</center>') Estimate <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Estimation for each parameter"<center>Estimate</center>') S_Error <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Standard error for each estimated parameter"<center>Std. Error</center>') T_Value <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Test statistics (ratio between systematic and unsystematic variance)"<center>t value</center>') P_Value <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Probability of obtaining an equal or more extreme t value if H0 is true."<center>p-value</center>') #### Then mannually create the content of the table so we can format every element labelsc <- c(Term, Estimate, S_Error, T_Value, P_Value) if (xmodel[1,5] > 0.0005) { pvalueb1 <- sprintf("%.2f",round(xmodel[1,5], 3)) } else { pvalueb1 <- "< 0.001" } if (xmodel[2,5] > 0.001) { pvalueb2 <- sprintf("%.2f",round(xmodel[2,5], 3)) } else { pvalueb2 <- "< 0.001" } row1 <- c(shinyInput(actionButton, 1, 'button_1', label = "Intercept", onclick = 'Shiny.onInputChange(\"select_button\", this.id)' ), sprintf("%.2f",round(xmodel[1,2],2)), sprintf("%.2f",round(xmodel[1,3],2)), sprintf("%.2f",round(xmodel[1,4],2)), pvalueb1 ) row2 <- c(shinyInput(actionButton, 1, 'button_2', label = x1label, onclick = 'Shiny.onInputChange(\"select_button\", this.id)' ), sprintf("%.2f",round(xmodel[2,2],2)), sprintf("%.2f",round(xmodel[2,3],2)), sprintf("%.2f",round(xmodel[2,4],2)), pvalueb2 ) tablecof <- rbind(row1, row2) colnames(tablecof) <- labelsc rownames(tablecof) <- NULL ## Then the button ## buttonRV <- reactiveValues(RV = NULL) observeEvent(input$showTstat, { buttonRV$RV <- TRUE }) # Let's compile the introductory text for the Coeff table output$introCoeff <- renderUI({ if (is.null(buttonRV$RV)) return() HTML('Below is the regression summary table: ') }) #### Full Model output table ### output$tableTS <- DT::renderDataTable({ if (is.null(buttonRV$RV)) return() tablecof} , class = 'compact' , options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE, paging= FALSE, info = FALSE ) , server = FALSE, escape = FALSE, selection = 'none') #### Initiate the variables that the buttons will change #### inferenceV <- reactiveValues(betas = '') #### Observe the button clicking ##### observeEvent(input$select_button, { selectedRow <- as.numeric(strsplit(input$select_button, "_")[[1]][2]) if (selectedRow == 11) { inferenceV$betas <- "b0" } else if (selectedRow == 21) { inferenceV$betas <- "b1" } }) f <- summary(xmod)$fstatistic Radj <- summary(xmod)$adj.r.squared R <- summary(xmod)$r.squared pF <- round(pf(q=f[1], df1=f[2], df2=f[3], lower.tail=FALSE)) if (pF < 0.0001) { pF <- "< 0.0001" } # Let's compile the R^2 text output$R2 <- renderUI({ if (is.null(buttonRV$RV)) return() HTML( paste("Multiple R-squared: ", round(R, 4), ", Adjusted R-squared: ", round(Radj, 4), sep=""), " ", paste("F-statistic: ", round(f[1],4), " on ", round(f[2],4), " and ", round(f[3],4), " DF, p-value: ", pF, sep ="") ) }) # Let's compile the introductory text for the SSE table output$introSSE <- renderUI({ if (is.null(buttonRV$RV)) return() HTML(' Below is the SSE of both models (null and proposed), as well as the remaining (i.e., residual) sum of squared errors: ') }) # Let's compile a table with SSE output$tableSSE <- renderTable({ if (is.null(buttonRV$RV)) return() tableSSE <- data.frame(round(SSEnull,2), round(SSEfull,2), round((SSEnull - SSEfull), 2)) colnames(tableSSE) <- c("Sum of Squared Errors for the Null Model", "Sum of Squared Errors for the Proposed Model","Residual Sum of Squared Errors") tableSSE }) output$reporting1 <- renderUI ({ HTML(paste("The unstandardized coefficient way: The results show a significant negative effective of hours of training on finishing time in the marathon, b = ", round(xmodel[2,2],2), ", t(", dft, ") = ", round(xmodel[2,4],2), ", p = .001.", sep="")) }) betaST <- round(summary(lm(scale(y) ~ scale(x1)))$coefficients[[2]], 2) tforcompare <- round(summary(lm(scale(y) ~ scale(x1)))$coefficients[2,3], 2) output$reporting2 <- renderUI ({ HTML(paste("The standardized coefficient way: The results show a significant negative effective of hours of training on finishing time in the marathon, β = ", betaST, ", t(", dft, ") = ", round(xmodel[2,4],2), ", p = .001.", sep="")) }) # observeEvent(input$showR2, { # buttonRV$R2 <- TRUE # }) # Let's show R2 output$RtoPrint <- renderUI({ HTML(paste('In this case there is no need for that because the test for the effect size is the same as the test for the b1 as there is only one quantitative independent variable (hours of training). Don\'t believe us? Here is the proof: Note that the F distribution is simply a squared t distribution. So, if all the effect size is due to b1, the square t value for b1 should be the same as the F value! Do the math and see for yourself (a little discrepancy is to expected due to rounding):', " ►t value of b1 = ", tforcompare, " ►F value of R2 = ", round(f[1],2), sep="")) }) ############ Deadline ################## output$table <- shiny::renderDataTable({ #### If user selected only data #### if (rv$getmodel==0){ cc <- data.frame(df$y, df$x1 ) colnames(cc) <- c(ylabel, x1label) cc #### If user selected the null model #### } else if (rv$getmodel==1){ cc <- data.frame(df$y,df$x1, round(mean(df$y), 2), Ymodelerror ) colnames(cc) <- c(ylabel, x1label, nullmodlabel, nullreslabel) cc #### If user selected the full model #### } else if (rv$getmodel==2) { cc <-data.frame(df$y, df$x1,prediction,fullmodelerror) colnames(cc) <- c(ylabel, x1label, fullmodlabel, fullreslabel) cc } ######################### Table settings ######################### ## Check https://datatables.net/reference/option/ for reference ## }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE ) , escape = FALSE) #################################### ###### MODEL ESTIMATION PLOT ####### #################################### output$tableComparison <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, df$x1, round(mean(df$y), 2), Ymodelerror, prediction,fullmodelerror) colnames(aa) <- c(idlabel, ylabel, x1label, nullmodlabel, nullreslabel, fullmodlabel, fullreslabel) aa }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE ), escape = FALSE) output$plotComparison<- renderPlot({ plot(data.frame(df$x1 , df$y), xlab = x1label, ylab = ylabel, pch= 16, ylim=c(150, 400), xlim=c(30, 60)) abline( a=b0, b=b1,h=,col="#FA8072") abline(h=mean(df$y),col="#337ab7") text(10, 46, paste('Intercept (b0) =', round(xmodel[1,2],2)), pos = 4) text(10, 41, paste('Slope (b1) =', round(xmodel[2,2],2)), pos = 4) points(prediction ~ df$x1, col ='#FA8072', bg = '#FA8072', pch= 19) segments(df$x1, df$y, df$x1, prediction) }) ###################### ####Plot captions##### ###################### output$Cap <- renderText({ #### If user selected only data #### if (getplot()==0) { if (rv$getmodel==0){ paste('This plot shows the daily sugar intake(grams) and anxiety score plotted for each participant.', sep = "" ) #### If user selected the null model #### } else if (rv$getmodel==1){ paste("This plot shows the daily sugar (grams) intake and anxiety score for each participant, with an addition of the line that represents the null model-when no relationship is predicted. The black lines from each data point to the line represent the residual error for each point if the null model is fitted.", sep = "") #### If user selected the full model #### } else if (rv$getmodel==2) { paste("This plot shows the daily sugar intake and anxiety score for each participant. The line of best fit added has been added. The black lines from each data point to the line of best fit represent. the residual error when the full model is fitted.", sep = "") } } else { #### When data is selected #### if (rv$getmodel==0){ paste("Without a model, no predictions are made, so no errors can be calculated.", sep = "") } #### When null model is selected #### else if (rv$getmodel==1) { paste("This histogram shows the Sum Squared Errors (SSE) of the null model.", sep = "") } #### When full model is selected #### else if (rv$getmodel==2) { paste("This histogram allows us to compare the Sum Squared Errors (SSE) of the null model with those of the full model. It also allows us to check the amount of error we have reduced by adopting the full model.", sep = "") } } }) output$plot <- renderPlot({ ##################### #### SCATTERPLOT #### ##################### if (getplot()==0) { #### When data is selected #### if (rv$getmodel==0){ plot(data.frame(df$x1 , df$y), xlab = x1label, ylab = ylabel, ylim=c(150, 400), xlim=c(30, 60)) } #### When null model is selected #### else if (rv$getmodel==1){ plot(data.frame(df$x1 , df$y), xlab = x1label, ylab = ylabel, ylim=c(150, 400), xlim=c(30, 60)) abline(h=mean(df$y),col="#337ab7") points(meany~df$x1, col ='#337ab7', bg = '#337ab7', pch= 19) segments(df$x1, df$y, df$x1, meany) } #### When full model is selected #### else if (rv$getmodel==2){ plot(data.frame(df$x1 , df$y), xlab = x1label, ylab = ylabel, pch= 16, ylim=c(150, 400), xlim=c(30, 60)) abline( a=b0, b=b1,h=,col="#FA8072") abline(h=mean(df$y),col="#337ab7") text(10, 46, paste('Intercept (b0) =', round(xmodel[1,2],2)), pos = 4) text(10, 41, paste('Slope (b1) =', round(xmodel[2,2],2)), pos = 4) points(prediction ~ df$x1, col ='#FA8072', bg = '#FA8072', pch= 19) segments(df$x1, df$y, df$x1, prediction) } ################### #### HISTOGRAM #### ################### } else { #### When null model is selected #### if (rv$getmodel==0){ } #### When null model is selected #### else if (rv$getmodel==1){ barplot(c(SSEnull,0,0),col=c("red","red","red"), names.arg =(c("Null model error","","")), ylab = "SSE" ) legend("topright", inset=.05, cex = 1, title="SSE", paste(round(SSEnull,2)), horiz=FALSE, lty=c(1,1), lwd=c(2,2), col=c("red")) } #### When full model is selected #### else if (rv$getmodel==2) { barplot(c(SSEnull,SSEfull,SSEnull-SSEfull),col=c("red","blue","grey"), names.arg =(c("Null model error","Full model error","Reduced error")), ylab = "SSE") legend("topright", inset=.05, cex = 1, title="SSE", legend = round(SS,2), horiz=FALSE, lty=c(1,1), lwd=c(2,2), col=c("red","blue","grey")) } } }) #################################### ###### MODEL ESTIMATION PLOT ####### #################################### output$tableAnova <- shiny::renderDataTable({ aa <- data.frame (aggregate (y ~ x4, data = df, FUN = mean)) colnames(aa) <- c (x4label, ylabel) aa }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE, paging = FALSE, info = FALSE ), escape = FALSE) output$tableReg <- shiny::renderDataTable({ regcoef <- summary (lm (y~x4, data = df)) aa <- data.frame (regcoef$coefficients [, 1]) b <- c ("American", "Asian", "European") c <- c ("", "(303.1 - 248.7)", "(215.3 - 248.7)") aa <- cbind (b, aa, c) colnames(aa) <- c (x4label, "Coefficients", "") aa }, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE, paging = FALSE, info = FALSE ), escape = FALSE) output$anovaStats <- renderPrint({ anova(lm(y ~ x4, df)) }) output$regStats <- renderPrint({ summary(lm(y ~ x4, df)) }) #################################### ######### INFERENCE TABLES ######### #################################### #### Let's create a function to make some buttons shinyInput <- function(FUN, len, id, ...) { inputs <- character(len) for (i in seq_len(len)) { inputs[i] <- as.character(FUN(paste0(id, i), ...)) } inputs } #### Table headers #### Term <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Our parameters: b0 and b1"<center>Term</center>') Estimate <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Estimation for each parameter"<center>Estimate</center>') S_Error <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Standard error for each estimated parameter"<center>Std. Error</center>') T_Value <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Test statistics (ratio between systematic and unsystematic variance)"<center>t value</center>') P_Value <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Probability of obtaining an equal or more extreme t value if H0 is true."<center>p-value</center>') #### Then mannually create the content of the table so we can format every element labelsc <- c(Term, Estimate, S_Error, T_Value, P_Value) if (xmodel[1,5] > 0.0005) { pvalueb1 <- sprintf("%.2f",round(xmodel[1,5], 3)) } else { pvalueb1 <- "< 0.001" } if (xmodel[2,5] > 0.001) { pvalueb2 <- sprintf("%.2f",round(xmodel[2,5], 3)) } else { pvalueb2 <- "< 0.001" } row1 <- c(shinyInput(actionButton, 1, 'button_1', label = "Intercept", onclick = 'Shiny.onInputChange(\"select_button\", this.id)' ), sprintf("%.2f",round(xmodel[1,2],2)), sprintf("%.2f",round(xmodel[1,3],2)), sprintf("%.2f",round(xmodel[1,4],2)), pvalueb1 ) row2 <- c(shinyInput(actionButton, 1, 'button_2', label = x1label, onclick = 'Shiny.onInputChange(\"select_button\", this.id)' ), sprintf("%.2f",round(xmodel[2,2],2)), sprintf("%.2f",round(xmodel[2,3],2)), sprintf("%.2f",round(xmodel[2,4],2)), pvalueb2 ) tablecof <- rbind(row1, row2) colnames(tablecof) <- labelsc rownames(tablecof) <- NULL #### Full Model output table ### output$tablec <- DT::renderDataTable(tablecof, options = list(searching = FALSE, pageLength = 9, lengthChange = FALSE, ordering = FALSE, paging= FALSE, info = FALSE ) , server = FALSE, escape = FALSE, selection = 'none') #### Initiate the variables that the buttons will change #### inferenceV <- reactiveValues(betas = '') #### Observe the button clicking ##### observeEvent(input$select_button, { selectedRow <- as.numeric(strsplit(input$select_button, "_")[[1]][2]) if (selectedRow == 11) { inferenceV$betas <- "b0" } else if (selectedRow == 21) { inferenceV$betas <- "b1" } }) output$toDisplay <- renderText({ paste(inferenceV$betas) }) ### SOLVED EQUATION FOR THE PROPOSED MODEL### output$propmodelText <- renderUI({ withMathJax("$$ {\\ Y'_{i}\\,=\\, ",round(xmodel[1,2], digits=2),"+",round(xmodel[2,2], digits=2)," * ", x1label,"_{i} }$$") }) ### SCATTER PLOT CAPTIONS ### output$betaText <- renderUI({ if (inferenceV$betas =="b0") {HTML("This plot shows the ",ylabel," and ",x1label," score for each participant. The intercept represents the ",ylabel," when the ",x1label," level is 0.")} else if (inferenceV$betas == "b1"){HTML("This plot shows the ",ylabel," and ",x1label," score for each participant. The slope represents the change of ",ylabel," when ",x1label," increases by one unit.")} }) #### SCATTER PLOTS ##### output$betaplot <- renderPlot({ #### When b0 is selected #### if (inferenceV$betas =="b0"){ plot(data.frame(df$x1 , df$y), main = "Estimate", xlab = x1label, ylab = ylabel, xlim=c(0, max(x1)), ylim=c(min(y), 533), pch= 16) abline( a=b0, b=b1,h=,col="gray") text(0, (xmodel[1,2]-5), paste('Intercept (b0) =', round(xmodel[1,2],2)), pos = 4, col="blue") points(0,xmodel[1,2],pch=22,col="blue", bg = "blue") } #### When b1 is selected #### else if (inferenceV$betas == "b1"){ plot(data.frame(df$x1 , df$y), main = "Estimate", xlab = x1label, ylab = ylabel, xlim=c(0, max(x1)), ylim=c(min(y), 533), pch= 16) abline( a=b0, b=b1,h=,col="gray") text(0, (xmodel[1,2]-1), paste('Slope (b1) =', round(xmodel[2,2],2)), pos= 4, col="chartreuse4") segments(0,xmodel[1,2],1,xmodel[1,2],col="chartreuse4") segments(1,xmodel[1,2],1,(xmodel[1,2]+xmodel[2,2]), col="chartreuse4") } }) ### T DISTRIBUTION CAPTIONS ### output$tText <- renderUI({ if (inferenceV$betas =="b0") { withMathJax("$$ {\\ t_{b0}\\,=\\, \\frac{",round(xmodel[1,2], digits=2),"}{",round(xmodel[1,3], digits=2),"} = ", round(xmodel[1,4], digits=2)," }$$") } else if (inferenceV$betas == "b1"){ withMathJax("$$ {\\ t_{b1}\\,=\\, \\frac{",round(xmodel[2,2], digits=2),"}{",round(xmodel[2,3], digits=2),"} = ", round(xmodel[2,4], digits=2)," }$$") } }) ################################################### ####Function for shading the critical value area### ################################################### # colorArea <- function(from, to, density, ..., col="blue", dens=NULL){ # y_seq <- seq(from, to, length.out=500) # d <- c(0, density(y_seq, ...), 0) # polygon(c(from, y_seq, to), d, col=col, density=dens) # } #### T PLOTS ##### output$tplot <- renderPlot({ #### When t0 is selected #### if (inferenceV$betas =="b0"){ ##Prints t distribution plot curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b0_t_statistics,col="blue") abline(v=(qt(c(.95), df=dft)),col="red") abline(v=-(qt(c(.95), df=dft)),col="red") text(b0_t_statistics,0.2, paste('t value =', round(b0_t_statistics,2)), pos= 2, col="blue") text(6.5,0.1, paste('critical value = [', round(qt(c(.95), df=dft),2), ",", round(-qt(c(.95), df=dft),2), "]"), pos=3, offset=2, col="red") # colorArea(from=qnorm(0.95), to=10, dnorm, mean=0, sd=1, col=2, dens=60) } #### When t1 is selected #### else if (inferenceV$betas =="b1"){ curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b1_t_statistics,col="chartreuse4") abline(v=(qt(c(.95), df=dft)),col="red") abline(v=-(qt(c(.95), df=dft)),col="red") text(b1_t_statistics,0.2, paste('t value =', round(b1_t_statistics,2)), pos= 2, col="chartreuse4") text(6.5,0.1, paste('critical value = [', round(qt(c(.95), df=dft),2), ",", round(-qt(c(.95), df=dft),2), "]"), pos=3, offset=2, col="red") # colorArea(from=qnorm(0.95), to=10, dnorm, mean=0, sd=1, col=2, dens=60) } }) ########################## ### HYPOTHESIS TESTING ### ########################## output$plotsLabel <- renderUI({ if (inferenceV$betas =="b0"){ HTML(paste(" The effect here is ", round(xmodel[1,2],2), " and represents the marathon finishing time of someone who hasn\'t trained at all (i.e., trained for 0 hours). The error for the intercept is ", round(xmodel[1,3],2), ". In this sense the test statistics is ", round(xmodel[1,2],2), " / ", round(xmodel[1,3],2), " = ", round(xmodel[1,4],2), ". The likelihood to have a test statistics of ", round(xmodel[1,4],2), " or a more extreme is ", pvalueb1, ". ", sep="")) } else if (inferenceV$betas =="b1"){ HTML(paste(" The effect here is ", round(xmodel[2,2],2), " and represents the improvement in finishing time per hour of trainning (i.e., if you train for one hour, how many minutes earlier should you expect to finish?). The error for the intercept is ", round(xmodel[2,3],2), ". In this sense the test statistics is ", round(xmodel[2,2],2), " / ", round(xmodel[2,3],2), " = ", round(xmodel[2,4],2), ". The likelihood to have a test statistics of ", round(xmodel[2,4],2), " or a more extreme is ", pvalueb2, ". ", sep="")) } }) output$belowText <- renderUI({ if (rv$getmodel != 3) {""} else {HTML('<center>Scroll down to learn more about hypothesis testing!</center>')} }) output$regressionStats <- renderPrint({ summary(lm(y ~ x1)) }) output$testingText <- renderUI({ #### If the Hypothesis Testing button is pressed #### if (rv$getmodel==3) if (xmodel[1,5] < 0.05) {HTML('Logic of Hypothesis Testing Background When interpreting an experimental finding, a question arises as to whether the findings reflect a true effect or have occurred by chance. Hypothesis testing is a statistical procedure that is used for testing whether chance is a plausible explanation of the experimental findings. Misconceptions about hypothesis testing are common among researchers and students. Here we will outline the basic logic behind hypothesis testing. Procedure We want to test the following hypotheses: H0: b1 = 0 which means ',x1label,' does not affect' ,ylabel,'. H1: b1 ≠ 0 which means ',x1label,' has an impact on' ,ylabel,'. With t = ',round(xmodel[2,4],2),', p ',pvalueb2,' we can assume b1 is significantly different from 0. But what does that mean? It means the data seems to support the hypothesis that with higher ',x1label,' levels, the ' ,ylabel,' also increases. ')} else {HTML('Logic of Hypothesis Testing When interpreting an experimental finding, a question arises as to whether the findings reflect a true effect or have occurred by chance. Hypothesis testing is a statistical procedure that is used for testing whether chance is a plausible explanation of the experimental findings. Misconceptions about hypothesis testing are common among researchers and students. Here we will outline the basic logic behind hypothesis testing. We want to test the following hypotheses: H0: b1 = 0 which means ',x1label,' does not affect ' ,ylabel,'. H1: b1 ≠ 0 which means ',x1label,' has an impact on ' ,ylabel,'. With t = ',round(xmodel[2,4],2),', p ',pvalueb2,' we can assume b1 is not significantly different from 0. But what does that mean? It means the data seems to notsupport the hypothesis that with higher ',x1label,' levels, the ' ,ylabel,' also increases. ')} }) }

#####PACKAGES########## library(shiny) library(ggplot2) library(calibrate) library(MASS) library(shinyBS) library(broom) library(DT) library(shinydashboard) ############################ ###### USER INTERFACE ###### ############################ ui = dashboardPage( dashboardHeader(title = "Model Comparison Approach App"), dashboardSidebar( #################### ### SIDEBAR MENU ### #################### sidebarMenu(id="tabs", menuItem("Introduction", icon = icon("question"), menuSubItem("Start here", tabName="intro"), menuSubItem("Research Question", tabName="researchQ"), menuSubItem("Data", tabName="data") ), menuItem("Null Model", tabName = "nullmodel", icon = icon("th"), menuSubItem("Overview", tabName="nullOverview"), menuSubItem("Equations", tabName="nullEquations"), menuSubItem("Null model", tabName="nullEstimation") ), menuItem("Proposed Model", tabName = "propmodel", icon = icon("signal"), menuSubItem("Overview", tabName="proposedOverview"), menuSubItem("Equations", tabName="proposedEquations"), menuSubItem("Proposed model", tabName="proposedEstimation"), menuSubItem("Other proposed models", tabName="otherproposedEstimation") ), menuItem("Model Comparison", tabName = "comparison", icon = icon("clone"), menuSubItem("Overview", tabName="comparisonOverview"), menuSubItem("Model comparison", tabName="comparisonEstimation") ), menuItem("Inference", tabName = "inference", icon = icon("hourglass-half"), menuSubItem("Overview", tabName="inferenceOverview"), menuSubItem("Inference", tabName="inferenceMain"), menuSubItem("Result reporting", tabName="reporting") ), menuItem("Anova vs Regression", tabName = "anova", icon = icon("balance-scale"), menuSubItem("Overview", tabName="anovaOverview"), menuSubItem("Similarities and Differences", tabName="anovaMain") ), menuItem("Glossary", tabName = "glossary", icon = icon("list")) )), dashboardBody( includeScript("../../../Matomo-tquant.js"), fluidRow( tags$head( tags$link(rel = "stylesheet", type = "text/css", href = "formatboot.css")), ##################### ## 1st PAGE - DATA ## ##################### tabItems( tabItem("intro", box(width=12, h3("WHAT IS THIS ALL ABOUT?", align="center"), p("With this app you can explore basic statistics concepts, compare different models and see how they all boil down to the same recipe:"), h4("Data = model + error", align="middle"), br(), p("The curious human being that's inside yourself asks you questions all the times about phenomena, and you really would like to satisfy your unstoppable thirst for knowledge, somehow. We have a tool that might help you out.") ),p(actionButton(inputId="Next1", label=HTML('Next page ')), align="right") ), tabItem("researchQ", box(width=12, h3("YOUR FIRST RESEARCH QUESTION",align="center"), p("Imagine soon a marathon will take place close where you live, and all your friends are taking part in it. You would like to join them, but you don't want to suck. So you decide to look around for what is important in order to succeed in a loooong run. You have some ideas, maybe your BMI is important? Maybe your height? Or is it more about how much you have been training?"), p("You are lucky, the organizers of last year marathon collected information, and you can access this dataset! But what can you do with all these numbers?"), br(), p(img(src = "marathonRunner.jpg", width="50%"), align="middle") ) ,p(actionButton(inputId="Next2", label=HTML('Next page ')), align="right") ), tabItem("data", box(width=12, h3(HTML("THE DATASET"), align="middle"), p("So, let's first have a look at the information about the last year's marathon. Here is a challenge for you! Based on the dataset, try to answer these questions:"), p("1. How many people took part in the marathon last year?"), p("2. How much time did it take for participant number seven to complete the run?"), p("3. How many hours a week did she train?")), box(width=12, shiny::dataTableOutput("tableData"),style = "height:500px; overflow-y: scroll;") ,p(actionButton(inputId="Next3", label=HTML('Next page ')), align="right") ), ########################## ## 2nd - Null MODEL ## ########################## tabItem("nullOverview", box(width=12, HTML("What have you learned so far? The basic recipe for any model is:"), h4("Data = model + error", align="middle"), HTML(" By now you should be familiar with the first part of the recipe (DATA): ►You gained some insights into your research question and how to go to answer it. ►You have seen the available data, and briefly explored it. But this is just the first step!"), HTML("What are you going to learn next? ► You are going to get to know the second part of the recipe: what is a MODEL? What is ERROR? ► What is the null model? ► What are the parameters of the null model?"), HTML("Research question: You aim at understanding what you should do to prepare for the marathon. A couple of your friends participated last year, and you decide to get some inspiration by looking at what they did.") ), p(actionButton(inputId="Next10", label=HTML('Next page ')), align="right") ), tabItem("nullEquations", box(width=12, h1('YOUR FIRST GUESS',align="center"), HTML(" You are curious about how your friends Tom and Jerry performed last year, but you can't find their name in the dataset in which only participants'numbers are displayed. You don't have many information about them apart from the fact that they were there. What is the best guess you can make about their time? You can now use your recipe to answer this question. The 'i' in brackets indicates the individual partcipant. "), h4("Data(i) = model + error(i)", align="middle"), HTML("The best guess would probably be the most frequent time people spent to complete the marathon. In statistical language this is called 'the mode', and in special cases (e.g. when data are normally distributed), the mode equals the mean value. Let's assume that this is the case, you calculate the mean time and you find that it was about 4 hours and 15 minutes, so you go for that. "), h4("Time (Jerry) = mean time + error(Jerry)", align="middle"), h4("Time (Tom) = mean time + error(Tom)", align="middle"), HTML(" When you meet your friends, you decide to actually ask them for the real time. Jerry seems to be quite offended by your guess, he had performed way better! He made it in a little more than 2 hours and a half!! However, Tom was happily surprised by your guess, it took 4 hours and 9 minutes for him! Maybe you can improve your guess (model) by considering more information and adding it into your model. "), p(actionButton(inputId="Next11", label=HTML('Next page ')), align="right") )), tabItem("nullEstimation", box(width=12, HTML("WHAT IS AN ERROR? Your prediction was better for Tom than for Jerry, so we can say that you made a greater error at predicting Jerry's time than Tom's. You can look at the next section to get a better grasp of what an error is. Look at the graph below. On the y-axis is represented last year participant's time. On the x-axis you find the participant number (id). Your model (the mean) is represented as a straight line, and it is the time that you would predict for each participant to finish the marathon. For a lot of participants, you made an error using this model. The red line represents the error you commited in your prediction for Jerry's time, whereas the green line is the error associated with the time you predicted for Tom. As you can see the model is not that good. Maybe you can improve your guess (model) by considering more information to add into your model. ")), box(width=6, shiny::dataTableOutput("tableNull"),style = "height:500px; overflow-y: scroll;"), box(width=6, plotOutput('plotNull') ) ,p(actionButton(inputId="Next12", label=HTML('Next page ')), align="right") ), ########################## ## 3rd - PROPOSED MODEL ## ########################## tabItem("proposedOverview", box(width=12, HTML("What have you learned so far? By now you should be familiar with the null model and the error in the model. "), HTML("The basic recipe for any model is:"), h4("Data(i) = model + error(i)", align="middle"), HTML("What are you going to learn next? ► What is the proposed model? ► What are the parameters of the proposed model? ► How good is the model?"), HTML("Research question: Don't forget our research question! You want to understand what is important in order to succeed in a marathon. ") ) ,p(actionButton(inputId="Next4", label=HTML('Next page ')), align="right") ), tabItem("proposedEquations", box(width=12, HTML(" As you saw in the previous section, your model was not very good at predicting your friends'performance. How can you improve it? In the dataset of last year's marathon, you found other information which might be useful, such as the hours a week each participant spent on training, BMI and their height. You decide to start by looking if the hours spent on training might be relevant for the performance in the marathon. You can now build a new model with this information and see if your predictions improve: "), h4("Marathon Finish Time = Intercept + effect of training * training + Error", align="middle"), h4("Yi = b0 + b1*x1 + Error", align="middle")), box(width=6, HTML(" Yi= outcome variable, B0 = intercept or b0 coefficient B1 = slope or b1 coefficient X1 = predictor variable ")), box(width=12, HTML(" Intercept You assume that the relationship between the two variables (time and training) can be described with a line. A person who did not train at all, has a value of 0 hours on the x-axis, and according to this model she will have to run for 7265 minutes to finish the marathon. This value is known as the intercept (b0) in the model and is represented by the point in which the line crosses the y-axis. Effect of training (b1) You would like to see if training is important for succeeding in the marathon. The answer to this question lies in the b1 coefficient. This coefficient (slope of the line) represents the strength and the direction of the relationship between the two variables, in particular it represents how much the predictor has to change in order to see a change of one unit in the outcome." ) ) ,p(actionButton(inputId="Next9", label=HTML('Next page ')), align="right") ), tabItem("proposedEstimation", box(width=12, HTML("In this page you can visually compare the length of the lines. The model seems to better predict both the times of your friends (the error is reduced). But most of all, it seems that the predictor 'Hours of training a week' is significantly related to the outcome variable (performance at the marathon). To know if that's the case you will have to compute a significance test, which you will find in the inference section. At this point you are interested to know if your predictions improved. What about comparing the null model with this new one? Go to the the 'Model comparison' section and have a look. ")), box(width=6, shiny::dataTableOutput('tableProposed')), box(width=6, plotOutput('plotProposed') ) ,p(actionButton(inputId="Next5", label=HTML('Next page ')), align="right") ), tabItem("otherproposedEstimation", mainPanel( wellPanel(style = "background-color: #ffffff;", tabsetPanel( tabPanel("General" , br(), box(width=12,HTML("<center>Ok, so seems like hours you've spent training is a pretty good predictor for your marathon finish time! Those who spend more time training are faster and have a smaller finish time for the marathon. Jerry was more dedicated to the training than Tom, so he finished earlier. But what about the other variables? Do height and body mass index tell you something about the time people need to finish the marathon? You still have more data to look at! Click on the tabs 'Height' and 'BMI' to look at the other variables.</center>"))), tabPanel("Height", br(), box(width=12, HTML("<center>When you use the participants height to estimate their marathon finishing time, the model looks like this: Marathon Finish Time = Intercept + effect of height * height + Error which translates to: </center>"), withMathJax(), "$$Y_{i} = \\beta_{0.2} + \\beta_{1.2} * x_{2i} + e_{i}$$" ), box(width=6, shiny::dataTableOutput('tableProposed.2')), box(width=6, plotOutput('plotProposed.2') ), box(width=6, HTML("<center>Oh no! This doesn't look good! Your estimate for Jerry is worse again! And the errors seem to also be bigger. Let's look if BMI is a better estimator for marathon finishing time! Click on the next tab on top called 'BMI'</center>") ) ) , tabPanel("BMI" , br(), box(width=12, HTML("<center>When you use the participants BMI to estimate their marathon finishing time, the model looks like this: Marathon Finish Time = Intercept + effect of BMI * BMI + Error which translates to: </center>") , withMathJax(), "$$Y_{i} = \\beta_{0.3} + \\beta_{1.3} * x_{3i} + e_{i}$$" ), box(width=6, shiny::dataTableOutput('tableProposed.3')), box(width=6, plotOutput('plotProposed.3') ), box(width=6, HTML("When you use the BMI for predicting the marathon performance, your estimation for Jerry is better than in the Null Model! However, the estimation for Tom got a little worse... You should probably stick to using the hours put into training for your model!") ), p(actionButton(inputId="Next13", label=HTML('Next page ')), align="right") ) ), style='width: 1000px; height: 1000px' )) ), ###################### ## 4th - COMPARISON ## ###################### tabItem(tabName = "comparisonOverview", box(width=12, HTML("What have you learned so far? By now you should be familiar with the null model and various models with one variable as a predictor. "), HTML("What are you going to learn next? ► Compare the null model with the proposed model ► Understand how to evaluate the model "), HTML("Research question: Don't forget our research question! You want to understand what is important in order to succeed in a marathon. ") ) ,p(actionButton(inputId="Next6", label=HTML('Next page ')), align="right") ), tabItem(tabName = "comparisonEstimation", box(width=12, h4("MODEL COMPARISON", align = "center"), HTML(" In the graph you can now see the two models: ►red line: proposed model, ►blue line: null model. Which one of the two models allows you to make better predictions? You can answer this question more precisely if you are able to quantify which model has the least error. You can take each participant's time and subtract the predicted time based on your model. You do this for each participant, and sum up the results. What do you obtain? The result is zero! This does not help, in fact to avoid cancelling out opposite values and in order to give each one a weight, we have to square each difference before summing them up. At this point, we have the so called 'sum of squared errors (SSE)', which we need to divide for the number of participants to obtain the 'mean sum of squared errors' (MSSE). To compare how good each model is, we need to calculate the so called 'R-squared' which is the ratio between the explained variation that the model explains (SSM) and the total variation (sum of SSE and SSM). ")), box(width=6, shiny::dataTableOutput('tableComparison')), box(width=6, plotOutput('plotComparison') ) ,p(actionButton(inputId="Next7", label=HTML('Next page ')), align="right") ), ########################## ## 5th - INFERENCE PAGE ## ########################## tabItem(tabName = "inferenceOverview", box(width=12, HTML('<h3>What you have learned so far:</h3> ► The research question: How can I predict performance in a marathon? ► The basic equation for model analysis: Data = Model + Error; ► Having a look at the raw data.</li> ► The proposed model, in which we try to use hours of training to predict marathon performance.</li> ► The parameters of the proposed model: The intercept (b0), which represents the performance of a runner with zero hours of training, and the slope (b1), which represents the gain in performance for each increment of 1 hour of training. ► The error of the proposed model: We calculate the Sum of Squared Errors in order to have a measure the amount of error we make in our predictions using the proposed model. ► The null model: Calculating the simplest model possible, which is just using mean performance to guess performance. ► The paramter of the null model: The intercept (b0), which represents mean performance. ► The error of the null model: We again calculate the Sum of Squared Errors in order to have a measure of the amount of error we make in our predictions using the null model. ► The proportion of the reduction of the error (PRE): We then see the proportion of error of the null model that we\'re able to reduce by using the proposed model instead of the null model. ') ), box(width=12, HTML('<h3>What you will learn here:</h3> ► Test statistics; ► p-value; ► Hypothesis testing; ► Data reporting in APA style. ' ) ) ,p(actionButton(inputId="Next15", label=HTML('Next page ')), align="right") ), tabItem(tabName = "inferenceMain", box(width=12, h3("Test statistic = Estimated effect / Error of the estimation", align="middle"), br(), p("This is the basic equation underlying test statistics. Test statistics basically represent an effect in the unit if a specific theoretical distribution (here, T-Student and F distributions). When we compute a test statistics we can evaluate how likely that value (or a more extreme value) is to occur. And with this we can say if an effect is happening by chance or not."), br(), conditionalPanel(condition="input.showTstat==0", p(actionButton(inputId="showTstat", label="Press to see the relevant test statistics"), align="right") ), uiOutput("introCoeff"), DT::dataTableOutput("tableTS"), uiOutput("R2"), conditionalPanel(condition="input.select_button", uiOutput("plotsLabel"), fluidRow( column(6, plotOutput("betaplot")), column(6, plotOutput("tplot")) ) ), uiOutput("introSSE"), tableOutput("tableSSE"), conditionalPanel(condition="input.select_button" ), p(actionButton(inputId="Next16", label=HTML('Next page ')), align="right") ) ), tabItem(tabName = "reporting", box(width=12, p("Here's how R would print the results of our regression."), verbatimTextOutput("regressionStats"), HTML('How would you report your results <a href="http://apastyle.org/">(APA style)</a>? The important effect here is given by the b1 parameter. So the report of the results will be on this parameter. Two different ways to do this:'), hr(), uiOutput("reporting1"), hr(), uiOutput("reporting2"), hr(), HTML("What about R2? Shouldn't we be reporting R2 as well?"), uiOutput("RtoPrint") ), p(actionButton(inputId="Next17", label=HTML('Next page ')), align="right") ), ###################### ## 6th PAGE - ANOVA ## ###################### tabItem("anovaOverview", box(width=12, h3("REGRESSION vs ONE-WAY ANOVA", align="middle"), p("Now we wil explore two cases of General Linear Models: a simple regression and a one-way ANOVA"), HTML("A very simple explanation is the following: ►Regression is the statistical model that you use to predict a continuous outcome on the basis of one or more continuous predictor variables ► ANOVA is the statistical model that you use to predict a continuous outcome on the basis of one or more categorical predictor variables.")), p(actionButton(inputId="Next14", label=HTML('Next page ')), align="right") ), tabItem(tabName = "anovaMain", box(width=12, h3("Similarities", align="center"), HTML ("1. First, both models are applicable only when you have a continuous outcome variable. A categorical outcome variable would rule out the use of either a regression model or an ANOVA model."), HTML ("2. Second, you can use the regression algorithm, which is based on the principle of least squares, to fit an ANOVA model. You don't have to use the least squares principle because there are other ways to produce the ANOVA model. But because least squares, the basis for regression models, also works for ANOVA models"), HTML ("3. Third, the concept of partitioning variation into sums of squares (SS) in an ANOVA model also provides a nice way to examine complex regression models. "), HTML(" In an ANOVA: the categorical variable is effect coded, which means that each categosy's mean is compared to the grand mean"), HTML(" In a Regression: the categorical variable is dummy coded, which means that each category's intercept is compared to the reference group's intercept. Since the intercept is defined as the mean value when all the other predictors are equal to zero, and there are no other predictors, the three intercepts are just means."), br (), h4 ("Look at the outputs below"), HTML ("The intercept in the regression model is simply the mean of the reference group (i.e. the American; see the ANOVA table). The coefficients for the other two groups are the differences in the mean between the reference group and the other groups (see the computations in the Regression table) "), br (), HTML ("<center> The ANOVA reports each mean and a p-value that shows whether at least two of these means are significantly different. Similarly, a regression reports only one mean (as an intercept) and the difference between that one mean and all other means, with p-values evaluating these specific comparisons. </center>"), HTML ("<center> Both the F- and p-values for the ANOVA and Regression model are identical (34.54 and <0.001 respecitvely) </center>"), withMathJax(), "$$R^2=\\frac{SSm}{SSt}=\\frac{39,279}{54,632}=0.719$$" ), box(width=12, h3("Differences", align="center"), HTML("ANOVA is a tool to check how much the residual variance is reduced by predictors in (nested regression) models, whereas the regression analysis aims to quantify effect sizes in terms of 'how much is the response expected to change when the predictor(s) change by a given amount?' ►For categorical predictors this reduces to the question to 'what is the expected difference in the response between different groups/categories?' ► For continuous predictors this is the questions for a slope"), p ("To clarify: ANOVA can be applied to any regression model (no matter if the model contains only continuous, only categorical, or both kinds of predictors). ANOVA allows to assess the impact of a predictor or a whole set of predictors on the residuals: who much of the variance in the data can be explained by these predictors? The regression analysis, on the other hand, is a complementary tool to asses the quantitative relation between a predictor and the response.") ), box(width=6, HTML("<center>ANOVA Table</center>"), shiny::dataTableOutput('tableAnova')), box(width=6, HTML("<center>Regression Table</center>"), shiny::dataTableOutput('tableReg')), box(width=6, HTML("<center>ANOVA Output from R</center>"), verbatimTextOutput("anovaStats")), box(width=6, HTML("<center>Regression Output from R</center>"), verbatimTextOutput("regStats"))), ############## ## GLOSSARY ## ############## tabItem(tabName = "glossary", box(width=12, p("Coming Soon") ) ) ))))