Model Comparison

WHAT IS THIS ALL ABOUT?

In this app, you'll learn about how even somewhat complex models, such as those including an interaction term, still boil down to the basic equation:

Data = model + error

Last year, you participated in a marathon with some of your friends. You trained many hours and obtained a good result, but you would like to improve further. This time you're interested in exploring potential relationships between independent variables:

Whether training on the previous day is good or bad for performance;
Whether receiving a message on the previous day is good or bad for performance;
Whether there is any combination of these two factors that is particularly improves performance.

THE DATASET

So, let's first have a look at the information about last year's marathon on our variables of interest.

TEST YOUR KNOWLEDGE

Here is a challenge for you! Based on the data table above and the summary table on your right, try to answer these questions:

1.How many people took part in the marathon last year?

2.How much time did it take for participant number fifteen to complete the run?

3.Did participant 15 rest on the day before the marathon?

4.Did participant 16 have a massage on the day before the marathon?

5.What should one do on the day before the marathon?

6.Who was first to finish the race?

7.Who finished last?

SUMMARY

What have you learned so far?

The basic recipe for any model is: Data = model + error. By now you should be familiar with the first part of the recipe (Data):

You gained some insights into your research question and how to go to answer it.
You have seen the available data, and briefly explored it. But this is just the first step!

What are you going to learn next?

You are going to get to know the second part of the recipe: what is a MODEL? What is ERROR?
What is the null model?
What are the parameters of the null model?

Research question:
You aim at understanding what you should do to improve your marathon performance and you use the available data from last year to help you figure out how to improve.

YOUR FIRST GUESS

You are curious about how your friends Tom and Jerry performed last year, but you can't find their name in the dataset in which only participants'numbers are displayed. You don't have many information about them apart from the fact that they were there. What is the best guess you can make about their time?
You can now use your recipe to answer this question. The 'i' subscript indicates the individual partcipant.

Datai = model + errori

The best guess would probably be the most frequent time people spent to complete the marathon. In statistical language this is called 'the mode', and in special cases (e.g. when data are normally distributed), the mode equals the mean value. Let's assume that this is the case, you calculate the mean time and you find that it was about 4 hours and 15 minutes, so you go for that.

TimeJerry = mean time + errorJerry

TimeTom = mean time + errorTom

When you meet your friends, you decide to actually ask them for the real time. Jerry seems to be quite offended by your guess, he had performed way better! He made it in a little more than 2 hours and a half!! However, Tom was happily surprised by your guess, it took 4 hours and 9 minutes for him! Maybe you can improve your guess (model) by considering more information and adding it into your model.

WHAT IS AN ERROR?

Your prediction was better for Tom than for Jerry, so we can say that you made a greater error at predicting Jerry's time than Tom's. You can look at the next section to get a better grasp of what an error is. Look at the graph below. On the y-axis is represented last year participant's time. On the x-axis you find the participant number (id). Your model (the mean) is represented as a straight line, and it is the time that you would predict for each participant to finish the marathon. For a lot of participants, you made an error using this model. The red line represents the error you commited in your prediction for Jerry's time, whereas the green line is the error associated with the time you predicted for Tom.

As you can see the model is not that good. Maybe you can improve your guess (model) by considering more information to add into your model.

How do you wish to visualize it?

Scatterplot

Histogram(SSE)

TEST YOUR KNOWLEDGE

1.What is the sum of standard error when the null model is fitted?

2.What estimate is used to fit the null model?

3.What is your model trying to predict?

SUMMARY

What have you learned so far?
By now you should be familiar with the null model and the error in the model. The basic recipe for any model is: Datai = model + errori.

What are you going to learn next?

What is the proposed model?
What are the parameters of the proposed model?
How good is the model?

Research question:
Don't forget our research question! You want to understand what is important in order to succeed in a marathon.

IMPROVING YOUR GUESSES

As you saw in the previous section, your model was not very good at predicting your friends'performance. How can you improve it? In the dataset of last year's marathon, you had information about whether each participant rested or had a massage on the day before the marathon.

You decide to start by looking if resting on the day before the marathon might lead to better performance in the marathon. You can now build a new model with this information and see if your predictions improve:

Marathon Finish Time = Intercept + Effect of rest + Error

Yi = b0 + b1*x1 + Error

In which Yi = the outcome variable, b0 = the intercept or b0 coefficient, b1 = the slope or b1 coefficient, and X1 = the predictor variable.

Intercept (b0):
You assume that the relationship between the two variables (time and rest) can be described with a line. The intercept represents the performance (i.e., value on the y-axis) for someone who has not rested on the day before the marathon (i.e., has a value of 0 on the x-axis).

Effect of resting on the day before the marathon (b1):
You would like to see if resting on the day before the marathon is important for succeeding in the marathon. The answer to this question lies in the b1 coefficient. As rest is a dichotomous variable (i.e., a nominal/categorical variable with only two categories), in which 0 = no rest, 1 = rest, the slope represents the difference in the mean marathon performance between these two groups of participants (i.e., those who rested vs. those who did not rest).

FITTING THE MODEL TO THE DATA

Now you can visually evaluate the length of the line. The model seems to better predict the performance of both your friends (the error is reduced). But most of all, it seems that the predictor 'Rested day before' is significantly related to the outcome variable (performance at the marathon). To know if that's the case you will have to compute a significance test, which you will find in the inference section. At this point, you're interested in including more variables and relationships in your model, so please go to the next tab in this page, labeled "More complex models".

How do you wish to visualize it?

Scatterplot

Histogram

EVALUATING YOUR FIRST MODEL - Part I

Let us now directly compare the null model (mean finishing time) and the model with your best predictor - resting the day before. We will do so by visualising the null and the proposed model and how the additon of the predictor improves your model by visualising the reduction in sum squared errors (SSE).

Remember that you are comparing the following models:

Marathon Finish Time = Mean Finishing Time + Error

Marathon Finish Time = Intercept + Effect of rest + Error

MODEL DETAILS

In the graph below you can now see the two models: the red line, representing the Proposed Model 1, and the blue line, representing the null model. Which one of the two models allows you to make better predictions?

You can answer this question more precisely if you are able to quantify which model has the least error. You can take each participant's time and subtract the predicted time based on your model. You do this for each participant, and sum up the results. What do you obtain?

The result is zero! This does not help, in fact to avoid cancelling out opposite values and in order to give each one a weight, we have to square each difference before summing them up. At this point, we have the so called 'sum of squared errors (SSE)', which we need to divide for the number of participants to obtain the 'mean sum of squared errors' (MSSE).

To compare how good each model is, we need to calculate the so called 'R-squared' which is the ratio between the explained variation that the model explains (SSM) and the total variation (sum of SSE and SSM).

REGRESSION vs ONE-WAY ANOVA

Now we wil explore two cases of General Linear Models: a simple regression and a One-Way ANOVA

A very simple explanation is the following:
►Regression is the statistical model that you use to predict a continuous outcome on the basis of one or more continuous predictor variables
► The One-way ANOVA is the statistical model that you use to predict a continuous outcome on the basis of one or more categorical predictor variables.

Similarities

1. First, both models are applicable only when you have a continuous outcome variable. A categorical outcome variable would rule out the use of either a regression model or a One-way ANOVA.

2. Second, you can use the regression algorithm, which is based on the principle of least squares, to fit an ANOVA model.

3. Third, the concept of partitioning variation into sums of squares (SS) in an ANOVA model also provides a nice way to examine complex regression models.

In an ANOVA:
the categorical variable is effect coded, which means that each categosy's mean is compared to the grand mean

In a Regression:
the categorical variable is dummy coded, which means that each category's intercept is compared to the reference group's intercept. Since the intercept is defined as the mean value when all the other predictors are equal to zero, and there are no other predictors, the three intercepts are just means.

Look at the outputs below

The intercept in the regression model is simply the mean of the reference group (i.e. the American; see the ANOVA table). The coefficients for the other two groups are the differences in the mean between the reference group and the other groups (see the computations in the Regression table)

The one-way ANOVA reports each mean and a p-value that shows whether these means are significantly different from each other. Similarly, a regression reports only one mean (as an intercept) and the difference between that one mean and other means, with p-value evaluating these specific comparison.

Both the F- and p-values for the ANOVA and Regression model are identical (34.54 and <0.001 respecitvely)

Differences

The One-way ANOVA is a tool to check how much the residual variance is reduced by predictors in (nested regression) models, whereas the regression analysis aims to quantify effect sizes in terms of 'how much is the response expected to change when the predictor(s) change by a given amount?'
►For categorical predictors this reduces to the question to 'what is the expected difference in the response between different groups/categories?'
► For continuous predictors this is the questions for a slope

To clarify: a One-way ANOVA can be applied to any regression model (no matter if the model contains only continuous, only categorical, or both kinds of predictors). A One-way ANOVA allows to assess the impact of a predictor or a whole set of predictors on the residuals: who much of the variance in the data can be explained by these predictors? The regression analysis, on the other hand, is a complementary tool to asses the quantitative relation between a predictor and the response.

ANOVA Table

Regression Table

ANOVA Output from R

Regression Output from R

EVALUATING YOUR FIRST MODEL - Part II

Test statistic = Estimated effect / Error of the estimation

This is the basic equation underlying test statistics. Test statistics basically represent an effect in the unit if a specific theoretical distribution (here, T-Student and F distributions). When we compute a test statistics we can evaluate how likely that value (or a more extreme value) is to occur. And with this we can say if an effect is happening by chance or not.

REPORTING RESULTS

Here's how R would print the results of our regression:

How would you report your results (APA style)? The important effect here is given by the b1. b0 only tests the difference between the group that is coded 0 in our predictor variable (in the current case, no rest) and zero, so it is usually of little interest. So the report of the results will be on these this parameter. There are two different ways to do this:

What about R2? Shouldn't we be reporting R2 as well?

FURTHER IMPROVING YOUR GUESSES

When you use the information about whether or not the participants had a massage on the day before to estimate their marathon finishing time, the model looks like this:

Marathon Finish Time = Intercept + Effect of rest + Effect of massage + Error

Yi = b0 + b1*x1 + b2*x2 + Error

FITTING THE MODEL TO THE DATA

Below, you can find a table and graph for the proposed model with two predictors (rest and massage). Notice how instead of one line, as in the null model, or two lines, as in the proposed model with only one predictor, we now have four lines, each representing the expected value for each combination of the two dichotomous predictor variables. For example, a participant who rested and who had a massage on the day before the marathon is expected to have a worse performance (i.e., is expected to take longer) than a participant who didn't rest and had a massage.

How do you wish to visualize it?

Scatterplot

Histogram (SSE)

Is this model better or worse than the previous one, in which we used only rest as a predictor?

EVALUATING YOUR SECOND PROPOSED MODEL - Part I

Let us now directly compare the null model (mean finishing time) and the model with two predictors - resting the day before and having a massage. We will do so by visualising the null and the proposed model with two predictors and how the additon of these predictors improve your model by visualising the reduction in sum squared errors (SSE).

Remember that you are comparing the following models:

Marathon Finish Time = Mean Finishing Time + Error

Marathon Finish Time = Intercept + Effect of rest + Effect of massage + Error

MODEL DETAILS

In the graph below you can now see two models: the blue line, representing the null model, and a set of four lines: purple, red, yellow, and green, representing the proposed model with two predictors. Remember that the four lines in the proposed model now represent the expected value depending on the combination of values of the participant in the two predictor variables. By clicking the buttons above the graph you can see how the distances change depending on which model (i.e., line or set of lines) is used to predict the data.

Based on the scatterplot and the histogram, which one of the two models allows you to make better predictions?

EVALUATING YOUR SECOND PROPOSED MODEL - Part II

Test statistic = Estimated effect / Error of the estimation

REPORTING RESULTS

Here's how R would print the results of our regression:

How would you report your results (APA style)? The important effects here are given by the b1 and the b2 parameters. b0 only tests the difference between the group that is coded 0 in all dichotomous variables (in the current case, no rest and no massage) and zero, so it is usually of little interest. So the report of the results will be on these two parameters. There are two different ways to do this:

FURTHER IMPROVING YOUR GUESSES

When you use the information about both predictor and add the interaction between them to estimate marathon finishing time, the model looks like this:

Marathon Finish Time = Intercept + Effect of rest + Effect of massage + (Effect of rest * Effect of massage) + Error

Yi = b0 + b1*x1 + b2*x2 + b3(x1*x2) + Error

FITTING THE MODEL TO THE DATA

How do you wish to visualize it?

Scatterplot

Histogram (SSE)

Is this model better or worse than the previous one, in which we did not consider the interaction between both variables?

EVALUATING YOUR THIRD PROPOSED MODEL - Part I

Let us now directly compare the null model and the model with two predictors and the interaction between them. We will do so by visualising both models and how the additon of the interaction improves your model by visualising the reduction in sum squared errors (SSE).

Remember that you are comparing the following models:

Marathon Finish Time = Mean Finishing Time + Error

Marathon Finish Time = Intercept + Effect of rest + Effect of massage + (Effect of rest * Effect of massage) + Error

MODEL DETAILS

Based on the scatterplot and the histogram, which one of the models allows you to make better predictions?

REGRESSION vs TWO-WAY ANOVA

Now we wil explore two cases of General Linear Models: a multiple regression and a two-way ANOVA

A very simple explanation is the following:
► Multiple regression is the statistical model that you use to predict a continuous outcome on the basis of more than one continuous predictor variable.
► The two-way ANOVA is the statistical model that you use to predict a continuous outcome on the basis of two categorical predictor variables.

Similarities

1. First, both models are applicable only when you have a continuous outcome variable. A categorical outcome variable would rule out the use of both a regression model or an ANOVA.

2. Second, you can use the regression algorithm, which is based on the principle of least squares, to fit an ANOVA model.

3. Third, the concept of partitioning variation into sums of squares (SS) in an ANOVA model also provides a nice way to examine complex regression models.

In an ANOVA:
the categorical variable is effect coded, which means that each categosy's mean is compared to the grand mean.

In a regression:
the categorical variable is dummy coded, which means that each category's intercept is compared to the reference group's intercept. Since the intercept is defined as the mean value when all the other predictors are equal to zero, and there are no other predictors, the three intercepts are just means.

Look at the outputs below

Both the F- and p-values for the ANOVA and Regression model are identical (34.54 and <0.001 respecitvely)

Differences

ANOVA Table

Regression Table

ANOVA Output from R

Regression Output from R

EVALUATING YOUR THIRD PROPOSED MODEL - Part II

Test statistic = Estimated effect / Error of the estimation

REPORTING RESULTS

Here's how R would print the results of our regression:

How would you report your results (APA style)? The important effects here are given by b1, b2, and the interaction parameters. Again, b0 only tests the difference between the group that is coded 0 in all dichotomous variables (in the current case, no rest and no massage) and zero, so it is usually of little interest. So the report of the results will be on these three parameters. There are two different ways to do this:

GLOSSARY

Here you will find the definitions of the key statistical terms that you have covered whilst using this app.

RECOMMENDED READING

Judd, C. M., & McClelland, G. H. (2017). Data analysis: A model comparison approach (3rd ed.). San Diego, CA: Harcourt, Brace, Jovanovich.

#####PACKAGES########## library(shiny) library(ggplot2) library(calibrate) library(MASS) library(shinyBS) library(broom) library(DT) library(shinydashboard) #######SERVER############### server = function(input, output, session) { ############################### ######## GENERATE DATA ######## ############################### # Set a seed to obtain the same data everytime set.seed(1234) id <- 1:28 x1 <- rep(c(0, 1), each=14) # rest x1l <- rep(c("no", "yes"), each=14) x2 <- c(rep(c(0, 1), each=7), rep(c(0, 1), each=7)) # massage x2l <- c(rep(c("no", "yes"), each=7), rep(c("no", "yes"), each=7)) # massage x3 <- x1 * x2 # interaction xboth <- c(rep(c("None"), each=7), rep(c("Mass."), each=7), rep(c("Rested"), each=7), rep(c("Both"), each=7)) # massage # Baseline marathon performance y <- round(rnorm(28, 400, 40)) # Now add main effect of rest, main effect of massage, and interaction y <- y + round(rnorm(28, (80*x1), 2)) y <- y + round(rnorm(28, (-80*x2), 2)) y <- y + round(rnorm(28, (80*x1*x2), 2)) # Now, bind them in one dataframe df <- data.frame(cbind(id, y, x1, x2, x3)) dfToDisplay <- data.frame(cbind(id, y, x1l, x2l, xboth)) # Now we define some labels to our variables idlabel <- "Participant" ylabel <- "Finishing time [mins]" x1label <- "Rest" x2label <- "Massage" xbothlabel <- "Rest & Massage" # And set the upper limit and lower limit of the plots topLimit <- round(max(y)+75, -2) botLimit <- round(min(y)-75, -2) ################################## ######## GET USER CHOICES ######## ################################## rv <- reactiveValues(getmodel = 1) rv2 <- reactiveValues(getmodel2 = 1) rv3 <- reactiveValues(getmodel3 = 1) observeEvent(input$null, {rv$getmodel <- 1 }) observeEvent(input$full, {rv$getmodel <- 2 }) observeEvent(input$sse, {rv$getmodel <- 3 }) observeEvent(input$null2, {rv2$getmodel2 <- 1 }) observeEvent(input$full2, {rv2$getmodel2 <- 2 }) observeEvent(input$sse2, {rv2$getmodel2 <- 3 }) observeEvent(input$null3, {rv3$getmodel3 <- 1 }) observeEvent(input$full3, {rv3$getmodel3 <- 2 }) observeEvent(input$sse3, {rv3$getmodel3 <- 3 }) getplot <- reactive(switch(input$plot_type_null, scatter = "0", hist = "1")) getplotprop1 <- reactive(switch(input$plot_type_prop1, scatter = "0", hist = "1")) getplotprop2 <- reactive(switch(input$plot_type_prop2, scatter = "0", hist = "1")) getplotprop3 <- reactive(switch(input$plot_type_prop3, scatter = "0", hist = "1")) getplotS <-reactive(switch(input$plots, tstat = "0", fstat = "1")) ########################################### ######## MODEL VARIABLES AND STATS ######## ########################################### # X models xmod <- lm(df$y ~ df$x1) xmod.2 <- lm(df$y ~ df$x1 + df$x2) xmod.3 <- lm(df$y ~ df$x1 * df$x2) # Get the summary outputs into a data.frame xmodel <- as.data.frame(tidy(xmod)) xmodel.2 <-as.data.frame( tidy(xmod.2)) xmodel.3 <- as.data.frame(tidy(xmod.3)) # Add col and row names colnames(xmodel) <- (c("Term", "Estimate", "Std. Error", "t value", "p-value")) rownames(xmodel) <- (c("Intercept (b0)", "Slope (b1)")) colnames(xmodel.2) <- (c("Term", "Estimate", "Std. Error", "t value", "p-value")) rownames(xmodel.2) <- (c("Intercept (b0)", "Slope (b1)", "Slope (b2)")) colnames(xmodel.3) <- (c("Term", "Estimate", "Std. Error", "t value", "p-value")) rownames(xmodel.3) <- (c("Intercept (b0)", "Slope (b1)", "Slope (b2)", "Slope (interaction)")) # Error/residuals for the null model Ymodelerror <- round(df$y - mean(df$y), 2) # Compute the expected means for the four groups in Proposed Model 2 expM1.2 <- xmodel.2[1,2] expM2.2 <- xmodel.2[1,2] + xmodel.2[3,2] expM3.2 <- xmodel.2[1,2] + xmodel.2[2,2] expM4.2 <- xmodel.2[1,2] + xmodel.2[2,2] + xmodel.2[3,2] #### PARAMETERS ESTIMATION #### # b0 b0 <- xmodel[1,2] b0.2 <- xmodel.2[1,2] b0.3 <- xmodel.3[1,2] # b1s b1 <- xmodel[2,2] b1.2 <- xmodel.2[2,2] b1.3 <- xmodel.3[2,2] # b2s b2.2 <- xmodel.2[3,2] b2.3 <- xmodel.3[3,2] # Interaction b3.3 <- xmodel.3[4,2] #### PROPOSED MODEL ERROR #### # We can calculate both the prediction and, hence, the proposed model error: prediction <- round(b0 + df$x1*b1, 2) fullmodelerror <- round(residuals(xmod), 2) prediction.2 <- round(b0.2 + df$x1*b1.2 + df$x2*b2.2, 2) fullmodelerror.2 <- round(residuals(xmod.2), 2) prediction.3 <- round(b0.3 + df$x1*b1.3 + df$x2*b2.3 + (df$x1*df$x2)*b3.3, 2) fullmodelerror.3 <- round(residuals(xmod.3), 2) # Sum of squared errors for all models, null, x, and proposed: SSEnull <- sum(Ymodelerror^2) SSEfull <- sum(fullmodelerror^2) SSEfull.2 <- sum(fullmodelerror.2^2) SSEfull.3 <- sum(fullmodelerror.3^2) # We take this opportunity to set the top limit of the histograms histTop <- round(SSEnull, -4) # Sum of squared residuals: SSR <- sum(prediction - mean(df$y))^2 SSR.2 <- sum(prediction.2 - mean(df$y))^2 SSR.3 <- sum(prediction.3 - mean(df$y))^2 SS <- c(SSEnull, SSEfull, SSEnull - SSEfull) SS.2 <- c(SSEnull, SSEfull.2, SSEnull - SSEfull.2) SS.3 <- c(SSEnull, SSEfull.3, SSEnull - SSEfull.3) #### DATA ANALYSIS - B0 t-test statistics #### # Standard error of the full model se <- sigma(xmod) se.2 <- sigma(xmod.2) se.3 <- sigma(xmod.3) # Standard error of b0 seb0 <- xmodel[1, 3] seb0.2 <- xmodel.2[1, 3] seb0.3 <- xmodel.3[1, 3] # t-value of b0 b0_t_statistics <- xmodel[1, 4] b0_t_statistics.2 <- xmodel.2[1, 4] b0_t_statistics.3 <- xmodel.3[1, 4] # p-value of b0 P_v0 <- xmodel[1, 5] P_v0.2 <- xmodel.2[1, 5] P_v0.3 <- xmodel.3[1, 5] #### DATA ANALYSIS - B1 t-test statistics #### # Standard error of the x model seb1 <- xmodel[2, 3] seb1.2 <- xmodel.2[2, 3] seb1.3 <- xmodel.3[2, 3] # t-value of b1 b1_t_statistics <- xmodel[2, 4] b1_t_statistics.2 <- xmodel.2[2, 4] b1_t_statistics.3 <- xmodel.3[2, 4] # t-value of b2 b2_t_statistics.2 <- xmodel.2[3, 4] b2_t_statistics.3 <- xmodel.3[3, 4] # t-value of b3 b3_t_statistics.3 <- xmodel.3[4, 4] # p-value of b1 P_v1 <- xmodel[2, 5] P_v1.2 <- xmodel.2[2, 5] P_v1.3 <- xmodel.3[2, 5] #### F STATISTICS FOR THE PROPOSED MODEL #### # F STATISTICS F_sta <- anova(xmod)[[4]][1] F_sta.2 <- anova(xmod.2)[[4]][1] F_sta.3 <- anova(xmod.3)[[4]][1] # P-VALUE P_v <- anova(xmod)[[5]][1] P_v.2 <- anova(xmod.2)[[5]][1] P_v.3 <- anova(xmod.3)[[5]][1] # F statistic dfs(1, 2) dfF1 <- anova(xmod)[[1]][1] dfF1 <- anova(xmod)[[1]][1] dfF1 <- anova(xmod)[[1]][1] dfF2 <- anova(xmod)[[1]][2] dfF2.2 <- anova(xmod.2)[[1]][2] dfF2.3 <- anova(xmod.3)[[1]][2] # Sample size df dft <- length(df$x1) - 1 dft.2 <- length(df$x2) - 1 dft.3 <- length(df$x3) - 1 # covariance Covxy <- cov(df$x1, df$y) Covxy.2 <- cov(df$x2, df$y) Covxy.3 <- cov(df$x3, df$y) # creating a data.frame that contains the mean of the data to plot error for the null meany <- rep(mean(df$y), 28) ## creating the proposed model 2 m2 <- summary(lm(df$y~df$x1+df$x2)) m2t <- data.frame(m2$coefficients) b0_coef2 <- m2t[1,1] b1_coef2 <- m2t[2,1] b2_coef2 <- m2t[3,1] ## creating the proposed model 3 m3 <- summary(lm(df$y~df$x1+df$x2+df$x2*df$x1)) m3t <- data.frame(m3$coefficients) b0_coef3 <- m3t[1,1] b1_coef3 <- m3t[2,1] b2_coef3 <- m3t[3,1] b12_coef3 <- m3t[4,1] ########################### ###### TABLE HEADERS ###### ########################### nullmodlabel <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Null model predictions">Null Y\'i</a>') nullreslabel <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Null model errors">Null e\'i</a>') fullmodlabel <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Proposed model predictions">Proposed Y\'i</a>') fullreslabel <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Proposed model errors">Proposed e\'i</a>') ######################### ### Next page buttons ### ######################### observeEvent(input$Next1, {updateTabItems(session, "tabs", "nullmodel")}) observeEvent(input$Next2, {updateTabItems(session, "tabs", "propmodel")}) observeEvent(input$Next3, {updateTabItems(session, "tabs", "comparison")}) observeEvent(input$Next4, {updateTabItems(session, "tabs", "inference")}) ################################### #### INTRODUCTION page outputs #### ################################### output$tableData <- shiny::renderDataTable({ cc <- data.frame(df$id, df$y, dfToDisplay$x1l, dfToDisplay$x2l) colnames(cc) <- c(idlabel, ylabel, x1label, x2label) cc }, options = list(searching=F, pageLength=10, lengthChange=F), escape=F) ################################# #### NULL model page outputs #### ################################# #### Null - TABLES #### output$tableNull <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, round(mean(df$y), 2), Ymodelerror) colnames(aa) <- c(idlabel, ylabel, nullmodlabel, nullreslabel) aa }, options = list(searching=F, pageLength=10, lengthChange=F), escape = FALSE) #### Null - SCATTERPLOT #### output$plotNull <- renderPlot({ if (getplot()==0) { plot(data.frame(df$id, df$y), xlab = idlabel, ylab = ylabel, pch=16, ylim=c(botLimit, topLimit), xlim=c(1, 28), # make Tom and Jerry's points bigger and colorful col = ifelse(df$id == 7, "green3" , ifelse(df$id == 13, "red", "black")) ) abline(h = mean(df$y), col="#337ab7") segments(df$id, df$y, df$id, meany, col = ifelse(df$id == 7, "green3" , ifelse(df$id == 13, "red", "black"))) text(8.5, df$y[7], "Tom :)", col = "green3") text(14.5, df$y[13], "Jerry :(", col = "red") } #### Null - HISTOGRAM #### else { barplot(SSEnull,col="blue", names.arg =("Null model error"), ylab = "SSE", xlim = c(0, 4), ylim=c(0, histTop)) legend("topright", inset=.05, cex = 1, title="SSE", paste(round(SSEnull,2)), horiz=FALSE, lty=c(1,1), lwd=c(2,2), col=c("blue")) }}) #### Null - CAPTIONS #### output$Cap <- renderText({ #### If user selected only data #### if (getplot()==0) { paste("This plot shows the finishing time for each particpant with an addition of the line that represents the null model.The black lines from each data point to the line represent the residual error for each point if the null model is fitted.", sep = "") } else { #### If user selected the histogram #### paste("This histogram shows the Sum Squared Errors (SSE) of the null model", sep = "") } }) ############################### #### PROPOSED MODEL TABLES #### ############################### output$tableProposed <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, dfToDisplay$x1l,prediction,fullmodelerror) colnames(aa) <- c(idlabel, ylabel, x1label, fullmodlabel, fullreslabel) aa }, options = list(searching=F, pageLength=10, lengthChange=F), escape=F) output$tableProposed.2 <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, dfToDisplay$x1l, dfToDisplay$x2l, prediction.2, fullmodelerror.2) colnames(aa) <- c(idlabel, ylabel, x1label, x2label, fullmodlabel, fullreslabel) aa }, options = list(searching=F, pageLength=10, lengthChange=F), escape=F) output$tableProposed.3 <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, dfToDisplay$xboth, prediction.3, fullmodelerror.3) colnames(aa) <- c(idlabel, ylabel, xbothlabel, fullmodlabel, fullreslabel) aa }, options = list(searching=F, pageLength=10, lengthChange=F), escape=F) ############################################# #### PROPOSED MODEL PRESENTATION OUTPUTS #### ############################################# #### Proposed Model 1 - SCATTERPLOT #### output$plotProposed <- renderPlot({ if (getplotprop1()==0) { plot(data.frame(df$id , df$y), col=c("red", "green")[dfToDisplay$x1l], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, mean(df[0:14,]$y), 14, mean(df[0:14,]$y), col="red") segments(15, mean(df[15:28,]$y), 28, mean(df[15:28,]$y), col="green") legend(x="bottomright", legend = c("didn't rest", "rested"), col=c("red","green"), pch=16) segments(df[0:14,]$id, df[0:14,]$y, df[0:14,]$id, mean(df[0:14,]$y), col = "red") segments(df[15:28,]$id, df[15:28,]$y, df[15:28,]$id, mean(df[15:28,]$y), col = "green") } #### Proposed Model 1- HISTOGRAM #### else { barplot(SSEfull,col="blue", names.arg =("Proposed model error"), ylab = "SSE", xlim = c(0, 4), ylim=c(0, histTop)) legend("topright", inset=.05, cex = 1, title="SSE", legend = round(SSEfull,2), horiz=FALSE, lty=c(1,1), lwd=c(2,2), col="blue") } }) #### Proposed Model 1 - PLOT CAPTIONS #### output$Cap.P1 <- renderText({ if (getplotprop1()==0) { paste("This plot shows the finishing time for each particpant with an addition of the line of best fit with ",x1label," as a predictor variable. The lines from each data point to the line represent the residual error for each point if this model is fitted.", sep = "") } else { paste("This histogram shows the Sum Squared Errors (SSE) when the model with ",x1label," as a predictor is fitted.", sep = "") } }) #### Proposed Model 2 - COLOR SCHEME #### col1 <- "#FF00FF" col2 <- "#FF3366" col3 <- "#FFCC33" col4 <- "#00FF00" cols <- c(col1, col2, col3, col4) #### Proposed Model 2 - SCATTER PLOT #### output$plotProposed.2 <- renderPlot({ if (getplotprop2()==0) { indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=c(col1, col2, col3, col4)[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, expM1.2, 7, expM1.2, col=col1) segments(8, expM2.2, 14, expM2.2, col=col2) segments(15, expM3.2, 21, expM3.2, col=col3) segments(22, expM4.2, 28, expM4.2, col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=c(col1, col2, col3, col4), pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, expM1.2, col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, expM2.2, col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, expM3.2, col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, expM4.2, col = col4) } #### Proposed Model 2 - HISTOGRAM #### else { barplot(SSEfull.2,col="blue", names.arg = paste("Two predictors", "model error"), ylab = "SSE", xlim = c(0, 4), ylim=c(0, histTop)) legend("topright", inset=.05, cex = 1, title="SSE", legend = round(SSEfull.2,2), horiz=FALSE, lty=c(1,1), lwd=c(2,2), col="blue") } }) #### Proposed Model 2 - PLOT CAPTIONS #### output$Cap.P2 <- renderText({ if (getplotprop2()==0) { paste("This plot shows the finishing time for each particpant with the addition of two variables as a predictors: ",x1label," and ",x2label,".", sep = "") } else { #### If user selected the histogram #### paste("This histogram shows the Sum Squared Errors (SSE) when a model with ",x1label," and ",x2label," as predictors is fitted.", sep = "") } }) #### Proposed Model 3 - SCATTER PLOT #### output$plotProposed.3 <- renderPlot({ if (getplotprop3()==0) { indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=c(col1, col2, col3, col4)[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, mean(df[1:7,]$y), 7, mean(df[1:7,]$y), col=col1) segments(8, mean(df[8:14,]$y), 14, mean(df[8:14,]$y), col=col2) segments(15, mean(df[15:21,]$y), 21, mean(df[15:21,]$y), col=col3) segments(22, mean(df[22:28,]$y), 28, mean(df[22:28,]$y), col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=c(col1, col2, col3, col4), pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, mean(df[1:7,]$y), col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, mean(df[8:14,]$y), col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, mean(df[15:21,]$y), col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, mean(df[22:28,]$y), col = col4) } #### Proposed Model 3 - HISTOGRAM #### else { barplot(SSEfull.3,col="blue", names.arg = paste("Interaction", "model error"), ylab = "SSE", xlim = c(0, 4), ylim=c(0, histTop)) legend("topright", inset=.05, cex = 1, title="SSE", legend = round(SSEfull.3,2), horiz=FALSE, lty=c(1,1), lwd=c(2,2), col="blue") } }) #### Proposed Model 3 - PLOT CAPTIONS #### output$Cap.P3 <- renderText({ #### If user selected only data #### if (getplotprop3()==0) { paste("This plot shows the finishing time for each particpant with the addition of the interaction between two predictors: ",x1label," and ",x2label,".", sep = "") } else { #### If user selected the histogram #### paste("This histogram shows the Sum Squared Errors (SSE) when a model with the interaction of ",x1label," and ",x2label," is fitted.", sep = "") } }) ########################################### #### PROPOSED MODEL COMPARISON OUTPUTS #### ########################################### #### Proposed Model 1 - COMPARISON TABLE #### output$tableComparison <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, x1l, round(mean(df$y), 2), Ymodelerror, prediction,fullmodelerror) colnames(aa) <- c(idlabel, ylabel, x1label, nullmodlabel, nullreslabel, fullmodlabel, fullreslabel) aa }, options = list(searching=F, pageLength=10, lengthChange=F), escape=F) #### Proposed Model 1 - COMPARISON SCATTER PLOT #### output$plotComparison <- renderPlot({ # Null selected if (rv$getmodel==1) { plot(data.frame(df$id , df$y), col=c("red", "green")[dfToDisplay$x1l], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) abline(h = mean(df$y), col="#337ab7") segments(df$id, df$y, df$id, meany, col = "#337ab7") legend(x="bottomright", legend = c("didn't rest", "rested"), col=c("red","green"), pch=16) } # Full selected else if (rv$getmodel==2){ plot(data.frame(df$id , df$y), col=c("red", "green")[dfToDisplay$x1l], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, mean(df[0:14,]$y), 14, mean(df[0:14,]$y), col="red") segments(15, mean(df[15:28,]$y), 28, mean(df[15:28,]$y), col="green") legend(x="bottomright", legend = c("didn't rest", "rested"), col=c("red","green"), pch=16) segments(df[0:14,]$id, df[0:14,]$y, df[0:14,]$id, mean(df[0:14,]$y), col = "red") segments(df[15:28,]$id, df[15:28,]$y, df[15:28,]$id, mean(df[15:28,]$y), col = "green") abline(h = mean(df$y), col="#337ab7") } #### Proposed Model 1 - COMPARISON HISTOGRAM #### else if (rv$getmodel==3) { diff = SSEnull-SSEfull t2 = matrix(c(SSEfull,diff)) tx <- barplot(SSEnull,width= 1,space = 0, col=c("#337ab7"), ylab = "SSE",beside=FALSE,xlim=c(0,4), names.arg =(c("Null model \n error")), ylim=c(0, histTop)) txx <-barplot(t2,width=1,space = 1.5 ,col=c("red","#DCDCDC"),ylab = "SSE", beside = FALSE, add = TRUE,xlim=c(0,4), names.arg =(c("Error of model \n with 1 predictor")), ylim=c(0, max(SSEnull))) text(x = txx, y = SSEfull+diff/2, "Error \n reduction") legend("topright", inset=.01, cex = 1, title="SSE", legend = round(c(SSEnull, SSEfull, diff),2), horiz=FALSE, lty=c(1,1), lwd=c(2,2), col=c("#337ab7","red","#DCDCDC")) }}) #### Proposed Model 2 - COMPARISON TABLE #### output$tableComparison2 <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, dfToDisplay$xboth, round(mean(df$y), 2), Ymodelerror, prediction.2, fullmodelerror.2) colnames(aa) <- c(idlabel, ylabel, xbothlabel, nullmodlabel, nullreslabel, fullmodlabel, fullreslabel) aa }, options = list(searching=F, pageLength=10, lengthChange=F), escape=F) #### Proposed Model 2 - COMPARISON SCATTER PLOT #### output$plotComparison2 <- renderPlot({ # Null selected if (rv2$getmodel2==1) { indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=cols[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) abline(h = mean(df$y), col="#337ab7") legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=cols, pch=16) #segments to null segments(df$id, df$y, df$id, mean(df$y), col = "blue") } # Full selected else if (rv2$getmodel2==2){ indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=cols[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) #null abline(h = mean(df$y), col="#337ab7") #proposed segments(1, expM1.2, 7, expM1.2, col=col1) segments(8, expM2.2, 14, expM2.2, col=col2) segments(15, expM3.2, 21, expM3.2, col=col3) segments(22, expM4.2, 28, expM4.2, col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=cols, pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, expM1.2, col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, expM2.2, col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, expM3.2, col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, expM4.2, col = col4) } #### Proposed Model 2 - COMPARISON HISTOGRAM #### else if (rv2$getmodel2==3) { diff2 = SSEnull-SSEfull.2 t3 = matrix(c(SSEfull.2,diff2)) tx <- barplot(SSEnull, width=1, space=0, col=c("#337ab7"), ylab = "SSE",beside=FALSE,xlim=c(0,4), names.arg =(c("Null model \n error")), ylim=c(0, histTop)) txxx <- barplot(t3, width=1, space=1.5 ,col=c("red","#DCDCDC"),ylab = "SSE", beside = FALSE, add = TRUE,xlim=c(0,4), names.arg =(c("Error of model \n with 2 predictors")), ylim=c(0, max(SSEnull))) text(x = txxx, y = SSEfull.2+diff2/2, "Error \n reduction") legend("topright", inset=.01, cex = 1, title="SSE", legend = round(c(SSEnull, SSEfull.2, diff2),2), horiz=FALSE, lty=c(1,1), lwd=c(2,2), col=c("#337ab7","red","#DCDCDC")) }}) #### Proposed Model 3 - COMPARISON TABLE #### output$tableComparison3 <- shiny::renderDataTable({ aa <-data.frame(df$id, df$y, dfToDisplay$xboth, round(mean(df$y), 2), Ymodelerror, prediction.3, fullmodelerror.3) colnames(aa) <- c(idlabel, ylabel, xbothlabel, nullmodlabel, nullreslabel, paste("Interaction ",fullmodlabel, sep = ""), paste("Interaction ",fullreslabel, sep = "")) aa }, options = list(searching=F, pageLength=10, lengthChange=F), escape=F) #### Proposed Model 3 - COMPARISON SCATTERPLOT #### output$plotComparison3 <- renderPlot({ if (rv3$getmodel3==1) { indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=cols[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) abline(h = mean(df$y), col="#337ab7") legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=cols, pch=16) #segments to null segments(df$id, df$y, df$id, mean(df$y), col = "blue") } # Full selected if (rv3$getmodel3==2){ indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=cols[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) #null abline(h = mean(df$y), col="#337ab7") #proposed segments(1, mean(df[1:7,]$y), 7, mean(df[1:7,]$y), col=col1) segments(8, mean(df[8:14,]$y), 14, mean(df[8:14,]$y), col=col2) segments(15, mean(df[15:21,]$y), 21, mean(df[15:21,]$y), col=col3) segments(22, mean(df[22:28,]$y), 28, mean(df[22:28,]$y), col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=cols, pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, mean(df[1:7,]$y), col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, mean(df[8:14,]$y), col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, mean(df[15:21,]$y), col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, mean(df[22:28,]$y), col = col4) } #### Proposed Model 3 - COMPARISON HISTOGRAM #### else if (rv3$getmodel3==3) { diff3 = SSEnull-SSEfull.3 int_x = matrix(c(SSEfull.3,diff3)) tx <- barplot(SSEnull, width=1, space=0, col=c("#337ab7"), ylab = "SSE",beside=FALSE,xlim=c(0,4), names.arg =(c("Null model \n error")), ylim=c(0, histTop)) int <- barplot(int_x,width=1,space = 1.5 ,col=c("red","#DCDCDC"),ylab = "SSE", beside = FALSE, add = TRUE,xlim=c(0,4), names.arg =(c("Error of model \n with interaction")), ylim=c(0, max(SSEnull))) text(x = int, y = SSEfull.3+diff3/2, "Error \n reduction") legend("topright", inset=.01, cex = 1, title="SSE", legend = round(c(SSEnull, SSEfull.3, diff3),2), horiz=FALSE, lty=c(1,1), lwd=c(2,2), col=c("#337ab7","red","#DCDCDC")) }}) #### COMPARISON - PLOT CAPTIONS #### output$Cap.Com <- renderText({ #### null #### if (rv$getmodel==1) { paste("This plot shows the finishing time for each participant with an addition of the line of best fit for the null model. Click to see how the residual error is reduced when a predictor is added to the model", sep = "") } #### proposed #### else if (rv$getmodel==2){ paste("This plot shows the finishing time for each participant with an addition of the line of best fit when a model with 'Resting the day before' is fitted. It is clear that additon of the predictor reduces the error", sep = "") } #### hist #### else if (rv$getmodel==3){ paste("This histogram shows the Sum Squared Errors (SSE) of the null model and the model with 'Resting the day before'as a predictor. It also shows how additon of the predicotr reduces the SSE.", sep = "") } }) # Proposed model 2 output$Cap.Com2 <- renderText({ #### null #### if (rv2$getmodel2==1) { paste("This plot shows the finishing time for each participant with an addition of the line of best fit for the null model. Click to see how the residual error is reduced when two predictors are added to the model", sep = "") } #### proposed #### else if (rv2$getmodel2==2){ paste("This plot shows the finishing time for each participant with an addition of the line of best fit when a model with two predictors is fitted. It is clear that additon of the predictor reduces the error.", sep = "") } #### hist #### else if (rv2$getmodel2==3){ paste("This histogram shows the Sum Squared Errors (SSE) of the null model, the model with one predictor: ",x1label," and the model with two predictors, ",xbothlabel," . It also shows how the additon of the predictors reduces the SSE.", sep = "") } }) ############################# #### ANOVA vs REGRESSION #### ############################# # #### PROPOSED MODEL 1 #### # output$R2equation_m1 <- renderUI({ #withMathJax("$$ { R^2=\\frac{SSmodel}{SStotal}=\\frac{",round(diff,0),"}{",round(SSEnull,0),"} # = ",round(R, 4),"}$$") }) output$tableAnova1 <- shiny::renderDataTable({ tanovay <- data.frame (aggregate (y ~ x1, data = df, FUN = mean)) tx1 <- c("No","Yes") tanova1<-cbind(tx1,round(tanovay$y,2)) colnames(tanova1) <- c (x1label, ylabel) tanova1 }, options = list(searching=F, pageLength=9, lengthChange=F, ordering=F, paging=F, info=F), escape=F) output$tableReg1 <- shiny::renderDataTable({ regcoef1 <- summary (lm (y~x1, data = df)) aa1 <- data.frame (round(regcoef1$coefficients [,1],2)) b1 <- c ("No", "Yes") c1 <- c ("", paste("(",round(mean(df[15:28,]$y),2)," - ", round(mean(df[0:14,]$y),2), ")", sep = "")) aa1 <- cbind (b1, aa1, c1) colnames(aa1) <- c (x1label, "Coefficients", "") aa1 }, options = list(searching=F, pageLength=9, lengthChange=F, ordering=F, paging=F, info=F), escape=F) output$anovaStats1 <- renderPrint({anova(lm(y ~ x1, df))}) output$regStats1 <- renderPrint({summary(lm(y ~ x1, df))}) #### PROPOSED MODEL 2 #### output$R2equation_m2 <- renderUI({ #withMathJax("$$ { R^2=\\frac{SSmodel}{SStotal}=\\frac{",round(diff,0),"}{",round(SSEnull,0),"} # = ",round(R, 4),"}$$") }) output$tableAnova2 <- shiny::renderDataTable({ tanovay <- data.frame (aggregate (y ~ x1 * x2, data = df, FUN = mean)) xtax1 <- c("None", "Rested only", "Massage only", "Both") tanova1<-cbind(xtax1,round(tanovay$y,2)) colnames(tanova1) <- c (xbothlabel, ylabel) tanova1 }, options = list(searching=F, pageLength=9, lengthChange=F, ordering=F, paging=F, info=F), escape=F) output$tableReg2 <- shiny::renderDataTable({ regcoef1 <- summary (lm (y ~ x1 * x2, data = df)) aa1 <- c(round(regcoef1$coefficients[1,1],2), round(regcoef1$coefficients[2,1],2)+round(regcoef1$coefficients[1,1],2), round(regcoef1$coefficients[3,1],2)+round(regcoef1$coefficients[1,1],2), round(regcoef1$coefficients[4,1],2)+round(regcoef1$coefficients[3,1],2)+round(regcoef1$coefficients[2,1],2)+round(regcoef1$coefficients[1,1],2) ) b1 <- c("None", "Rested only", "Massage only", "Both") c1 <- c ("", paste("(",round(regcoef1$coefficients[1,1],2)," + ", round(regcoef1$coefficients[2,1],2), ")", sep = ""), paste("(",round(regcoef1$coefficients[1,1],2)," - ", abs(round(regcoef1$coefficients[3,1],2)), ")", sep = ""), paste("(",round(regcoef1$coefficients[1,1],2)," + ", round(regcoef1$coefficients[2,1],2), " - ", abs(round(regcoef1$coefficients[3,1],2)), " + ", round(regcoef1$coefficients[4,1],2), ")", sep = "") ) aa1 <- cbind (b1, aa1, c1) colnames(aa1) <- c (xbothlabel, "Coefficients", "") aa1 }, options = list(searching=F, pageLength=9, lengthChange=F, ordering=F, paging=F, info=F), escape=F) output$anovaStats2 <- renderPrint({anova(lm(y ~ x1 * x2, df))}) output$regStats2 <- renderPrint({summary(lm(y ~ x1 * x2, df))}) ################################## #### REPORT YOUR RESULTS 1 #### ################################## # Function to add buttons to the report your results table shinyInput <- function(FUN, len, id, ...) { inputs <- character(len) for (i in seq_len(len)) { inputs[i] <- as.character(FUN(paste0(id, i), ...)) } inputs } # Table headers Term <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Our parameters"<center>Term</center>') Estimate <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Estimation for each parameter"<center>Estimate</center>') S_Error <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Standard error for each estimated parameter"<center>Std. Error</center>') T_Value <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Test statistics (ratio between systematic and unsystematic variance)"<center>t value</center>') P_Value <- HTML('<a data-toggle="popover" data-trigger="focus" class="bootpop" title="Probability of obtaining an equal or more extreme t value if H0 is true."<center>p-value</center>') #### REPORT YOUR RESULTS 1 TABLE #### labelsc <- c(Term, Estimate, S_Error, T_Value, P_Value) if (xmodel[1,5] > 0.0005) {pvalueb1 <- sprintf("%.2f",round(xmodel[1,5], 3))} else {pvalueb1 <- "< 0.001"} if (xmodel[2,5] > 0.001) {pvalueb2 <- sprintf("%.2f",round(xmodel[2,5], 3))} else {pvalueb2 <- "< 0.001"} row1 <- c(shinyInput(actionButton, 1, 'button_1', label = "Intercept", onclick = 'Shiny.onInputChange(\"select_button\", this.id)'), sprintf("%.2f",round(xmodel[1,2],2)), sprintf("%.2f",round(xmodel[1,3],2)), sprintf("%.2f",round(xmodel[1,4],2)), pvalueb1 ) row2 <- c(shinyInput(actionButton, 1, 'button_2', label = x1label, onclick = 'Shiny.onInputChange(\"select_button\", this.id)' ), sprintf("%.2f",round(xmodel[2,2],2)), sprintf("%.2f",round(xmodel[2,3],2)), sprintf("%.2f",round(xmodel[2,4],2)), pvalueb2 ) tablecof <- rbind(row1, row2) colnames(tablecof) <- labelsc rownames(tablecof) <- NULL # Buttons to reveal table buttonRV <- reactiveValues(RV = NULL, RV2 = NULL, RV3 = NULL) observeEvent(input$showTstat, {buttonRV$RV <- TRUE}) # Introductory text for the coeff table output$introCoeff <- renderUI({ if (is.null(buttonRV$RV)) return() HTML('Below is the regression summary table: ') }) # Generate table if "Press to see the relevant test statistics" button is pressed output$tableTS <- DT::renderDataTable({ if (is.null(buttonRV$RV)) return() tablecof} , class = 'compact' , options = list(searching=F, pageLength=10, lengthChange=F, ordering=F, paging=F, info=F) , server=F, escape=F, selection='none') # Initiate the variables that the buttons will change inferenceV <- reactiveValues(betas = '') # Observe the button clicking within the table observeEvent(input$select_button, { selectedRow <- as.numeric(strsplit(input$select_button, "_")[[1]][2]) if (selectedRow == 11) {inferenceV$betas <- "b0"} else if (selectedRow == 21) {inferenceV$betas <- "b1"} }) # Prepare statitics for the text f <- summary(xmod)$fstatistic Radj <- summary(xmod)$adj.r.squared R <- summary(xmod)$r.squared pF <- round(pf(q=f[1], df1=f[2], df2=f[3], lower.tail=FALSE)) if (pF < 0.0001) { pF <- "< 0.0001" } # R^2 text output$R2 <- renderUI({ if (is.null(buttonRV$RV)) return() HTML( paste("Multiple R-squared: ", round(R, 4), ", Adjusted R-squared: ", round(Radj, 4), sep=""), " ", paste("F-statistic: ", round(f[1],4), " on ", round(f[2],4), " and ", round(f[3],4), " DF, p-value: ", pF, sep ="") ) }) #### REPORT YOUR RESULTS 1 SCATTER PLOT #### output$betaplot <- renderPlot({ #### When b0 is selected #### if (inferenceV$betas =="b0"){ plot(data.frame(df$id , df$y), col=c("red", "green")[dfToDisplay$x1l], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, mean(df[0:14,]$y), 14, mean(df[0:14,]$y), col="red") segments(15, mean(df[15:28,]$y), 28, mean(df[15:28,]$y), col="green") legend(x="bottomright", legend = c("didn't rest", "rested"), col=c("red","green"), pch=16) segments(df[0:14,]$id, df[0:14,]$y, df[0:14,]$id, mean(df[0:14,]$y), col = "red") segments(df[15:28,]$id, df[15:28,]$y, df[15:28,]$id, mean(df[15:28,]$y), col = "green") #intercept segments(0, mean(df[0:14,]$y), 1, mean(df[0:14,]$y), col="blue") segments(14, mean(df[0:14,]$y), 17, mean(df[0:14,]$y), col="blue") segments(16.5, (mean(df[0:14,]$y)+10), 17, mean(df[0:14,]$y), col="blue") segments(16.5, (mean(df[0:14,]$y)-10), 17, mean(df[0:14,]$y), col="blue") text(17.5, mean(df[0:14,]$y), paste('Intercept (b0) =', round(xmodel[1,2],2)), pos = 4, col="blue") } #### When b1 is selected #### else if (inferenceV$betas == "b1"){ plot(data.frame(df$id , df$y), col=c("red", "green")[dfToDisplay$x1l], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, mean(df[0:14,]$y), 14, mean(df[0:14,]$y), col="red") segments(15, mean(df[15:28,]$y), 28, mean(df[15:28,]$y), col="green") legend(x="bottomright", legend = c("didn't rest", "rested"), col=c("red","green"), pch=16) segments(df[0:14,]$id, df[0:14,]$y, df[0:14,]$id, mean(df[0:14,]$y), col = "red") segments(df[15:28,]$id, df[15:28,]$y, df[15:28,]$id, mean(df[15:28,]$y), col = "green") #b1 segments(14, mean(df[0:14,]$y), 15, mean(df[0:14,]$y), col="blue") segments(14, mean(df[15:28,]$y), 15, mean(df[15:28,]$y), col="blue") segments(14.5, mean(df[0:14,]$y), 14.5, mean(df[15:28,]$y), col="blue") text(14.5, (mean(df[0:14,]$y)+(round(xmodel[2,2],2)/2)), paste('b1 =', round(xmodel[2,2],2)), pos = 4, col="blue") } }) #### REPORT YOUR RESULTS 1 T PLOT #### # First, a color the area function colorArea <- function(from, to, density, ..., col="blue", dens=NULL){ y_seq <- seq(from, to, length.out=500) d <- c(0, density(y_seq, ...), 0) polygon(c(from, y_seq, to), d, col=col, density=dens) } # Then the t-plot output$tplot <- renderPlot({ #### When t0 is selected #### if (inferenceV$betas =="b0"){ ##Prints t distribution plot curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b0_t_statistics,col="blue") abline(v=(qt(c(.975), df=dft)),col="red") abline(v=-(qt(c(.975), df=dft)),col="red") text(b0_t_statistics,0.2, paste('t value =', round(b0_t_statistics,2)), pos= 2, col="blue") text(7,0.1, paste('critical value = [', round(qt(c(.975), df=dft),2), ",", round(-qt(c(.975), df=dft),2), "]"), pos=3, offset=2, col="red") colorArea(from=qt(c(.975), df=dft), to=3.75, dnorm, mean=0, sd=1, col=2, dens=65) colorArea(from= qt(c(.025), df=dft), to= -3.75, dnorm, mean=0, sd=1, col=2, dens=65) } #### When t1 is selected #### else if (inferenceV$betas =="b1"){ curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b1_t_statistics,col="chartreuse4") abline(v=(qt(c(.975), df=dft)),col="red") abline(v=-(qt(c(.975), df=dft)),col="red") text((b1_t_statistics),0.2, paste('t value =', round(b1_t_statistics,2)), pos=4, col="chartreuse4") text(7,0.1, paste('critical value = [', round(qt(c(.975), df=dft),2), ",", round(-qt(c(.975), df=dft),2), "]"), pos=3, offset=2, col="red") colorArea(from=qt(c(.975), df=dft), to=3.75, dnorm, mean=0, sd=1, col=2, dens=65) colorArea(from= qt(c(.025), df=dft), to= -3.75, dnorm, mean=0, sd=1, col=2, dens=65) } }) #### REPORT YOUR RESULTS 1 PLOT CAPTIONS #### output$plotsLabel <- renderUI({ if (inferenceV$betas =="b0"){ HTML(paste(" The effect here is ", round(xmodel[1,2],2), " and represents the marathon finishing time of someone who hasn\'t ",x1label,". The error for the intercept is ", round(xmodel[1,3],2), ". In this sense the test statistics is ", round(xmodel[1,2],2), " / ", round(xmodel[1,3],2), " = ", round(xmodel[1,4],2), ". The likelihood to have a test statistics of ", round(xmodel[1,4],2), " or a more extreme test statistic if H0 (i.e., the value of the intercept isn't different from zero) is true is is ", pvalueb1, ". ", sep="")) } else if (inferenceV$betas =="b1"){ HTML(paste(" The effect here is ", round(xmodel[2,2],2), " and represents the impact of having ", x1label," on ",ylabel," (i.e., if ",x1label,", how many minutes earlier should you expect to finish?). The error for the intercept is ", round(xmodel[2,3],2), ". In this sense the test statistics is ", round(xmodel[2,2],2), " / ", round(xmodel[2,3],2), " = ", round(xmodel[2,4],2), ". The likelihood to have a test statistics of ", round(xmodel[2,4],2), " or a more extreme test statistic if H0 (i.e., the value of the slope isn't different from zero) is true is ", pvalueb2, ". ", sep="")) } }) #### REPORTING RESULTS 1 TEXT BOX #### output$reporting1 <- renderUI ({ HTML(paste("The unstandardized coefficient way: The results show a significant negative effective of ",x1label," on ",ylabel,", b = ", round(xmodel[2,2],2), ", t(", f[3], ") = ", round(xmodel[2,4],2), ", p = .001.", sep="")) }) betaST <- round(summary(lm(scale(y) ~ scale(x1)))$coefficients[[2]], 2) tforcompare <- round(summary(lm(scale(y) ~ scale(x1)))$coefficients[2,3], 2) output$reporting2 <- renderUI ({ HTML(paste("The standardized coefficient way: The results show a significant negative effective of hours of ",x1label," on ",ylabel," β = ", betaST, ", t(", f[3], ") = ", round(xmodel[2,4],2), ", p = .001.", sep="")) }) # Let's show R2 output$RtoPrint <- renderUI({ HTML(paste('In this case there is no need for that because the test for the effect size is the same as the test for the b1 as there is only one quantitative independent variable (hours of training). Don\'t believe us? Here is the proof: Note that the F distribution is simply a squared t distribution. So, if all the effect size is due to b1, the square t value for b1 should be the same as the F value! Do the math and see for yourself (a little discrepancy is to expected due to rounding):', "<ul><li>t value of b1 = ", tforcompare, "</li><li>F value of R2 = ", round(f[1],2), "</li>", sep="")) }) output$regressionStats <- renderPrint({summary(lm(y ~ x1))}) ############################### #### REPORT YOUR RESULTS 2 #### ############################### #### REPORT YOUR RESULTS 2 TABLE #### # Mannually create the content of the table so we can format every element labelsc2 <- c(Term, Estimate, S_Error, T_Value, P_Value) if (xmodel.2[1,5] > 0.0005) {pvalueb1.2 <- sprintf("%.2f",round(xmodel.2[1,5], 3))} else {pvalueb1.2 <- "< 0.001"} if (xmodel.2[2,5] > 0.001) {pvalueb2.2 <- sprintf("%.2f",round(xmodel.2[2,5], 3))} else {pvalueb2.2 <- "< 0.001"} if (xmodel.2[3,5] > 0.001) {pvalueb3.2 <- sprintf("%.2f",round(xmodel.2[3,5], 3))} else {pvalueb3.2 <- "< 0.001"} row1.2 <- c(shinyInput(actionButton, 1, 'button_1', label = "Intercept", onclick = 'Shiny.onInputChange(\"select_button2\", this.id)'), sprintf("%.2f",round(xmodel.2[1,2],2)), sprintf("%.2f",round(xmodel.2[1,3],2)), sprintf("%.2f",round(xmodel.2[1,4],2)), pvalueb1.2 ) row2.2 <- c(shinyInput(actionButton, 1, 'button_2', label = x1label, onclick = 'Shiny.onInputChange(\"select_button2\", this.id)' ), sprintf("%.2f",round(xmodel.2[2,2],2)), sprintf("%.2f",round(xmodel.2[2,3],2)), sprintf("%.2f",round(xmodel.2[2,4],2)), pvalueb2.2 ) row3.2 <- c(shinyInput(actionButton, 1, 'button_3', label = x2label, onclick = 'Shiny.onInputChange(\"select_button2\", this.id)' ), sprintf("%.2f",round(xmodel.2[3,2],2)), sprintf("%.2f",round(xmodel.2[3,3],2)), sprintf("%.2f",round(xmodel.2[3,4],2)), pvalueb3.2 ) tablecof2 <- rbind(row1.2, row2.2, row3.2) colnames(tablecof2) <- labelsc2 rownames(tablecof2) <- NULL # Click to show table button observeEvent(input$showTstat2, {buttonRV$RV2 <- TRUE}) # Introductory text for the Coeff table output$introCoeff2 <- renderUI({ if (is.null(buttonRV$RV2)) return() HTML('Below is the regression summary table: ') }) # Conditional to show table on button press output$tableTS2 <- DT::renderDataTable({ if (is.null(buttonRV$RV2)) return() tablecof2} , class = 'compact' , options = list(searching=F, pageLength=10, lengthChange=F, ordering=F, paging=F, info=F) , server=F, escape=F, selection='none') # Initiate the variables that the buttons will change inferenceV2 <- reactiveValues(betas = '') # Observe the button clicking within the table observeEvent(input$select_button2, { selectedRow2 <- as.numeric(strsplit(input$select_button2, "_")[[1]][2]) if (selectedRow2 == 11) {inferenceV2$betas <- "b0"} else if (selectedRow2 == 21) {inferenceV2$betas <- "b1"} else if (selectedRow2 == 31) {inferenceV2$betas <- "b2"} }) # Prepare output f.2 <- summary(xmod.2)$fstatistic Radj.2 <- summary(xmod.2)$adj.r.squared R.2 <- summary(xmod.2)$r.squared pF.2 <- round(pf(q=f.2[1], df1=f.2[2], df2=f.2[3], lower.tail=FALSE)) if (pF.2 < 0.0001) {pF.2 <- "< 0.0001"} # Let's compile the R^2 text output$R2.2 <- renderUI({ if (is.null(buttonRV$RV2)) return() HTML( paste("Multiple R-squared: ", round(R.2, 4), ", Adjusted R-squared: ", round(Radj.2, 4), sep=""), " ", paste("F-statistic: ", round(f.2[1],4), " on ", round(f.2[2],4), " and ", round(f.2[3],4), " DF, p-value: ", pF.2, sep ="") ) }) #### REPORT YOUR RESULTS 2 SCATTER PLOT #### output$betaplot2 <- renderPlot({ #### When b0 is selected #### if (inferenceV2$betas =="b0"){ indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=c(col1, col2, col3, col4)[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, expM1.2, 7, expM1.2, col=col1) segments(8, expM2.2, 14, expM2.2, col=col2) segments(15, expM3.2, 21, expM3.2, col=col3) segments(22, expM4.2, 28, expM4.2, col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=c(col1, col2, col3, col4), pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, expM1.2, col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, expM2.2, col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, expM3.2, col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, expM4.2, col = col4) #intercept segments(0, expM1.2, 1, expM1.2, col="blue") segments(7, expM1.2, 9, expM1.2, col="blue") segments(8, (expM1.2+10), 9, expM1.2, col="blue") segments(8, (expM1.2-10), 9, expM1.2, col="blue") text(9, expM1.2, paste('Intercept (b0) =', round(xmodel.2[1,2],2)), pos = 4, col="blue") } #### When b1 is selected #### else if (inferenceV2$betas == "b1"){ indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=c(col1, col2, col3, col4)[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, expM1.2, 7, expM1.2, col=col1) segments(8, expM2.2, 14, expM2.2, col=col2) segments(15, expM3.2, 21, expM3.2, col=col3) segments(22, expM4.2, 28, expM4.2, col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=c(col1, col2, col3, col4), pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, expM1.2, col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, expM2.2, col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, expM3.2, col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, expM4.2, col = col4) #b1 segments(7, expM1.2, 8, expM1.2, col="blue") segments(7, expM3.2, 15, expM3.2, col="blue") segments(7.5, expM1.2, 7.5, expM3.2, col="blue") text(7.5, (xmodel.2[1,2]+(xmodel.2[2,2]/2)), paste('b1 =', round(xmodel.2[2,2],2)), pos = 4, col="blue") } else if (inferenceV2$betas == "b2"){ indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=c(col1, col2, col3, col4)[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, expM1.2, 7, expM1.2, col=col1) segments(8, expM2.2, 14, expM2.2, col=col2) segments(15, expM3.2, 21, expM3.2, col=col3) segments(22, expM4.2, 28, expM4.2, col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=c(col1, col2, col3, col4), pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, expM1.2, col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, expM2.2, col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, expM3.2, col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, expM4.2, col = col4) #b2 segments(7, expM1.2, 8, expM1.2, col="blue") segments(7, expM2.2, 8, expM2.2, col="blue") segments(7.5, expM1.2, 7.5, expM2.2, col="blue") text(7.5, (xmodel.2[1,2]+(xmodel.2[3,2]/2)), paste('b2 =', round(xmodel.2[3,2],2)), pos = 4, col="blue") } }) #### REPORT YOUR RESULTS 2 T PLOT #### output$tplot2 <- renderPlot({ #### When t0 is selected #### if (inferenceV2$betas =="b0"){ ##Prints t distribution plot curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b0_t_statistics.2,col="blue") abline(v=(qt(c(.975), df=dft)),col="red") abline(v=-(qt(c(.975), df=dft)),col="red") text(11,0.2, paste('t value =', round(b0_t_statistics.2,2)), pos= 2, col="blue") segments(11, 0.2, 12.5, 0.2, col="blue") segments(12.5, 0.2, 12.25, 0.21, col="blue") segments(12.5, 0.2, 12.25, 0.19, col="blue") text(7,0.1, paste('critical value = [', round(qt(c(.975), df=dft),2), ",", round(-qt(c(.975), df=dft),2), "]"), pos=3, offset=2, col="red") colorArea(from=qt(c(.975), df=dft), to=3.75, dnorm, mean=0, sd=1, col=2, dens=65) colorArea(from= qt(c(.025), df=dft), to= -3.75, dnorm, mean=0, sd=1, col=2, dens=65) } #### When t1 is selected #### else if (inferenceV2$betas =="b1"){ curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b1_t_statistics.2,col="chartreuse4") abline(v=(qt(c(.975), df=dft)),col="red") abline(v=-(qt(c(.975), df=dft)),col="red") text((b1_t_statistics.2),0.2, paste('t value =', round(b1_t_statistics.2,2)), pos=4, col="chartreuse4") text(7,0.1, paste('critical value = [', round(qt(c(.975), df=dft),2), ",", round(-qt(c(.975), df=dft),2), "]"), pos=3, offset=2, col="red") colorArea(from=qt(c(.975), df=dft), to=3.75, dnorm, mean=0, sd=1, col=2, dens=65) colorArea(from= qt(c(.025), df=dft), to= -3.75, dnorm, mean=0, sd=1, col=2, dens=65) } #### When t2 is selected #### else if (inferenceV2$betas =="b2"){ curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b2_t_statistics.2,col="chartreuse4") abline(v=(qt(c(.975), df=dft)),col="red") abline(v=-(qt(c(.975), df=dft)),col="red") text((b2_t_statistics.2),0.2, paste('t value =', round(b2_t_statistics.2,2)), pos=2, col="chartreuse4") text(7,0.1, paste('critical value = [', round(qt(c(.975), df=dft),2), ",", round(-qt(c(.975), df=dft),2), "]"), pos=3, offset=2, col="red") colorArea(from=qt(c(.975), df=dft), to=3.75, dnorm, mean=0, sd=1, col=2, dens=65) colorArea(from= qt(c(.025), df=dft), to= -3.75, dnorm, mean=0, sd=1, col=2, dens=65) } }) #### REPORT YOUR RESULTS 2 PLOT CAPTIONS #### output$plotsLabel2 <- renderUI({ if (inferenceV2$betas =="b0"){ HTML(paste(" The effect here is ", round(xmodel.2[1,2],2), " and represents the expected marathon finishing time for someone who hasn\'t rested nor had a massage. The error for the intercept is ", round(xmodel.2[1,3],2), ". In this sense the test statistics is ", round(xmodel.2[1,2],2), " / ", round(xmodel.2[1,3],2), " = ", round(xmodel.2[1,4],2), ". The likelihood to have a test statistics of ", round(xmodel.2[1,4],2), " or a more extreme test statistic, if H0 (i.e., the value of the intercept isn't different from zero) is true, is ", pvalueb1.2, ". Note: Because the test statistic is so great, it cannot be represented in the t-distribution plot below. ", sep="")) } else if (inferenceV2$betas =="b1"){ HTML(paste(" The effect here is ", round(xmodel.2[2,2],2), " and represents the impact of having rested the day before on finishing time, controlling for the effect of having a massage. In other words, it is the expected difference in finishing time between someone who has neither rested nor had a massage and someone who has rested, but did not have a massage. The error for this difference is ", round(xmodel.2[2,3],2), ". In this sense the test statistics is ", round(xmodel.2[2,2],2), " / ", round(xmodel.2[2,3],2), " = ", round(xmodel.2[2,4],2), ". The likelihood to have a test statistics of ", round(xmodel.2[2,4],2), " or a more extreme test statistic, if H0 (i.e., the value of the difference between these groups isn't different from zero) is true, is ", pvalueb2.2, ". ", sep="")) } else if (inferenceV2$betas =="b2"){ HTML(paste(" The effect here is ", round(xmodel.2[3,2],2), " and represents the impact of having had a massage the day before on finishing time, controlling for the effect of having rested the day before. In other words, it is the expected difference in finishing time between someone who has neither rested nor had a massage and someone who has had a massage, but did not rest. The error for this difference is ", round(xmodel.2[3,3],2), ". In this sense the test statistics is ", round(xmodel.2[3,2],2), " / ", round(xmodel.2[3,3],2), " = ", round(xmodel.2[3,4],2), ". The likelihood to have a test statistics of ", round(xmodel.2[3,4],2), " or a more extreme test statistic, if H0 (i.e., the value of the difference between these groups isn't different from zero) is true, is ", pvalueb3.2, ". ", sep="")) } }) #### REPORT YOUR RESULTS 2 TEXT BOX #### output$reporting1_2 <- renderUI ({ HTML(paste("The unstandardized coefficient way: The results show that the independent variables included in the model explain ", 100 * round(R.2, 4), "% of the variation in ", tolower(ylabel), ". ", x1label, " had a significant effect on ", tolower(ylabel), ", b = ", round(xmodel.2[2,2],2), ", t(", f.2[3], ") = ", round(xmodel.2[2,4],2), ", p ", pvalueb2.2, ". ", x2label, " also had a significant effect on ", tolower(ylabel), ", b = ", round(xmodel.2[3,2],2), ", t(", f.2[3], ") = ", round(xmodel.2[3,4],2), ", p ", pvalueb3.2, ". ", sep="")) }) betaST2.1 <- round(summary(lm(scale(y) ~ scale(x1) + scale(x2)))$coefficients[[2]], 2) betaST2.2 <- round(summary(lm(scale(y) ~ scale(x1) + scale(x2)))$coefficients[[3]], 2) output$reporting2_2 <- renderUI ({ HTML(paste("The standardized coefficient way: The results show that the independent variables included in the model explain ", 100 * round(R.2, 4), "% of the variation in ", tolower(ylabel), ". ", x1label, " had a significant effect on ", tolower(ylabel), ", β = ", betaST2.1, ", t(", f.2[3], ") = ", round(xmodel.2[2,4],2), ", p < .001. ", x2label, " also had a significant effect on ", tolower(ylabel), ", β = ", betaST2.2, ", t(", f.2[3], ") = ", round(xmodel.2[3,4],2), ", p < .001. ", sep="")) }) output$regressionStats2 <- renderPrint({ summary(lm(y ~ x1 + x2)) }) ############################### #### REPORT YOUR RESULTS 3 #### ############################### #### REPORT YOUR RESULTS 3 TABLE #### # Mannually create the content of the table so we can format every element combined <- round(xmodel.3[4,2]+xmodel.3[3,2]+xmodel.3[2,2],2) labelsc3 <- c(Term, Estimate, S_Error, T_Value, P_Value) if (xmodel.3[1,5] > 0.0005) {pvalueb1.3 <- sprintf("%.2f",round(xmodel.3[1,5], 3))} else {pvalueb1.3 <- "< 0.001"} if (xmodel.3[2,5] > 0.001) {pvalueb2.3 <- sprintf("%.2f",round(xmodel.3[2,5], 3))} else {pvalueb2.3 <- "< 0.001"} if (xmodel.3[3,5] > 0.001) {pvalueb3.3 <- sprintf("%.2f",round(xmodel.3[3,5], 3))} else {pvalueb3.3 <- "< 0.001"} if (xmodel.3[4,5] > 0.001) {pvalueb4.3 <- sprintf("%.2f",round(xmodel.3[4,5], 3))} else {pvalueb4.3 <- "< 0.001"} row1.3 <- c(shinyInput(actionButton, 1, 'button_1', label = "Intercept", onclick = 'Shiny.onInputChange(\"select_button3\", this.id)'), sprintf("%.2f",round(xmodel.3[1,2],2)), sprintf("%.2f",round(xmodel.3[1,3],2)), sprintf("%.2f",round(xmodel.3[1,4],2)), pvalueb1.3 ) row2.3 <- c(shinyInput(actionButton, 1, 'button_2', label = x1label, onclick = 'Shiny.onInputChange(\"select_button3\", this.id)' ), sprintf("%.2f",round(xmodel.3[2,2],2)), sprintf("%.2f",round(xmodel.3[2,3],2)), sprintf("%.2f",round(xmodel.3[2,4],2)), pvalueb2.3 ) row3.3 <- c(shinyInput(actionButton, 1, 'button_3', label = x2label, onclick = 'Shiny.onInputChange(\"select_button3\", this.id)' ), sprintf("%.2f",round(xmodel.3[3,2],2)), sprintf("%.2f",round(xmodel.3[3,3],2)), sprintf("%.2f",round(xmodel.3[3,4],2)), pvalueb3.3 ) row4.3 <- c(shinyInput(actionButton, 1, 'button_4', label = paste(xbothlabel, "interaction"), onclick = 'Shiny.onInputChange(\"select_button3\", this.id)' ), sprintf("%.2f",round(xmodel.3[4,2],2)), sprintf("%.2f",round(xmodel.3[4,3],2)), sprintf("%.2f",round(xmodel.3[4,4],2)), pvalueb4.3 ) tablecof3 <- rbind(row1.3, row2.3, row3.3, row4.3) colnames(tablecof3) <- labelsc3 rownames(tablecof3) <- NULL # Click to show table button observeEvent(input$showTstat3, {buttonRV$RV3 <- TRUE}) # Introductory text for the Coeff table output$introCoeff3 <- renderUI({ if (is.null(buttonRV$RV3)) return() HTML('Below is the regression summary table: ') }) # Conditional to show table on button press output$tableTS3 <- DT::renderDataTable({ if (is.null(buttonRV$RV3)) return() tablecof3} , class = 'compact' , options = list(searching=F, pageLength=10, lengthChange=F, ordering=F, paging=F, info=F) , server=F, escape=F, selection='none') # Initiate the variables that the buttons will change inferenceV3 <- reactiveValues(betas = '') # Observe the button clicking within the table observeEvent(input$select_button3, { selectedRow3 <- as.numeric(strsplit(input$select_button3, "_")[[1]][2]) if (selectedRow3 == 11) {inferenceV3$betas <- "b0"} else if (selectedRow3 == 21) {inferenceV3$betas <- "b1"} else if (selectedRow3 == 31) {inferenceV3$betas <- "b2"} else if (selectedRow3 == 41) {inferenceV3$betas <- "b3"} }) # Prepare output f.3 <- summary(xmod.3)$fstatistic Radj.3 <- summary(xmod.3)$adj.r.squared R.3 <- summary(xmod.3)$r.squared pF.3 <- round(pf(q=f.3[1], df1=f.3[2], df2=f.3[3], lower.tail=FALSE)) if (pF.3 < 0.0001) {pF.3 <- "< 0.0001"} # Let's compile the R^2 text output$R2.3 <- renderUI({ if (is.null(buttonRV$RV3)) return() HTML( paste("Multiple R-squared: ", round(R.3, 4), ", Adjusted R-squared: ", round(Radj.3, 4), sep=""), " ", paste("F-statistic: ", round(f.3[1],4), " on ", round(f.3[2],4), " and ", round(f.3[3],4), " DF, p-value: ", pF.3, sep ="") ) }) #### REPORT YOUR RESULTS 3 SCATTER PLOT #### output$betaplot3 <- renderPlot({ #### When b0 is selected #### if (inferenceV3$betas =="b0"){ indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=c(col1, col2, col3, col4)[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, mean(df[1:7,]$y), 7, mean(df[1:7,]$y), col=col1) segments(8, mean(df[8:14,]$y), 14, mean(df[8:14,]$y), col=col2) segments(15, mean(df[15:21,]$y), 21, mean(df[15:21,]$y), col=col3) segments(22, mean(df[22:28,]$y), 28, mean(df[22:28,]$y), col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=c(col1, col2, col3, col4), pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, mean(df[1:7,]$y), col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, mean(df[8:14,]$y), col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, mean(df[15:21,]$y), col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, mean(df[22:28,]$y), col = col4) #intercept segments(0, mean(df[0:7,]$y), 1, mean(df[0:7,]$y), col="blue") segments(7, mean(df[0:7,]$y), 9, mean(df[0:7,]$y), col="blue") segments(8, (mean(df[0:7,]$y)+10), 9, mean(df[0:7,]$y), col="blue") segments(8, (mean(df[0:7,]$y)-10), 9, mean(df[0:7,]$y), col="blue") text(9, mean(df[0:7,]$y), paste('Intercept (b0) =', round(xmodel[1,2],2)), pos = 4, col="blue") } #### When b1 is selected #### else if (inferenceV3$betas == "b1"){ indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=c(col1, col2, col3, col4)[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, mean(df[1:7,]$y), 7, mean(df[1:7,]$y), col=col1) segments(8, mean(df[8:14,]$y), 14, mean(df[8:14,]$y), col=col2) segments(15, mean(df[15:21,]$y), 21, mean(df[15:21,]$y), col=col3) segments(22, mean(df[22:28,]$y), 28, mean(df[22:28,]$y), col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=c(col1, col2, col3, col4), pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, mean(df[1:7,]$y), col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, mean(df[8:14,]$y), col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, mean(df[15:21,]$y), col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, mean(df[22:28,]$y), col = col4) #b1 segments(7, mean(df[0:7,]$y), 8, mean(df[0:7,]$y), col="blue") segments(7, mean(df[15:21,]$y), 15, mean(df[15:21,]$y), col="blue") segments(7.5, mean(df[0:7,]$y), 7.5, mean(df[15:21,]$y), col="blue") text(7.5, (xmodel.2[1,2]+(xmodel.2[2,2]/2)), paste('b1 =', round(xmodel.2[2,2],2)), pos = 4, col="blue") } #### When b2 is selected #### else if (inferenceV3$betas == "b2"){ indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=c(col1, col2, col3, col4)[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, mean(df[1:7,]$y), 7, mean(df[1:7,]$y), col=col1) segments(8, mean(df[8:14,]$y), 14, mean(df[8:14,]$y), col=col2) segments(15, mean(df[15:21,]$y), 21, mean(df[15:21,]$y), col=col3) segments(22, mean(df[22:28,]$y), 28, mean(df[22:28,]$y), col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=c(col1, col2, col3, col4), pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, mean(df[1:7,]$y), col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, mean(df[8:14,]$y), col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, mean(df[15:21,]$y), col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, mean(df[22:28,]$y), col = col4) #b2 segments(7, mean(df[0:7,]$y), 8, mean(df[0:7,]$y), col="blue") segments(7, mean(df[8:14,]$y), 8, mean(df[8:14,]$y), col="blue") segments(7.5, mean(df[0:7,]$y), 7.5, mean(df[8:14,]$y), col="blue") text(7.5, (xmodel.3[1,2]+(xmodel.3[3,2]/2)), paste('b2 =', round(xmodel.3[3,2],2)), pos = 4, col="blue") } #### When b3 is selected #### else if (inferenceV3$betas == "b3"){ indexP <- c(rep(1, 7), rep(2, 7), rep(3, 7), rep(4,7)) plot(data.frame(df$id , df$y), col=c(col1, col2, col3, col4)[indexP], xlab = idlabel, ylab = ylabel, pch= 16, ylim=c(botLimit, topLimit), xlim=c(1, 28)) segments(1, mean(df[1:7,]$y), 7, mean(df[1:7,]$y), col=col1) segments(8, mean(df[8:14,]$y), 14, mean(df[8:14,]$y), col=col2) segments(15, mean(df[15:21,]$y), 21, mean(df[15:21,]$y), col=col3) segments(22, mean(df[22:28,]$y), 28, mean(df[22:28,]$y), col=col4) legend(x="bottomright", legend = c("didn't rest, didn't have a massage", "didn't rest, had a massage", "rested, didn't have a massage", "rested, had a massage"), col=c(col1, col2, col3, col4), pch=16) segments(df[1:7,]$id, df[1:7,]$y, df[1:7,]$id, mean(df[1:7,]$y), col = col1) segments(df[8:14,]$id, df[8:14,]$y, df[8:14,]$id, mean(df[8:14,]$y), col = col2) segments(df[15:21,]$id, df[15:21,]$y, df[15:21,]$id, mean(df[15:21,]$y), col = col3) segments(df[22:28,]$id, df[22:28,]$y, df[22:28,]$id, mean(df[22:28,]$y), col = col4) #b3 segments(7, mean(df[0:7,]$y), 22, mean(df[0:7,]$y), col="blue") segments(21, mean(df[22:28,]$y), 22, mean(df[22:28,]$y), col="blue") segments(21.5, mean(df[0:7,]$y), 21.5, mean(df[22:28,]$y), col="blue") text(21.5, (xmodel.3[1,2]+(combined/2)), paste('b1 + b2 + b3 =', combined), pos = 2, col="blue") } }) #### REPORT YOUR RESULTS 3 T PLOT #### output$tplot3 <- renderPlot({ #### When t0 is selected #### if (inferenceV3$betas =="b0"){ ##Prints t distribution plot curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b0_t_statistics.2,col="blue") abline(v=(qt(c(.975), df=dft)),col="red") abline(v=-(qt(c(.975), df=dft)),col="red") text(11,0.2, paste('t value =', round(b0_t_statistics.2,2)), pos= 2, col="blue") segments(11, 0.2, 12.5, 0.2, col="blue") segments(12.5, 0.2, 12.25, 0.21, col="blue") segments(12.5, 0.2, 12.25, 0.19, col="blue") text(7,0.1, paste('critical value = [', round(qt(c(.975), df=dft),2), ",", round(-qt(c(.975), df=dft),2), "]"), pos=3, offset=2, col="red") colorArea(from=qt(c(.975), df=dft), to=3.75, dnorm, mean=0, sd=1, col=2, dens=65) colorArea(from= qt(c(.025), df=dft), to= -3.75, dnorm, mean=0, sd=1, col=2, dens=65) } #### When t1 is selected #### else if (inferenceV3$betas =="b1"){ curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b1_t_statistics.2,col="chartreuse4") abline(v=(qt(c(.975), df=dft)),col="red") abline(v=-(qt(c(.975), df=dft)),col="red") text((b1_t_statistics.2),0.2, paste('t value =', round(b1_t_statistics.2,2)), pos=4, col="chartreuse4") text(7,0.1, paste('critical value = [', round(qt(c(.975), df=dft),2), ",", round(-qt(c(.975), df=dft),2), "]"), pos=3, offset=2, col="red") colorArea(from=qt(c(.975), df=dft), to=3.75, dnorm, mean=0, sd=1, col=2, dens=65) colorArea(from= qt(c(.025), df=dft), to= -3.75, dnorm, mean=0, sd=1, col=2, dens=65) } #### When t2 is selected #### else if (inferenceV3$betas =="b2"){ curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b2_t_statistics.2,col="chartreuse4") abline(v=(qt(c(.975), df=dft)),col="red") abline(v=-(qt(c(.975), df=dft)),col="red") text((b2_t_statistics.2),0.2, paste('t value =', round(b2_t_statistics.2,2)), pos=2, col="chartreuse4") text(7,0.1, paste('critical value = [', round(qt(c(.975), df=dft),2), ",", round(-qt(c(.975), df=dft),2), "]"), pos=3, offset=2, col="red") colorArea(from=qt(c(.975), df=dft), to=3.75, dnorm, mean=0, sd=1, col=2, dens=65) colorArea(from= qt(c(.025), df=dft), to= -3.75, dnorm, mean=0, sd=1, col=2, dens=65) } #### When t3 is selected #### else if (inferenceV3$betas =="b3"){ curve(dt(x, df=dft), from=-12, to=12, main="t value", xlab = "T-student's distribution", ylab = 'Probability Density') abline(v=b3_t_statistics.3,col="chartreuse4") abline(v=(qt(c(.975), df=dft)),col="red") abline(v=-(qt(c(.975), df=dft)),col="red") text((b3_t_statistics.3),0.2, paste('t value =', round(b3_t_statistics.3,2)), pos=4, col="chartreuse4") text(7,0.1, paste('critical value = [', round(qt(c(.975), df=dft),2), ",", round(-qt(c(.975), df=dft),2), "]"), pos=3, offset=2, col="red") colorArea(from=qt(c(.975), df=dft), to=3.75, dnorm, mean=0, sd=1, col=2, dens=65) colorArea(from= qt(c(.025), df=dft), to= -3.75, dnorm, mean=0, sd=1, col=2, dens=65) } }) #### REPORT YOUR RESULTS 3 PLOT CAPTIONS #### output$plotsLabel3 <- renderUI({ if (inferenceV3$betas =="b0"){ HTML(paste(" The effect here is ", round(xmodel.2[1,2],2), " and represents the expected marathon finishing time for someone who hasn\'t rested nor had a massage. The error for the intercept is ", round(xmodel.2[1,3],2), ". In this sense the test statistics is ", round(xmodel.2[1,2],2), " / ", round(xmodel.2[1,3],2), " = ", round(xmodel.2[1,4],2), ". The likelihood to have a test statistics of ", round(xmodel.2[1,4],2), " or a more extreme test statistic, if H0 (i.e., the value of the intercept isn't different from zero) is true, is ", pvalueb1.2, ". Note: Because the test statistic is so great, it cannot be represented in the t-distribution plot below. ", sep="")) } else if (inferenceV3$betas =="b1"){ HTML(paste(" The effect here is ", round(xmodel.2[2,2],2), " and represents the impact of having rested the day before on finishing time, controlling for the effect of having a massage. In other words, it is the expected difference in finishing time between someone who has neither rested nor had a massage and someone who has rested, but did not have a massage. The error for this difference is ", round(xmodel.2[2,3],2), ". In this sense the test statistics is ", round(xmodel.2[2,2],2), " / ", round(xmodel.2[2,3],2), " = ", round(xmodel.2[2,4],2), ". The likelihood to have a test statistics of ", round(xmodel.2[2,4],2), " or a more extreme test statistic, if H0 (i.e., the value of the difference between these groups isn't different from zero) is true, is ", pvalueb2.2, ". ", sep="")) } else if (inferenceV3$betas =="b2"){ HTML(paste(" The effect here is ", round(xmodel.2[3,2],2), " and represents the impact of having had a massage the day before on finishing time, controlling for the effect of having rested the day before. In other words, it is the expected difference in finishing time between someone who has neither rested nor had a massage and someone who has had a massage, but did not rest. The error for this difference is ", round(xmodel.2[3,3],2), ". In this sense the test statistics is ", round(xmodel.2[3,2],2), " / ", round(xmodel.2[3,3],2), " = ", round(xmodel.2[3,4],2), ". The likelihood to have a test statistics of ", round(xmodel.2[3,4],2), " or a more extreme test statistic, if H0 (i.e., the value of the difference between these groups isn't different from zero) is true, is ", pvalueb3.2, ". ", sep="")) } else if (inferenceV3$betas =="b3"){ HTML(paste(" The effect here is ", round(xmodel.3[4,2],2), " and respresents the impact of the combination of having a massage and resting over and above the individual impacts of both having rested and having a massage. Thus, if we combine all the applicable effects (of having rested, of having a massage, and of the interaction of these two), we can obtain the expected difference in finishing time between someone who has neither rested nor had a massage and someone who has both had rested and had a massage. The error for the isolated effect of the interaction is ", round(xmodel.3[4,3],2), ". In this sense the test statistics is ", round(xmodel.3[4,2],2), " / ", round(xmodel.3[4,3],2), " = ", round(xmodel.3[4,4],2), ". The likelihood to have a test statistics of ", round(xmodel.3[4,4],2), " or a more extreme test statistic, if H0 (i.e., the value of the difference between these groups isn't different from zero) is true, is ", pvalueb4.3, ". ", sep="")) } }) #### REPORT YOUR RESULTS 3 TEXT BOX #### output$reporting1_3 <- renderUI ({ HTML(paste("The unstandardized coefficient way: The results show that the independent variables included in the model explain ", 100 * round(R.3, 4), "% of the variation in ", tolower(ylabel), ". ", x1label, " had a significant effect on ", tolower(ylabel), ", b = ", round(xmodel.3[2,2],2), ", t(", f.3[3], ") = ", round(xmodel.3[2,4],2), ", p ", pvalueb2.3, ". ", x2label, " also had a significant effect on ", tolower(ylabel), ", b = ", round(xmodel.3[3,2],2), ", t(", f.3[3], ") = ", round(xmodel.3[3,4],2), ", p ", pvalueb3.3, ". Finally, there was also a significant interaction between rest and massage, b = ", round(xmodel.3[4,2],2), ", t(", f.3[3],") = ", round(xmodel.3[4,4],2), ", p = ", pvalueb4.3, "." , sep="")) }) betaST3.1 <- round(summary(lm(scale(y) ~ scale(x1) + scale(x2) + scale(x1*x2)))$coefficients[[2]], 2) betaST3.2 <- round(summary(lm(scale(y) ~ scale(x1) + scale(x2) + scale(x1*x2)))$coefficients[[3]], 2) betaST3.3 <- round(summary(lm(scale(y) ~ scale(x1) + scale(x2) + scale(x1*x2)))$coefficients[[4]], 2) output$reporting2_3 <- renderUI ({ HTML(paste("The standardized coefficient way: The results show that the independent variables included in the model explain ", 100 * round(R.3, 4), "% of the variation in ", tolower(ylabel), ". ", x1label, " had a significant effect on ", tolower(ylabel), ", β = ", betaST3.1, ", t(", f.3[3], ") = ", round(xmodel.3[2,4],2), ", p ", pvalueb2.3, ". ", x2label, " also had a significant effect on ", tolower(ylabel), ", β = ", betaST3.2, ", t(", f.3[3], ") = ", round(xmodel.3[3,4],2), ", p ", pvalueb3.3, ". Finally, there was also a significant interaction between rest and massage, β = ", betaST3.3, ", t(", f.3[3],") = ", round(xmodel.3[4,4],2), ", p = ", pvalueb4.3, "." , sep="")) }) output$regressionStats3 <- renderPrint({ summary(lm(y ~ x1 * x2)) }) ############################################ ############# Glossary Table ############### ############################################ term<- c('Null model','Model Error', 'Intercept','Slope','Sum of Squared Errors (SSE)','Standard Error','t statistic','p-value', 'Critical Value','F-statistic') definition <- c("Null model is a model in which ","The error of the model (also knows as the residual) is the difference between the value that is predicted by a model and the value observed in the data. The smaller the error the better the model fits the data ", "In a regression, an intercept is the expected mean value of outcome variable (Y) when all of the predictor varioables are at 0 (X=0).", "The slope of a regression line represents the rate of change in the outcome variable (y) as the predictor variable (x) changes.The greater the magnitude of the slope, the steeper the line and the greater the rate of change.", "Is an estimate of the total spread of a set of observations around a parameter i.e. mean. It is calculated by getting the deviance for each score which is then squarted. The sum of the squared deviance is the sum of squares.", "It provides us a measure of the statistical accuracy of an estimate. Larger values indicate that a given statistic i.e. the mean might not accurately reflect the population from which the sample came from.", "A test statistic with a t-distribution (see t-statistic plot). In linear regression it is used to test whether the model explains the variance. It is calcualted by dividing the coefficient by its standard error. ", "The p-value in a regression each term tests the coefficients are equal to zero (no effect). A low p-value (< 0.05) indicates that you can reject the null hypothesis suggesting that the inclusion of the variable is a meaningful addition to your model.","Critical values are cut-off values that define regions where the test statistic is unlikely to lie if the null hypothesis is true. The critical value depends on the alpha level and the test statistic used with the alpha of 0.05 frequently applied. The null hypothesis is rejected if the test statistic lies within this region", "A test statistic with a known F-distribution (see F distribution plot). It is a ratio of the average variability in the data that a model explains to the average variability of unexplained variance in the same model. It is used to test the overall fit of the model in regression analysis") glossary_data <- data.frame(term, definition) #####table for glossary tab#### output$tablegloss <- shiny::renderDataTable({ ff <- data.frame(term, definition) colnames(ff) <- c("Term", "Definiton") ff }, options = list(searching=F, pageLength=10, lengthChange=F, ordering=T, paging=F, info=F, stateSave=F, order = list(c(0, 'asc'))) , escape=F) ############################################ ########## Test your knowledge ############# ############################################ #### The statistics #### # Create a mode function Mode <- function(x) { ux <- unique(x) ux[which.max(tabulate(match(x, ux)))] } output$tableSum <- renderTable({ dfdata <- data.frame(c("Didn't rest, didn't have massage", "Rested, didn't have massage", "Didn't rest, had massage", "Rested, had massage"), aggregate(y ~ x1+x2, df, mean)[3]) colnames(dfdata) <- c("group", "mean") dfdata }) rvB <- reactiveValues(a1 = 0, a2 = 0, a3 = 0, a4 = 0, a5 = 0) observeEvent(input$answersData, {{rvB$a1 <- 1 }}) observeEvent(input$answersNull, {{rvB$a2 <- 1 }}) observeEvent(input$answersProposed, {{rvB$a3 <- 1 }}) observeEvent(input$answersComparison, {{rvB$a4 <- 1 }}) observeEvent(input$answersInference, {{rvB$a5 <- 1 }}) output$answersText1 <- renderUI({ if(rvB$a1 == 1) { HTML(" Answers: <ol> <li>30 people</li> <li>226 minutes</li> <li>yes</li> <li>no</li> <li>Rest and have a massage</li> <li>Participant 7</li> <li>Participant 11</li> </ol>") } }) output$answersText2 <- renderUI({ if(rvB$a2 == 1) { HTML(" Answers: <ol> <li>54632.3</li> <li>The mean.</li> <li>Individual finishing time.</li> </ol>") } }) output$answersText3 <- renderUI({ if(rvB$a3 == 1) { HTML(" Answers: <ol> <li>44944.16</li> <li>54349.53</li> <li>BMI</li> <li>Hours of training</li> </ol>") } }) output$answersText4 <- renderUI({ if(rvB$a4 == 1) { HTML(" Answers: <ol> <li>30406.18</li> <li>Participant 20 (off by 130.3)</li> <li>Participant 23 (off by 6.7)</li> <li>Participant 29 (off by 56.77)</li> <li>Participant 21 (off by 3.62)</li> </ol>") } }) output$answersText5 <- renderUI({ if(rvB$a5 == 1) { HTML(" Answers: <ol> <li>532.33</li> <li>520.31</li> <li>A t of -5.93</li> <li>29</li> <li>30</li> </ol>") } }) }

#####PACKAGES########## library(shiny) library(ggplot2) library(calibrate) library(MASS) library(shinyBS) library(broom) library(DT) library(shinydashboard) ############################ ###### USER INTERFACE ###### ############################ ui = dashboardPage( dashboardHeader(title = "Model Comparison", titleWidth = 180), dashboardSidebar( width = 180, #################### ### SIDEBAR MENU ### #################### sidebarMenu(id="tabs", menuItem("Introduction", tabName = "intro", icon = icon("question") ), menuItem("Null Model", tabName = "nullmodel", icon = icon("th") ), menuItem("One predictor", tabName = "propmodel1", icon = icon("angle-right") ), menuItem("Two predictors", tabName = "propmodel2", icon = icon("angle-double-right ") ), menuItem("Interactions", tabName = "propmodel3", icon = icon("exchange") ), menuItem("Glossary", tabName = "glossary", icon = icon("list")) )), dashboardBody( tags$style(HTML(" .box-body, .box-header, .nav-tabs-custom, .tab-pane, .nav-tabs-custom > .tab-content, .nav-tabs-custom > .nav-tabs > li.active:hover > a, .nav-tabs-custom > .nav-tabs > li.active > a { background:#EBF4FA; } hr { color: #d2d6de; background: #d2d6de; height: 3px; } ")), includeScript("../../../Matomo-tquant.js"), fluidRow( tags$head( tags$link(rel = "stylesheet", type = "text/css", href = "formatboot.css")), ###################### ## 1st PAGE - INTRO ## ###################### tabItems( tabItem("intro", ### Intro box ### box(width=12, title = h3("WHAT IS THIS ALL ABOUT?",align="center"), collapsible = T, p("In this app, you'll learn about how even somewhat complex models, such as those including an interaction term, still boil down to the basic equation:"), br(), HTML(' Data = model + error '), br(), HTML("Last year, you participated in a marathon with some of your friends. You trained many hours and obtained a good result, but you would like to improve further. This time you're interested in exploring potential relationships between independent variables:"), HTML("<ul> <li>Whether training on the previous day is good or bad for performance;</li> <li>Whether receiving a message on the previous day is good or bad for performance;</li> <li>Whether there is any combination of these two factors that is particularly improves performance.</li> </ul>"), br(), p(img(src = "marathonRunner.jpg", width="50%"), align="middle") ), ### Table with data box ### box(width=12, title = h3(HTML("THE DATASET"), align="middle"), collapsible = T, collapsed = T, HTML('So, let\'s first have a look at the information about last year\'s marathon on our variables of interest.'), br(),br(), shiny::dataTableOutput("tableData")), ### Quiz box ### box(width=12, title = h3("TEST YOUR KNOWLEDGE", align="center"), collapsible = T, collapsed = T, fluidRow( column(8, tags$head( tags$style(type="text/css", "#inline label{ display: table-cell; text-align: center; vertical-align: middle; font-size:110%; padding: 8px} #inline .form-group { display: table-row;}") ), HTML('Here is a challenge for you! Based on the data table above and the summary table on your right, try to answer these questions:'), br(),br(), tags$div(id = "inline", textInput("answer1f", HTML("1.How many people took part in the marathon last year?"))), tags$div(id = "inline", textInput("answer1f", HTML("2.How much time did it take for participant number fifteen to complete the run?"))), tags$div(id = "inline", textInput("answer1f", HTML("3.Did participant 15 rest on the day before the marathon?"))), tags$div(id = "inline", textInput("answer1f", HTML("4.Did participant 16 have a massage on the day before the marathon?"))), tags$div(id = "inline", textInput("answer1f", HTML("5.What should one do on the day before the marathon?"))), tags$div(id = "inline", textInput("answer1f", HTML("6.Who was first to finish the race?"))), tags$div(id = "inline", textInput("answer1f", HTML("7.Who finished last?"))), br()), column(2, br(), htmlOutput("answersText1")), column(2, br(), br(), tableOutput("tableSum")) ), actionButton("answersData", "See answers"), br() ), ### Summary box ### box(width=12, title = h3("SUMMARY", align="center"), collapsible = T, collapsed = T, HTML("What have you learned so far?"), HTML(" The basic recipe for any model is: Data = model + error. By now you should be familiar with the first part of the recipe (Data): <ul> <li>You gained some insights into your research question and how to go to answer it.</li> <li>You have seen the available data, and briefly explored it. But this is just the first step!</li> </ul>"), HTML("What are you going to learn next? <ul> <li>You are going to get to know the second part of the recipe: what is a MODEL? What is ERROR?</li> <li>What is the null model?</li> <li>What are the parameters of the null model?</li> </ul>"), HTML("Research question: You aim at understanding what you should do to improve your marathon performance and you use the available data from last year to help you figure out how to improve.") ) ,p( HTML('<a href="#top">'), actionButton(inputId="Next1", label=HTML('Next page ')), HTML("</a>") , align="right") ), ########################## #### 2nd - Null MODEL #### ########################## tabItem("nullmodel", ### Intro box ### box(width=12, title = h3('YOUR FIRST GUESS',align="center"), collapsible=T, HTML("You are curious about how your friends Tom and Jerry performed last year, but you can't find their name in the dataset in which only participants'numbers are displayed. You don't have many information about them apart from the fact that they were there. What is the best guess you can make about their time? You can now use your recipe to answer this question. The 'i' subscript indicates the individual partcipant. "), HTML(' Datai = model + errori '), HTML("The best guess would probably be the most frequent time people spent to complete the marathon. In statistical language this is called 'the mode', and in special cases (e.g. when data are normally distributed), the mode equals the mean value. Let's assume that this is the case, you calculate the mean time and you find that it was about 4 hours and 15 minutes, so you go for that. "), HTML(' TimeJerry = mean time + errorJerry '), HTML(' TimeTom = mean time + errorTom '), br(), HTML(" When you meet your friends, you decide to actually ask them for the real time. Jerry seems to be quite offended by your guess, he had performed way better! He made it in a little more than 2 hours and a half!! However, Tom was happily surprised by your guess, it took 4 hours and 9 minutes for him! Maybe you can improve your guess (model) by considering more information and adding it into your model. ")), ### Null Model box ### box(width=12, title = h3("WHAT IS AN ERROR?", align="center"), collapsible = T, collapsed = T, HTML("Your prediction was better for Tom than for Jerry, so we can say that you made a greater error at predicting Jerry's time than Tom's. You can look at the next section to get a better grasp of what an error is. Look at the graph below. On the y-axis is represented last year participant's time. On the x-axis you find the participant number (id). Your model (the mean) is represented as a straight line, and it is the time that you would predict for each participant to finish the marathon. For a lot of participants, you made an error using this model. The red line represents the error you commited in your prediction for Jerry's time, whereas the green line is the error associated with the time you predicted for Tom. As you can see the model is not that good. Maybe you can improve your guess (model) by considering more information to add into your model. "), br(),br(), fluidRow( column(6, shiny::dataTableOutput("tableNull")), column(6, radioButtons(inputId = "plot_type_null", label = "How do you wish to visualize it?", c("Scatterplot"="scatter","Histogram(SSE)"="hist")), plotOutput('plotNull'), textOutput("Cap") ))) , #####Quiz box###### box(width=12, title=h3("TEST YOUR KNOWLEDGE", align="center"), collapsible=T, collapsed=T, column(9, tags$div(id = "inline", textInput("answer1f", HTML("1.What is the sum of standard error when the null model is fitted?"))), tags$div(id = "inline", textInput("answer1f", HTML("2.What estimate is used to fit the null model?"))), tags$div(id = "inline", textInput("answer1f", HTML("3.What is your model trying to predict?"))), actionButton("answersNull", "See answers") ), column(3, htmlOutput("answersText2")) ), ### Summary box ### box(width=12, title=h3("SUMMARY", align="center"), collapsible=T, collapsed=T, HTML("What have you learned so far? By now you should be familiar with the null model and the error in the model. The basic recipe for any model is: Datai = model + errori."), HTML("What are you going to learn next? <ul> <li>What is the proposed model?</li> <li>What are the parameters of the proposed model?</li> <li>How good is the model?</li> </ul> "), HTML("Research question: Don't forget our research question! You want to understand what is important in order to succeed in a marathon. ")), p( HTML('<a href="#top">'), actionButton(inputId="Next2", label=HTML('Next page ')), HTML("</a>") , align="right") ), ############################ ## 3rd - PROPOSED MODEL 1 ## ############################ tabItem("propmodel1", tabBox( tabPanel("One predictor", ### Tab1, Intro box ### box(width=14, title=h3("IMPROVING YOUR GUESSES", align="center"), collapsible=T, HTML(" As you saw in the previous section, your model was not very good at predicting your friends'performance. How can you improve it? In the dataset of last year's marathon, you had information about whether each participant rested or had a massage on the day before the marathon. You decide to start by looking if resting on the day before the marathon might lead to better performance in the marathon. You can now build a new model with this information and see if your predictions improve: "), h4("Marathon Finish Time = Intercept + Effect of rest + Error", align="middle"), HTML('Yi = b0 + b1*x1 + Error'), HTML('In which Yi = the outcome variable, b0 = the intercept or b0 coefficient, b1 = the slope or b1 coefficient, and X1 = the predictor variable.'), br(), HTML(' Intercept (b0): You assume that the relationship between the two variables (time and rest) can be described with a line. The intercept represents the performance (i.e., value on the y-axis) for someone who has not rested on the day before the marathon (i.e., has a value of 0 on the x-axis).'), br(), HTML(' Effect of resting on the day before the marathon (b1): You would like to see if resting on the day before the marathon is important for succeeding in the marathon. The answer to this question lies in the b1 coefficient. As rest is a dichotomous variable (i.e., a nominal/categorical variable with only two categories), in which 0 = no rest, 1 = rest, the slope represents the difference in the mean marathon performance between these two groups of participants (i.e., those who rested vs. those who did not rest).' )), ### Tab1, Proposed model box ### box(width=14, title = h3("FITTING THE MODEL TO THE DATA", align="center"), collapsible = TRUE, collapsed = T, HTML("Now you can visually evaluate the length of the line. The model seems to better predict the performance of both your friends (the error is reduced). But most of all, it seems that the predictor 'Rested day before' is significantly related to the outcome variable (performance at the marathon). To know if that's the case you will have to compute a significance test, which you will find in the inference section. At this point, you're interested in including more variables and relationships in your model, so please go to the next tab in this page, labeled \"More complex models\". "), br(), br(), fluidRow( column(6, shiny::dataTableOutput('tableProposed')), column(6, radioButtons(inputId = "plot_type_prop1", label = "How do you wish to visualize it?", c("Scatterplot"="scatter","Histogram"="hist")), plotOutput('plotProposed'), textOutput('Cap.P1') )) ) ), tabPanel("Model Comparison", box(width=14, title = h3("EVALUATING YOUR FIRST MODEL - Part I", align = "center"), collapsible = T, HTML('Let us now directly compare the null model (mean finishing time) and the model with your best predictor - resting the day before. We will do so by visualising the null and the proposed model and how the additon of the predictor improves your model by visualising the reduction in sum squared errors (SSE).'), br(), HTML('Remember that you are comparing the following models:'), br(), br(), h4("Marathon Finish Time = Mean Finishing Time + Error", align="middle"), br(), h4("Marathon Finish Time = Intercept + Effect of rest + Error", align="middle"), br()), #### MODEL DETAILS #### box(width = 14, title = h3("MODEL DETAILS", align = "center"), collapsible = T, collapsed = T, HTML('In the graph below you can now see the two models: the red line, representing the Proposed Model 1, and the blue line, representing the null model. Which one of the two models allows you to make better predictions?'), br(),br(), fluidRow( column(7, shiny::dataTableOutput('tableComparison')), column(5, br(), fluidRow( column(4, p(actionButton("null", "Null Model",style='padding:12px; font-size:110%; background-color: #337ab7; font-weight: bold; color:#FFFFFF; border-color:#000000'), align="center") ), column(4, p(actionButton("full", "Proposed Model", style='padding:12px; font-size:110%; background-color: #FA8072; font-weight: bold; color:#FFFFFF; border-color:#000000'), align="center") ), column(4, p(actionButton("sse", "Error reduction", style='padding:12px; font-size:110%; background-color: grey; font-weight: bold; color:#FFFFFF; border-color: #000000; border-weight:bold'), align="center") ) ), plotOutput('plotComparison'), textOutput('Cap.Com') )), column(12, br(), br(), HTML("You can answer this question more precisely if you are able to quantify which model has the least error. You can take each participant's time and subtract the predicted time based on your model. You do this for each participant, and sum up the results. What do you obtain? The result is zero! This does not help, in fact to avoid cancelling out opposite values and in order to give each one a weight, we have to square each difference before summing them up. At this point, we have the so called 'sum of squared errors (SSE)', which we need to divide for the number of participants to obtain the 'mean sum of squared errors' (MSSE). To compare how good each model is, we need to calculate the so called 'R-squared' which is the ratio between the explained variation that the model explains (SSM) and the total variation (sum of SSE and SSM)." ), br() )), p( HTML('<a href="#top">'), actionButton(inputId="Next4", label=HTML('Next page ')), HTML("</a>") , align="right") ), tabPanel("Regression vs. ANOVA", box(width=14, h3("REGRESSION vs ONE-WAY ANOVA", align="middle"), p("Now we wil explore two cases of General Linear Models: a simple regression and a One-Way ANOVA"), HTML("A very simple explanation is the following: ►Regression is the statistical model that you use to predict a continuous outcome on the basis of one or more continuous predictor variables ► The One-way ANOVA is the statistical model that you use to predict a continuous outcome on the basis of one or more categorical predictor variables.")), box(width=14, h3("Similarities", align="center"), HTML ("1. First, both models are applicable only when you have a continuous outcome variable. A categorical outcome variable would rule out the use of either a regression model or a One-way ANOVA."), HTML ("2. Second, you can use the regression algorithm, which is based on the principle of least squares, to fit an ANOVA model. "), HTML ("3. Third, the concept of partitioning variation into sums of squares (SS) in an ANOVA model also provides a nice way to examine complex regression models. "), HTML(" In an ANOVA: the categorical variable is effect coded, which means that each categosy's mean is compared to the grand mean"), HTML(" In a Regression: the categorical variable is dummy coded, which means that each category's intercept is compared to the reference group's intercept. Since the intercept is defined as the mean value when all the other predictors are equal to zero, and there are no other predictors, the three intercepts are just means."), br (), h4 ("Look at the outputs below"), HTML ("The intercept in the regression model is simply the mean of the reference group (i.e. the American; see the ANOVA table). The coefficients for the other two groups are the differences in the mean between the reference group and the other groups (see the computations in the Regression table) "), br (), HTML ("<center> The one-way ANOVA reports each mean and a p-value that shows whether these means are significantly different from each other. Similarly, a regression reports only one mean (as an intercept) and the difference between that one mean and other means, with p-value evaluating these specific comparison. </center>"), HTML ("<center> Both the F- and p-values for the ANOVA and Regression model are identical (34.54 and <0.001 respecitvely) </center>"), uiOutput("R2equation_m1") ), box(width=14, h3("Differences", align="center"), HTML("The One-way ANOVA is a tool to check how much the residual variance is reduced by predictors in (nested regression) models, whereas the regression analysis aims to quantify effect sizes in terms of 'how much is the response expected to change when the predictor(s) change by a given amount?' ►For categorical predictors this reduces to the question to 'what is the expected difference in the response between different groups/categories?' ► For continuous predictors this is the questions for a slope"), p("To clarify: a One-way ANOVA can be applied to any regression model (no matter if the model contains only continuous, only categorical, or both kinds of predictors). A One-way ANOVA allows to assess the impact of a predictor or a whole set of predictors on the residuals: who much of the variance in the data can be explained by these predictors? The regression analysis, on the other hand, is a complementary tool to asses the quantitative relation between a predictor and the response.") ), box(width=6, HTML("<center>ANOVA Table</center>"), shiny::dataTableOutput('tableAnova1')), box(width=6, HTML("<center>Regression Table</center>"), shiny::dataTableOutput('tableReg1')), box(width=6, HTML("<center>ANOVA Output from R</center>"), verbatimTextOutput("anovaStats1")), box(width=6, HTML("<center>Regression Output from R</center>"), verbatimTextOutput("regStats1")), style='width:100%;'), #### Tab 4, X3 #### tabPanel("Report your results", box(width=14, title = h3("EVALUATING YOUR FIRST MODEL - Part II", align="center"), h4("Test statistic = Estimated effect / Error of the estimation", align="middle"), collapsible = T, br(), p("This is the basic equation underlying test statistics. Test statistics basically represent an effect in the unit if a specific theoretical distribution (here, T-Student and F distributions). When we compute a test statistics we can evaluate how likely that value (or a more extreme value) is to occur. And with this we can say if an effect is happening by chance or not."), br(), conditionalPanel(condition="input.showTstat==0", p(actionButton(inputId="showTstat", label="Press to see the relevant test statistics"), align="right") ), uiOutput("introCoeff"), DT::dataTableOutput("tableTS"), uiOutput("R2"), conditionalPanel(condition="input.select_button", uiOutput("plotsLabel"), fluidRow( column(6, plotOutput("betaplot")), column(6, plotOutput("tplot")) )), conditionalPanel(condition="input.select_button" )), ### Reporting results box ### box(width=14, title = h3("REPORTING RESULTS", align="center"), collapsible = TRUE, collapsed = T, p("Here's how R would print the results of our regression:"), verbatimTextOutput("regressionStats"), HTML('How would you report your results <a href="http://apastyle.org/">(APA style)</a>? The important effect here is given by the b1. b0 only tests the difference between the group that is coded 0 in our predictor variable (in the current case, no rest) and zero, so it is usually of little interest. So the report of the results will be on these this parameter. There are two different ways to do this:'), hr(), uiOutput("reporting1"), hr(), uiOutput("reporting2"), hr(), HTML("What about R2? Shouldn't we be reporting R2 as well?"), uiOutput("RtoPrint") )), tabPanel("Next you will learn...") ), style='width:200%;'), ############################ ## 4th - PROPOSED MODEL 2 ## ############################ tabItem("propmodel2", tabBox( tabPanel("Two predictors", box(width=14, title=h3("FURTHER IMPROVING YOUR GUESSES", align="center"), collapsible = T, HTML("<center>When you use the information about whether or not the participants had a massage on the day before to estimate their marathon finishing time, the model looks like this: "), h4("Marathon Finish Time = Intercept + Effect of rest + Effect of massage + Error", align="middle"), HTML('Yi = b0 + b1*x1 + b2*x2 + Error</center>')), box(width=14, title = h3("FITTING THE MODEL TO THE DATA", align="center"), collapsible = TRUE, collapsed = T, HTML('Below, you can find a table and graph for the proposed model with two predictors (rest and massage). Notice how instead of one line, as in the null model, or two lines, as in the proposed model with only one predictor, we now have four lines, each representing the expected value for each combination of the two dichotomous predictor variables. For example, a participant who rested and who had a massage on the day before the marathon is expected to have a worse performance (i.e., is expected to take longer) than a participant who didn\'t rest and had a massage. '), fluidRow( column(6, shiny::dataTableOutput('tableProposed.2')), column(6, radioButtons(inputId = "plot_type_prop2", label = "How do you wish to visualize it?", c("Scatterplot"="scatter","Histogram (SSE)"="hist")), plotOutput('plotProposed.2'), textOutput('Cap.P2') ) ), br(), br(), HTML("Is this model better or worse than the previous one, in which we used only rest as a predictor?") )), tabPanel("Model Comparison", box(width=14, title = h3("EVALUATING YOUR SECOND PROPOSED MODEL - Part I", align = "center"), collapsible = T, HTML('Let us now directly compare the null model (mean finishing time) and the model with two predictors - resting the day before and having a massage. We will do so by visualising the null and the proposed model with two predictors and how the additon of these predictors improve your model by visualising the reduction in sum squared errors (SSE).'), br(), HTML('Remember that you are comparing the following models:'), br(), br(), h4("Marathon Finish Time = Mean Finishing Time + Error", align="middle"), br(), h4("Marathon Finish Time = Intercept + Effect of rest + Effect of massage + Error", align="middle"), br()), #### MODEL DETAILS #### box(width = 14, title = h3("MODEL DETAILS", align = "center"), collapsible = T, collapsed = T, HTML('In the graph below you can now see two models: the blue line, representing the null model, and a set of four lines: purple, red, yellow, and green, representing the proposed model with two predictors. Remember that the four lines in the proposed model now represent the expected value depending on the combination of values of the participant in the two predictor variables. By clicking the buttons above the graph you can see how the distances change depending on which model (i.e., line or set of lines) is used to predict the data. '), br(),br(), fluidRow( column(7, shiny::dataTableOutput('tableComparison2')), column(5, br(), fluidRow( column(4, p(actionButton("null2", "Null Model",style='padding:12px; font-size:110%; background-color: #337ab7; font-weight: bold; color:#FFFFFF; border-color:#000000'), align="center") ), column(4, p(actionButton("full2", "Proposed Model", style='padding:12px; font-size:110%; background-color: #FA8072; font-weight: bold; color:#FFFFFF; border-color:#000000'), align="center") ), column(4, p(actionButton("sse2", "Error reduction", style='padding:12px; font-size:110%; background-color: grey; font-weight: bold; color:#FFFFFF; border-color: #000000; border-weight:bold'), align="center") ), plotOutput('plotComparison2'), br(),br(), br(),br(), textOutput('Cap.Com2') ))), HTML(" Based on the scatterplot and the histogram, which one of the two models allows you to make better predictions?") )), tabPanel("Report your results", box(width=14, title = h3("EVALUATING YOUR SECOND PROPOSED MODEL - Part II", align="center"), h4("Test statistic = Estimated effect / Error of the estimation", align="middle"), collapsible = T, br(), p("This is the basic equation underlying test statistics. Test statistics basically represent an effect in the unit if a specific theoretical distribution (here, T-Student and F distributions). When we compute a test statistics we can evaluate how likely that value (or a more extreme value) is to occur. And with this we can say if an effect is happening by chance or not."), br(), conditionalPanel(condition="input.showTstat2==0", p(actionButton(inputId="showTstat2", label="Press to see the relevant test statistics"), align="right") ), uiOutput("introCoeff2"), DT::dataTableOutput("tableTS2"), uiOutput("R2.2"), conditionalPanel(condition="input.select_button2", uiOutput("plotsLabel2"), fluidRow( column(6, plotOutput("betaplot2")), column(6, plotOutput("tplot2")) )), conditionalPanel(condition="input.select_button2" )), ### Reporting results box ### box(width=14, title = h3("REPORTING RESULTS", align="center"), collapsible = TRUE, collapsed = T, p("Here's how R would print the results of our regression:"), verbatimTextOutput("regressionStats2"), HTML('How would you report your results <a href="http://apastyle.org/">(APA style)</a>? The important effects here are given by the b1 and the b2 parameters. b0 only tests the difference between the group that is coded 0 in all dichotomous variables (in the current case, no rest and no massage) and zero, so it is usually of little interest. So the report of the results will be on these two parameters. There are two different ways to do this:'), hr(), uiOutput("reporting1_2"), hr(), uiOutput("reporting2_2") ) ), tabPanel("Next you will learn...") ), style='width:200%;'), ############################ ## 5th - PROPOSED MODEL 3 ## ############################ tabItem("propmodel3", tabBox( tabPanel("Interaction", box(width=14, title=h3("FURTHER IMPROVING YOUR GUESSES", align="center"), collapsible = T, HTML("<center>When you use the information about both predictor and add the interaction between them to estimate marathon finishing time, the model looks like this: "), h4("Marathon Finish Time = Intercept + Effect of rest + Effect of massage + (Effect of rest * Effect of massage) + Error", align="middle"), HTML('Yi = b0 + b1*x1 + b2*x2 + b3(x1*x2) + Error</center>')), box(width=14, title = h3("FITTING THE MODEL TO THE DATA", align="center"), collapsible = TRUE, collapsed = T, fluidRow( column(6, shiny::dataTableOutput('tableProposed.3')), column(6, radioButtons(inputId = "plot_type_prop3", label = "How do you wish to visualize it?", c("Scatterplot"="scatter","Histogram (SSE)"="hist")), plotOutput('plotProposed.3'), textOutput('Cap.P3') ) ), br(), br(), HTML("Is this model better or worse than the previous one, in which we did not consider the interaction between both variables?") )), tabPanel("Model Comparison", box(width=14, title = h3("EVALUATING YOUR THIRD PROPOSED MODEL - Part I", align = "center"), collapsible = T, HTML('Let us now directly compare the null model and the model with two predictors and the interaction between them. We will do so by visualising both models and how the additon of the interaction improves your model by visualising the reduction in sum squared errors (SSE).'), br(), HTML('Remember that you are comparing the following models:'), br(), br(), h4("Marathon Finish Time = Mean Finishing Time + Error", align="middle"), br(), h4("Marathon Finish Time = Intercept + Effect of rest + Effect of massage + (Effect of rest * Effect of massage) + Error", align="middle"), br()), #### MODEL DETAILS #### box(width = 14, title = h3("MODEL DETAILS", align = "center"), collapsible = T, collapsed = T, #HTML('In the graph below you can now see two models: the blue line, representing the null model, and a set of four lines: purple, red, yellow, and green, representing the proposed model with two predictors. Remember that the four lines in the proposed model now represent the expected value depending on the combination of values of the participant in the two predictor variables. # By clicking the buttons above the graph you can see how the distances change depending on which model (i.e., line or set of lines) is used to predict the data. '), br(),br(), fluidRow( column(7, shiny::dataTableOutput('tableComparison3')), column(5, br(), fluidRow( column(4, p(actionButton("null3", "Null Model",style='padding:12px; font-size:110%; background-color: #337ab7; font-weight: bold; color:#FFFFFF; border-color:#000000'), align="center")), column(4, p(actionButton("full3", "Proposed Model", style='padding:12px; font-size:110%; background-color: #FA8072; font-weight: bold; color:#FFFFFF; border-color:#000000'), align="center") ), column(4, p(actionButton("sse3", "Error reduction", style='padding:12px; font-size:110%; background-color: grey; font-weight: bold; color:#FFFFFF; border-color: #000000; border-weight:bold'), align="center") ), plotOutput('plotComparison3'), br(),br(), br(),br(), textOutput('Cap.Com3') ))), HTML(" Based on the scatterplot and the histogram, which one of the models allows you to make better predictions?") )), tabPanel("Regression vs. ANOVA", box(width=14, h3("REGRESSION vs TWO-WAY ANOVA", align="middle"), p("Now we wil explore two cases of General Linear Models: a multiple regression and a two-way ANOVA"), HTML("A very simple explanation is the following: ► Multiple regression is the statistical model that you use to predict a continuous outcome on the basis of more than one continuous predictor variable. ► The two-way ANOVA is the statistical model that you use to predict a continuous outcome on the basis of two categorical predictor variables.")), box(width=14, h3("Similarities", align="center"), HTML ("1. First, both models are applicable only when you have a continuous outcome variable. A categorical outcome variable would rule out the use of both a regression model or an ANOVA."), HTML ("2. Second, you can use the regression algorithm, which is based on the principle of least squares, to fit an ANOVA model. "), HTML ("3. Third, the concept of partitioning variation into sums of squares (SS) in an ANOVA model also provides a nice way to examine complex regression models. "), HTML(" In an ANOVA: the categorical variable is effect coded, which means that each categosy's mean is compared to the grand mean."), HTML(" In a regression: the categorical variable is dummy coded, which means that each category's intercept is compared to the reference group's intercept. Since the intercept is defined as the mean value when all the other predictors are equal to zero, and there are no other predictors, the three intercepts are just means."), br (), h4 ("Look at the outputs below"), HTML ("The intercept in the regression model is simply the mean of the reference group (i.e. the American; see the ANOVA table). The coefficients for the other two groups are the differences in the mean between the reference group and the other groups (see the computations in the Regression table) "), br (), HTML ("<center> The one-way ANOVA reports each mean and a p-value that shows whether these means are significantly different from each other. Similarly, a regression reports only one mean (as an intercept) and the difference between that one mean and other means, with p-value evaluating these specific comparison. </center>"), HTML ("<center> Both the F- and p-values for the ANOVA and Regression model are identical (34.54 and <0.001 respecitvely) </center>"), uiOutput("R2equation_m2") ), box(width=14, h3("Differences", align="center"), HTML("The One-way ANOVA is a tool to check how much the residual variance is reduced by predictors in (nested regression) models, whereas the regression analysis aims to quantify effect sizes in terms of 'how much is the response expected to change when the predictor(s) change by a given amount?' ►For categorical predictors this reduces to the question to 'what is the expected difference in the response between different groups/categories?' ► For continuous predictors this is the questions for a slope"), p("To clarify: a One-way ANOVA can be applied to any regression model (no matter if the model contains only continuous, only categorical, or both kinds of predictors). A One-way ANOVA allows to assess the impact of a predictor or a whole set of predictors on the residuals: who much of the variance in the data can be explained by these predictors? The regression analysis, on the other hand, is a complementary tool to asses the quantitative relation between a predictor and the response.") ), box(width=6, HTML("<center>ANOVA Table</center>"), shiny::dataTableOutput('tableAnova2')), box(width=6, HTML("<center>Regression Table</center>"), shiny::dataTableOutput('tableReg2')), box(width=6, HTML("<center>ANOVA Output from R</center>"), verbatimTextOutput("anovaStats2")), box(width=6, HTML("<center>Regression Output from R</center>"), verbatimTextOutput("regStats2")), style='width:100%;'), tabPanel("Report your results", box(width=14, title = h3("EVALUATING YOUR THIRD PROPOSED MODEL - Part II", align="center"), h4("Test statistic = Estimated effect / Error of the estimation", align="middle"), collapsible = T, br(), p("This is the basic equation underlying test statistics. Test statistics basically represent an effect in the unit if a specific theoretical distribution (here, T-Student and F distributions). When we compute a test statistics we can evaluate how likely that value (or a more extreme value) is to occur. And with this we can say if an effect is happening by chance or not."), br(), conditionalPanel(condition="input.showTstat3==0", p(actionButton(inputId="showTstat3", label="Press to see the relevant test statistics"), align="right") ), uiOutput("introCoeff3"), DT::dataTableOutput("tableTS3"), uiOutput("R2.3"), conditionalPanel(condition="input.select_button3", uiOutput("plotsLabel3"), fluidRow( column(6, plotOutput("betaplot3")), column(6, plotOutput("tplot3")) )), conditionalPanel(condition="input.select_button3" )), ### Reporting results box ### box(width=14, title = h3("REPORTING RESULTS", align="center"), collapsible = TRUE, collapsed = T, p("Here's how R would print the results of our regression:"), verbatimTextOutput("regressionStats3"), HTML('How would you report your results <a href="http://apastyle.org/">(APA style)</a>? The important effects here are given by b1, b2, and the interaction parameters. Again, b0 only tests the difference between the group that is coded 0 in all dichotomous variables (in the current case, no rest and no massage) and zero, so it is usually of little interest. So the report of the results will be on these three parameters. There are two different ways to do this:'), hr(), uiOutput("reporting1_3"), hr(), uiOutput("reporting2_3") ) ), tabPanel("Summary...") ), style='width:200%;'), ############## ## GLOSSARY ## ############## tabItem(tabName = "glossary", box(width=12, title = h3("GLOSSARY", align="center"), collapsible = T, p("Here you will find the definitions of the key statistical terms that you have covered whilst using this app.", align="center"), br(), shiny::dataTableOutput("tablegloss")), box(width = 12, title = h3("RECOMMENDED READING", align="center"), collapsible = T, HTML('Judd, C. M., & McClelland, G. H. (2017). Data analysis: A model comparison approach (3rd ed.). San Diego, CA: Harcourt, Brace, Jovanovich.') ) ) ))))

.main-header .logo { font-size: 125%; } .bootpop { color: #333333 !important; text-decoration: none !important; border-bottom: 1px dotted black; } .actionclick { cursor:pointer; text-decoration: none !important; } #button_11 { background-color: #337ab7; } #button_21 { background-color: #00CD00; } ul { padding: 15px; } ol { padding: 15px; }