What type of ANOVA should you run if you have multiple independent variables?

ANOVA stands for analysis of variance and tests for differences in the effects of independent variables on a dependent variable. A two-way ANOVA test is a statistical test used to determine the effect of two nominal predictor variables on a continuous outcome variable.

A two-way ANOVA tests the effect of two independent variables on a dependent variable, analyzing both their individual effects on the expected outcome and their relationship to the outcome itself. Random factors are considered to have no statistical influence on a data set, while systematic factors are considered statistically significant.

By using ANOVA, a researcher is able to determine whether the variability of the outcomes is due to chance or to the factors in the analysis. ANOVA has many applications in finance, economics, science, medicine, and social science.
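As a minimal sketch of a two-way ANOVA in base R (using the built-in npk field-trial data purely for illustration; this dataset is not used elsewhere in this article):

# yield is the continuous outcome; N and P are two-level factors
fit <- aov(yield ~ N * P, data = npk)
summary(fit) # main effects of N and P, plus the N:P interaction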

Key Takeaways

  • A two-way ANOVA is an extension of the one-way ANOVA (analysis of variance) that reveals the effects of two independent variables on a dependent variable.
  • A two-way ANOVA test is a statistical technique that analyzes the effect of the independent variables on the expected outcome along with their relationship to the outcome itself.
  • ANOVA has many applications in finance, economics, science, medicine, and social science.

Understanding Two-Way ANOVA

An ANOVA test is the first step in identifying factors that influence a given outcome. Once an ANOVA test is performed, a tester may be able to perform further analysis on the systematic factors that are statistically contributing to the data set's variability.

A two-way ANOVA test reveals the effects of two independent variables on a dependent variable. ANOVA test results can then be used in an F-test, a statistical test used to determine whether two populations with normal distributions share the same variance (or standard deviation), to assess the significance of the regression model overall.
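For instance, an F-test of the equality of two variances can be run in base R with var.test(); a quick sketch using the built-in ToothGrowth data (again only for illustration):

# do the two supplement groups share a variance?
var.test(len ~ supp, data = ToothGrowth)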

Analysis of variance is helpful for testing the effects of variables on one another. It is similar to running multiple two-sample t-tests; however, it results in fewer type I errors and is appropriate for a wider range of problems. An ANOVA test assesses group differences by comparing the means of each group and partitioning the variance across different sources. It is used with subjects and test groups, for both between-group and within-group comparisons.
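To illustrate the link with multiple t-tests, here is a sketch on the built-in iris data (three groups), contrasting a single one-way ANOVA F-test with Bonferroni-adjusted pairwise t-tests:

summary(aov(Sepal.Length ~ Species, data = iris)) # one F-test across all groups
pairwise.t.test(iris$Sepal.Length, iris$Species,
  p.adjust.method = "bonferroni"
) # pairwise t-tests with adjusted p-values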

One-Way ANOVA vs. Two-Way ANOVA

There are two main types of analysis of variance: one-way (or unidirectional) and two-way (bidirectional). One-way or two-way refers to the number of independent variables in your analysis of variance test. A one-way ANOVA evaluates the impact of a sole factor on a sole response variable. It determines whether the observed differences between the means of independent (unrelated) groups are explainable by chance alone, or whether there are any statistically significant differences between groups.

A two-way ANOVA is an extension of the one-way ANOVA. With a one-way ANOVA, you have one independent variable affecting a dependent variable. With a two-way ANOVA, there are two independent variables. For example, a two-way ANOVA allows a company to compare worker productivity based on two independent variables, such as department and gender. It is utilized to observe the interaction between the two factors: it tests the effect of two factors at the same time.

A three-way ANOVA, also known as three-factor ANOVA, is a statistical means of determining the effect of three factors on an outcome.
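In base R, a three-way ANOVA simply extends the model formula with a third factor; a sketch with the same illustrative npk data:

summary(aov(yield ~ N * P * K, data = npk)) # three factors and their interactions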

Although my old R routine was working quite well and was applicable to different projects with only minor changes, I was still unsatisfied with another point.

Someone who is proficient in statistics and R can read and interpret the output of a t-test without any difficulty. However, as you may have noticed with your own statistical projects, most people do not know what to look for in the results and are sometimes a bit confused when they see so many graphs, code, output, results and numeric values in a document. They are quite easily overwhelmed by this mass of information and unable to extract the key message.

With my old R routine, the time I was saving by automating the process of t-tests and ANOVA was (partially) lost when I had to explain R outputs to my students so that they could interpret the results correctly. Although most of the time it simply boiled down to pointing out what to look for in the outputs (i.e., p-values), I was still losing quite a lot of time because these outputs were, in my opinion, too detailed for most real-life applications and for students in introductory classes. In other words, so much information seemed to confuse many people, so I was still not convinced that it was the best way to share statistical results with nonscientists.

Of course, they came to me for statistical advice, so they expected to have these results, and I needed to give them answers to their questions and hypotheses. Nonetheless, I wanted to find a better way to communicate these results to this type of audience, with the minimum of information required to arrive at a conclusion. No more and no less than that.

After a long time spent online trying to figure out a way to present results in a more concise and readable way, I discovered the {ggpubr} package. This package allows you to indicate the test used and the p-value of the test directly on a ggplot2-based graph. It also facilitates the creation of publication-ready plots for non-advanced statistical audiences.

After many refinements and modifications of the initial code (available in this article), I finally came up with a rather stable and robust process to perform t-tests and ANOVA for more than one variable at once, and more importantly, make the results concise and easily readable by anyone (statisticians or not).

A graph is worth a thousand words, so here are the exact same tests as in the previous section, but this time with my new R routine:

library(ggpubr)

# Edit from here #
x <- which(names(dat) == "Species") # name of grouping variable
y <- which(names(dat) %in% c(
  "Sepal.Length", "Sepal.Width",
  "Petal.Length", "Petal.Width"
)) # names of variables to test
method <- "t.test" # one of "wilcox.test" or "t.test"
paired <- FALSE # if paired, make sure the dataframe lists all individuals at T1 first, then all individuals again (in the same order) at T2
# Edit until here


# Edit at your own risk
for (i in y) {
  for (j in x) {
    # choose a paired or an unpaired plot depending on the design
    if (paired) {
      p <- ggpaired(dat,
        x = colnames(dat[j]), y = colnames(dat[i]),
        color = colnames(dat[j]), line.color = "gray", line.size = 0.4,
        palette = "npg",
        legend = "none",
        xlab = colnames(dat[j]),
        ylab = colnames(dat[i]),
        add = "jitter"
      )
    } else {
      p <- ggboxplot(dat,
        x = colnames(dat[j]), y = colnames(dat[i]),
        color = colnames(dat[j]),
        palette = "npg",
        legend = "none",
        add = "jitter"
      )
    }
    #  Add p-value
    print(p + stat_compare_means(aes(label = paste0(..method.., ", p-value = ", ..p.format..)),
      method = method,
      paired = paired,
      # group.by = NULL,
      ref.group = NULL
    ))
  }
}

[Four boxplots, one per variable (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width), comparing the two species and annotated with the test name and p-value]

As you can see from the graphs above, only the most important information is presented for each variable:

  • a visual comparison of the groups thanks to boxplots
  • the name of the statistical test
  • the p-value of the test

Of course, experts may be interested in more advanced results. However, this simple yet complete graph, which includes the name of the test and the p-value, gives all the necessary information to answer the question: “Are the groups different?”.

In my experience, I have noticed that students and professionals (especially those from a less scientific background) understand these results far better than the ones presented in the previous section.

The only lines of code that need to be modified for your own project are the name of the grouping variable (Species in the above code), the names of the variables you want to test (Sepal.Length, Sepal.Width, etc.), whether you want to apply a t-test (t.test) or a Wilcoxon test (wilcox.test), and whether the samples are paired or not (FALSE if samples are independent, TRUE if they are paired).
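For instance, for a hypothetical data frame with a grouping variable named group and two numeric variables named var1 and var2 (illustrative names, not from the dataset above), the edited lines would become:

x <- which(names(dat) == "group") # name of grouping variable
y <- which(names(dat) == "var1" |
  names(dat) == "var2") # names of variables to test
method <- "wilcox.test" # one of "wilcox.test" or "t.test"
paired <- FALSE # TRUE if samples are paired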

Based on these graphs, it is easy, even for non-experts, to interpret the results and conclude that the two species are significantly different in terms of all 4 variables (since all p-values \(< \frac{0.05}{4} = 0.0125\); recall that the Bonferroni correction is applied to avoid the issue of multiple testing, so we divide the usual \(\alpha\) level by 4 because there are 4 t-tests).
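Note that comparing each raw p-value to \(\frac{\alpha}{4}\) is equivalent to comparing the Bonferroni-adjusted p-values to \(\alpha\); a quick sketch of this equivalence with illustrative p-values:

p <- c(0.001, 0.004, 0.02, 0.03) # illustrative raw p-values
p.adjust(p, method = "bonferroni") # equals pmin(p * length(p), 1)
p.adjust(p, method = "bonferroni") <= 0.05 # same decisions as p <= 0.05 / 4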

Additional p-value adjustment methods

If you would like to use another p-value adjustment method, you can use the p.adjust() function. Below are the raw p-values found above, together with p-values derived from the main adjustment methods (presented in a dataframe):

# compute one raw p-value per variable (two-sample t-test by Species)
raw_pvalue <- numeric(length = 4)
for (i in 1:4) {
  raw_pvalue[i] <- t.test(dat[, i] ~ dat$Species,
    paired = FALSE,
    alternative = "two.sided"
  )$p.value
}

df <- data.frame(
  Variable = names(dat[, 1:4]),
  raw_pvalue = round(raw_pvalue, 3)
)

df$Bonferroni <-
  p.adjust(df$raw_pvalue,
    method = "bonferroni"
  )
df$BH <-
  p.adjust(df$raw_pvalue,
    method = "BH"
  )
df$Holm <-
  p.adjust(df$raw_pvalue,
    method = "holm"
  )
df$Hochberg <-
  p.adjust(df$raw_pvalue,
    method = "hochberg"
  )
df$Hommel <-
  p.adjust(df$raw_pvalue,
    method = "hommel"
  )
df$BY <-
  round(p.adjust(df$raw_pvalue,
    method = "BY"
  ), 3)
df
##       Variable raw_pvalue Bonferroni    BH  Holm Hochberg Hommel    BY
## 1 Sepal.Length      0.000      0.000 0.000 0.000    0.000  0.000 0.000
## 2  Sepal.Width      0.002      0.008 0.002 0.002    0.002  0.002 0.004
## 3 Petal.Length      0.000      0.000 0.000 0.000    0.000  0.000 0.000
## 4  Petal.Width      0.000      0.000 0.000 0.000    0.000  0.000 0.000

Regardless of the p-value adjustment method, the two species are different for all 4 variables. Note that the adjustment method should be chosen before looking at the results to avoid choosing the method based on the results.

Below is another function that allows you to perform multiple Student's t-tests or Wilcoxon tests at once and to choose the p-value adjustment method. The function also allows you to specify whether the samples are paired and whether the variances are assumed to be equal. (The code has been adapted from Mark White's article.)

t_table <- function(data, dvs, iv,
                    var_equal = TRUE,
                    p_adj = "none",
                    alpha = 0.05,
                    paired = FALSE,
                    wilcoxon = FALSE) {
  if (!inherits(data, "data.frame")) {
    stop("data must be a data.frame")
  }

  if (!all(c(dvs, iv) %in% names(data))) {
    stop("at least one column given in dvs and iv are not in the data")
  }

  if (!all(sapply(data[, dvs], is.numeric))) {
    stop("all dvs must be numeric")
  }

  if (length(unique(na.omit(data[[iv]]))) != 2) {
    stop("independent variable must only have two unique values")
  }

  out <- lapply(dvs, function(x) {
    if (paired == FALSE & wilcoxon == FALSE) {
      tres <- t.test(data[[x]] ~ data[[iv]], var.equal = var_equal)
    } else if (paired == FALSE & wilcoxon == TRUE) {
      tres <- wilcox.test(data[[x]] ~ data[[iv]])
    } else if (paired == TRUE & wilcoxon == FALSE) {
      tres <- t.test(data[[x]] ~ data[[iv]],
        var.equal = var_equal,
        paired = TRUE
      )
    } else {
      tres <- wilcox.test(data[[x]] ~ data[[iv]],
        paired = TRUE
      )
    }

    c(
      p_value = tres$p.value
    )
  })

  out <- as.data.frame(do.call(rbind, out))
  out <- cbind(variable = dvs, out)
  names(out) <- gsub("[^0-9A-Za-z_]", "", names(out))

  # adjust the p-values first, then format for display; this avoids
  # comparing the character string "<0.001" with the numeric alpha
  adj_pvalue <- p.adjust(out$p_value, p_adj)
  out$conclusion <- ifelse(adj_pvalue < alpha,
    paste0("Reject H0 at ", alpha * 100, "%"),
    paste0("Do not reject H0 at ", alpha * 100, "%")
  )
  out$p_value <- ifelse(adj_pvalue < 0.001,
    "<0.001",
    round(adj_pvalue, 3)
  )

  return(out)
}

Applied to our dataset, with no adjustment method for the p-values:

result <- t_table(
  data = dat,
  c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  "Species"
)

result
##       variable p_value      conclusion
## 1 Sepal.Length  <0.001 Reject H0 at 5%
## 2  Sepal.Width   0.002 Reject H0 at 5%
## 3 Petal.Length  <0.001 Reject H0 at 5%
## 4  Petal.Width  <0.001 Reject H0 at 5%

And with the Holm (1979) adjustment method:

result <- t_table(
  data = dat,
  c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  "Species",
  p_adj = "holm"
)

result
##       variable p_value      conclusion
## 1 Sepal.Length  <0.001 Reject H0 at 5%
## 2  Sepal.Width   0.002 Reject H0 at 5%
## 3 Petal.Length  <0.001 Reject H0 at 5%
## 4  Petal.Width  <0.001 Reject H0 at 5%

Again, with Holm's adjustment method, we conclude that, at the 5% significance level, the two species are significantly different from each other in terms of all 4 variables.

How many independent variables can you have in an ANOVA?

ANOVA, which stands for Analysis of Variance, is a statistical test used to analyze the difference between the means of more than two groups. A one-way ANOVA uses one independent variable, while a two-way ANOVA uses two independent variables.

Can you use ANOVA for 3 variables?

A three-way ANOVA tests which of three separate variables have an effect on an outcome, as well as the relationships among the three variables. It is also called a three-factor ANOVA, with ANOVA standing for "analysis of variance."

What type of test is conducting an ANOVA with three independent groups?

Typically, a one-way ANOVA is used when you have three or more categorical, independent groups, but it can be used for just two groups (but an independent-samples t-test is more commonly used for two groups).
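As a short sketch (using the full iris data, which contain three species, rather than the two-species subset analyzed above):

oneway.test(Sepal.Length ~ Species, data = iris) # one-way (Welch) ANOVA with 3 groups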

How many independent variables are used in a 2x3 ANOVA?

An ANOVA can have only one dependent variable, but it can have more than one independent variable. A 2x3 factorial design means there are two independent variables: one with two levels and one with three levels.
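The built-in ToothGrowth data happen to form such a 2x3 design (supp has two levels, dose has three), so a sketch of this analysis could be:

ToothGrowth$dose <- factor(ToothGrowth$dose) # 0.5, 1, 2 -> three levels
table(ToothGrowth$supp, ToothGrowth$dose) # the 2 x 3 layout of cell counts
summary(aov(len ~ supp * dose, data = ToothGrowth)) # 2x3 two-way ANOVA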