Hi, and welcome back. We're on Segment two of Lecture fourteen on Analysis of Variance, and in this segment we will step through a one-way between-groups analysis of variance.
The example I'm going to use for this segment takes us back to the first week of this course, where I talked about training your working memory to enhance intelligence. If you remember, this paper was published by Jaeggi et al. in 2008, where the independent variable was the number of training sessions that each individual subject performed.
So again, to be clear, subjects came to this lab at the University of Michigan and the experimenters randomly assigned them to one of four training conditions: that is, eight, twelve, seventeen, or nineteen days of training. For the dependent variable, there are a couple of ways to look at it. One is that they had subjects perform an intelligence test before and after training.
So, just for ease of demonstration, and so we can do a one-way between-groups ANOVA, what I'm doing here is using the gain score as the dependent variable. That's just the score on the intelligence test after training minus the score on the intelligence test before training. So, just post minus pre.
This is what the data frame would look like if we were working in R.
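Just to make that concrete, here is a minimal sketch of such a data frame; the scores shown are hypothetical placeholders, not the actual numbers from the study.

# A sketch of the data frame, with hypothetical pre- and post-scores.
# 'condition' is the number of training sessions (8, 12, 17, or 19 days).
training <- data.frame(
  subject   = 1:4,
  condition = factor(c(8, 12, 17, 19)),
  pre       = c(10, 11, 9, 12),    # intelligence test score before training
  post      = c(11, 13, 13, 17))   # intelligence test score after training
training$gain <- training$post - training$pre   # gain score: post minus pre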
So, just to be clear: we would have the subject number in the first column, then which condition they were in, that is, how many training sessions they completed, then their pre-score, their post-score, and their gain score, which is just the difference. In this lecture, what I'd like to do is actually do a one-way ANOVA by hand. So,
actually do a one-way ANOVA by hand. So,
I'm going to set up the data this way. This is not how it would look if you were working in a statistics software package, and it's not how it would look in R; you'll see that in an upcoming lecture, where we do analysis of variance in R.
But just to show you the data: again, I just made up data to be consistent with their results, and I'm presenting it this way because it's easier for doing hand calculations.
So, let's just assume there were five subjects in each of the four conditions.
So, twenty subjects total came to the lab, we randomly assigned them to one of these
four conditions, and these are their gain scores.
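Since I can't reproduce the slide here, the sketch below uses stand-in gain scores of my own, chosen so that the group means are 1, 2, 4, and 5 and the within-groups sum of squares is 26, matching every number worked out later in this segment. One row per group:

gain <- rbind(g08 = c(0, 0, 1, 1, 3),   # 8 days:  mean = 1
              g12 = c(1, 1, 2, 2, 4),   # 12 days: mean = 2
              g17 = c(3, 3, 4, 4, 6),   # 17 days: mean = 4
              g19 = c(3, 5, 5, 5, 7))   # 19 days: mean = 5
rowMeans(gain)   # group means: 1, 2, 4, 5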
So, if you just eyeball it you'll see that people didn't gain much if they only did
eight days of training, they gained a little more if they did twelve, and so on.
So, we could do a one-way analysis of variance to analyze these data to ask the
question, is there a significant difference across those four groups of
subjects? That is, does training enhance
intelligence? That's the hypothesis under
investigation. The way to do that is to calculate an F
ratio. And the F ratio, you can think about it a few different ways. One way I really like to think about it conceptually is as systematic variance relative to unsystematic variance.
Remember, the T test was a deviation score, that is, the difference between two
means, so, like, mean one minus mean two over standard error of the difference.
That was how much of a difference did we observe relative to how much of a
difference did we expect due to chance? The F ratio is the same idea. Or, I should say, it's analogous.
Except now, instead of dealing with deviation scores, we're dealing with
variance. So, just square everything.
Now, we have variance over variance, and it's just systematic variance over unsystematic variance: how much variance did we create with the introduction of our independent variable, relative to how much variance we observe just due to chance?
And the really cool thing about analysis of variance is that this is easily calculated by just looking at the between-groups variance relative to the within-groups variance. So, I might rewrite that as mean squares
between over mean squares within. But the notation scheme I prefer, and this comes from the Keppel and Wickens textbook, which is an excellent textbook, is mean squares sub A over mean squares sub S-within-A. The way to read that last term is subjects within groups. So, A refers to groups.
S refers to subjects. So, the F ratio, I can write as MS sub A over MS sub S-within-A. And then, mean squares, again, is always
just sums of squares over degrees of freedom.
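In symbols, using that Keppel and Wickens notation:

$$F = \frac{MS_A}{MS_{S/A}}, \qquad MS_A = \frac{SS_A}{df_A}, \qquad MS_{S/A} = \frac{SS_{S/A}}{df_{S/A}}$$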
The trick is calculating the sums of squares.
So, here's the sums of squares between groups.
What we want to know is: how much are the group means varying from each other?
The way to do that is to take each individual group mean and compare it to
the grand mean, or the overall mean from the experiment.
So, I'm going to take each individual group mean and compare it to the grand
mean. I'll get those deviation scores, I'll
square them, I'll sum them, sums of squares.
And then divide by the number of groups minus one, and that will give me mean
squares. I also need to pre-multiply by the number
of subjects in each group. The way to think about this is each
individual subject is contributing a deviation score to both the between groups
sums of squares, and the within groups sums of squares.
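Written out, with a groups, n subjects per group, \bar{Y}_j the mean of group j, and \bar{Y}_T the grand mean:

$$SS_A = n \sum_{j=1}^{a} (\bar{Y}_j - \bar{Y}_T)^2$$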
So how to calculate the within group sum of squares?
I just take each individual subject within a condition and compare them to their
group mean. That's error variance. That's our unsystematic variance. Why are people in the same condition, people who are treated exactly the same, who went through the exact same amount of training, differing from each other?
I don't know, that's unsystematic, that's just chance error.
So, our estimate of the average amount of chance error, like the standard error in the t-test, is calculated this way: just by looking at each individual and how much they differ from the other individuals in their same condition.
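In the same notation, with Y_{ij} the score of subject i in group j:

$$SS_{S/A} = \sum_{j=1}^{a} \sum_{i=1}^{n} (Y_{ij} - \bar{Y}_j)^2$$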
Degrees of freedom is always the number of scores that go into that variance term
minus one. So, for df sub A, it's just the number of levels of the independent variable A minus
one. In this case, four minus one, or three.
Degrees of freedom S-within-A is how many subjects are in each condition, or each group, minus one, times the number of conditions.
And df total is the sum of df A and df S-within-A; it's the total number of subjects minus one. What's common in an analysis of variance
is to summarize all of those numbers in an ANOVA summary table, where you list the sources of the variance: between groups, within groups, and
total. And here, I've just summarized all the
formulas we just looked at. So, to make these more clear, we're going
to walk through this example and actually do all the calculations by hand.
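For reference, the summary table with those formulas filled in looks like this (a = number of groups, n = subjects per group):

Source               SS                                     df         MS               F
Between groups (A)   n * sum((group mean - grand mean)^2)   a - 1      SS_A / df_A      MS_A / MS_S/A
Within groups (S/A)  sum((score - group mean)^2)            a(n - 1)   SS_S/A / df_S/A
Total                SS_A + SS_S/A                          an - 1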
Again, I've been teaching this course a long time, and this is one place where students tell me it really helps conceptually to walk through and do an ANOVA by hand just once. We'll do this once, and then, in an upcoming lecture, we'll do it in R.
Okay. So, again, to go back to the summary table: let's start here with sums of squares between groups, then we'll do sums of squares within groups. That's the hardest part. Once we've got that, it's all just simple algebra to get the degrees of freedom, the mean squares, and the F ratio.
Okay, so here's sums of squares between groups.
And again, I just want to take each group mean and compare it to the grand mean.
So, there are four groups. The mean of the first group was one, the mean of the second group was two, the mean of the third group was four, and the mean of the fourth group was five. The overall mean, the grand mean, was three.
So, what I'm doing is getting each deviation score, but now I'm comparing group means to the grand mean. I square all those and I sum them; that's my sums of squares. And remember, I have to premultiply by little n, where little n is the number of subjects within a group, because, again, each subject is contributing a deviation score. So, the sum here is ten; premultiply that by five, and I get 50. Here's the sums of squares within groups.
And again, the group means were one, two, four, and five.
These scores are the individual subject scores within a group.
So, to do the first group, that's the top row, I calculate the within-group sums of squares by just taking each individual subject's score and comparing it to the group mean. Then I do it for the second group, third group, and fourth group; each row represents a group. Sum all that, and I get 26. Degrees of freedom, as I said, are real easy. It's just four minus one, or three, for the between groups. Within groups is little n minus one, because there are five people in each group, times the number of groups: four times four is sixteen.
Mean squares is just sums of squares over degrees of freedom. And the F is the ratio of the mean squares. So, it's roughly sixteen over 1.6, and we get an F ratio of about ten.
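If you want to check the hand calculation in R, here it is with the stand-in gain-score matrix from earlier:

group_means <- rowMeans(gain)                    # 1, 2, 4, 5
grand_mean  <- mean(gain)                        # 3
ss_A  <- 5 * sum((group_means - grand_mean)^2)   # 5 * 10 = 50
ss_SA <- sum((gain - group_means)^2)             # 26; the vector recycles so
                                                 # each row (group) is centered
                                                 # on its own mean
df_A  <- 4 - 1                                   # 3
df_SA <- 4 * (5 - 1)                             # 16
ms_A  <- ss_A / df_A                             # about 16.7
ms_SA <- ss_SA / df_SA                           # 1.625
ms_A / ms_SA                                     # F is about 10.3; the lecture
                                                 # rounds this to 16 / 1.6 = 10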
It's a big F ratio; I don't even need to look at degrees of freedom to know that that's significant. But remember, the F test has a family of F distributions that we need to look at to get the right p-value.
Just like the T test has a family of T distributions.
It depends on the number of groups and the number of subjects within a group.
But, in this case, the p-value is really low; it's 0.0005, and I will reject the null hypothesis. Again, where exactly does that p-value
come from? It comes from the sampling distributions.
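In R, that p-value is just the upper-tail area of the F distribution with 3 and 16 degrees of freedom, beyond the observed F:

pf(10.26, df1 = 3, df2 = 16, lower.tail = FALSE)   # roughly 0.0005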
And the difference between these curves is just how many groups we have and how many subjects we have. This is the family of F distributions, like the family of t distributions. And that's why, when I said F is ten, I knew it was significant: look out here at the end of this graph; this is an F of two right here. So, ten is way out here, and I know that the area under the curve beyond it is really low, less than 0.001. So, it's significant. We reject the null hypothesis.
The difference between those groups is statistically significant.
Of course, there are a few more things we want to do. Just like when we did a t-test, we want to calculate effect size. And there's an extra step here: we need to do what are called post-hoc tests. We know that the manipulation of the number of training sessions was significant, but we don't know whether there was a significant difference between the first group and the second group, the second and third, and so on. So, we'll calculate effect size, and we also need to do post-hoc tests. Now, the way to calculate effect size in analysis of variance is by calculating eta-squared. That's this guy right here: eta-squared.
And it's the same exact statistic as R squared from multiple regression.
It's just the percentage of variance explained in the outcome variable, or, as I'll now say, the dependent variable. So, it's just the sums of squares associated with our treatment, A, divided by the total sums of squares. In our case, it's really large, and this
is unusual. Again, I just made up the data.
We only had five subjects per condition. I made the between groups variance really
big. So we had a really large eta-squared; it's 0.66. You don't typically see that in real research, but that's our eta-squared in this example.
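Continuing the R check from earlier:

eta_sq <- ss_A / (ss_A + ss_SA)   # 50 / 76, about 0.66: the proportion of
                                  # total variance explained by the treatment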
There are several assumptions we make when we do an analysis of variance.
One, again, we are still dealing with an outcome variable or dependent variable
that's continuous and has a normal distribution, or relatively normal.
And there's a new assumption in here called the homogeneity of variance
assumption. And the idea behind that is, we're
assuming that within each group, there's an equivalent amount of just unsystematic
variance. And that's a really important assumption because of how we calculate mean squares S-within-A, or sums of squares S-within-A. What I've done here is I've just put all
of this together. So, the first row is the first group.
Second row is second, and so on. And then we just sum that all up and we
divide it by the number of groups. So, we're averaging across four groups.
We're making the assumption there that each group is contributing roughly an equal amount to that error variance. If they're not, then that estimate is not
representative of the four groups, right? Think back to descriptive statistics and our ideas about representative measures of central tendency.
So, if that's not the case, then I can't assume that I have homogeneity of variance
and I shouldn't be doing a one-way ANOVA. So, in the lecture where we do analysis of variance in R, I'll show you how to test this assumption by doing what's called Levene's test. And if we don't have homogeneity of variance, if we violate that assumption, then we'll just conduct simple t-tests using what's called a restricted error term, restricted to the groups that are under comparison. We'll see more about that in Lecture fifteen.
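As a rough preview of that upcoming lecture, here is what the workflow might look like in R on the stand-in data from this segment; leveneTest() comes from the car package, and TukeyHSD() is one common post-hoc test:

library(car)                                # provides leveneTest()
study <- data.frame(
  condition = factor(rep(c(8, 12, 17, 19), each = 5)),
  gain      = as.vector(t(gain)))           # unroll the group-by-subject matrix
fit <- aov(gain ~ condition, data = study)  # one-way between-groups ANOVA
summary(fit)                                # F(3, 16) about 10.3, p about 0.0005
leveneTest(gain ~ condition, data = study)  # tests homogeneity of variance
TukeyHSD(fit)                               # pairwise post-hoc comparisons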