

5 months into learning SPSS / Statistics and need a hopefully quick primer on one statistic from ANOVA.
I am duplicating a report from last year, which worked great for this year's data. SPSS ran great and I got the output I expected, but need help in interpreting last year's report.
ANOVA output from last year is
Fall CALC1 grades
                 N     Mean    Std. Deviation
Class of 2020    125   2.0720  1.01740
Class of 2021    149   2.6644  .96989
Total            274   2.3942  1.03320

Fall CALC1 grades
                 SS       df   Mean Square  F       Sig.
Between Groups   23.857   1    23.857       24.252  .000 ***
Within Groups    267.573  272  .984
Total            291.431  273
*** The footnote on this data says "There is a statistically significant and moderately sized (d = .60) difference in Calculus I grades between 2020 and 2021."
My question is where do you get "d = .60" (because I don't see it anywhere). If I can find that, at least I'll have a reference for the comment, then I'll study ANOVA myself and post questions as needed.
(The statistician who did last year's report is nowhere to be found ...)
It must be Cohen's d effect size. See Wikipedia.
I would bet that it is Cohen's d. The difference between the means is approximately .59. The pooled standard deviation is a little less than 1.00, so the ratio is approximately .6.
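The arithmetic above can be checked from the descriptives table alone. The following is a minimal sketch in Python, assuming the usual pooled-standard-deviation form of Cohen's d; the function name `cohens_d` is only an illustrative helper, not something SPSS provides.

```python
import math

def cohens_d(n1, mean1, sd1, n2, mean2, sd2):
    """Cohen's d for two independent groups, using the pooled SD (assumed formula)."""
    # Pooled variance: weighted average of the two group variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    # Standardized mean difference (group 2 minus group 1)
    return (mean2 - mean1) / math.sqrt(pooled_var)

# Values taken from the descriptives table in the original post
d = cohens_d(125, 2.0720, 1.01740, 149, 2.6644, 0.96989)
print(round(d, 2))  # about 0.6, matching the footnote
```

Note that the pooled variance computed this way is the same quantity as the Within Groups Mean Square (.984) in the ANOVA table, so its square root can be used directly as the pooled SD.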
Bill
William B. Ware, Professor Emeritus
Educational Psychology, Measurement, and Evaluation
Learning Sciences and Psychological Studies
University of North Carolina at Chapel Hill
McMichael Term Professor of Education, 2011-2013
Adjunct Professor, School of Social Work
Academy of Distinguished Teaching Scholars at UNC-Chapel Hill, Charter Member
EMAIL: [hidden email]
He probably calculated it by hand. SPSS produces the partial eta squared as the measure of effect size, if the option is checked when preparing the analysis.
Note that, in general, there are two types of effect size measures:
(1) Percentage of variance accounted for (e.g., eta-square, R-square, etc.; historically, SPSS has provided only this type of effect size measure), and
(2) Difference(s) between means (e.g., d, g, f, etc.)
Because of the critical role effect size measures play in statistical power analysis, one nice reference for effect size measures and power analysis is Jack Cohen's 1992 article "A Power Primer"; see:
Note 2: For a one-way independent-groups ANOVA, Cohen recommends the effect size measure "f". However, your example has only two groups (which could have been analyzed by an independent-groups t-test), so although f can be calculated, d is a simpler and more familiar measure. Cohen's Table 1 provides a listing of effect size measures, their formulas, and guidelines for interpreting the magnitude of an ES.
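For comparison, both kinds of measure can be obtained from the sums of squares in the ANOVA table. This is a rough sketch in Python, assuming the standard definitions eta-squared = SS_between / SS_total and f = sqrt(eta-squared / (1 - eta-squared)); the variable names are illustrative only.

```python
import math

# Sums of squares from the ANOVA table in the original post
ss_between = 23.857
ss_within = 267.573

# Proportion of variance accounted for by class year
eta_sq = ss_between / (ss_between + ss_within)

# Cohen's f derived from eta-squared (equivalently sqrt(SS_between / SS_within))
f = math.sqrt(eta_sq / (1 - eta_sq))

print(round(eta_sq, 3), round(f, 3))
```

With a single between-groups factor, as here, partial eta-squared and eta-squared are the same quantity.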
HTH.
Mike Palij New York University
My bad. When I go to the address I gave I have automatic access to the article (I guess I get it because I come from the nyu.edu domain, but I don't remember access being so seamless). The full reference for the Cohen article is:
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159.
doi:10.1037/0033-2909.112.1.155
A Google search may turn up an unlocked copy (it is a popular article with over 32K citations according to Google Scholar).
Mike Palij New York University
But before you use d (or some other standardized effect size measure) as the
basis of a sample size estimate, see this short note by Russell Lenth:
https://homepage.divms.uiowa.edu/~rlenth/Power/2badHabits.pdf
If you can get access to it, take a look at Thom Baguley's nice article too:
https://onlinelibrary.wiley.com/doi/full/10.1348/000712608X377117
HTH.
The Baguley article is excellent, but it presents the argument as either/or. Of course one needs BOTH.
Of course one MUST have the RAW effects, aka DESCRIPTIVE statistics, together with confidence levels (which are inferential).
Arguments tend to be about how INFERENTIAL statistics are presented.
Statistical effect sizes measure the magnitude of the effect of the predictor relative to everything else [random and confounding factors]. As others have noted, these may take the form of a magnitude divided by the SD of that magnitude, e.g. Cohen's d for a difference between means or other parameter, Hedges' g, etc. Or they may be some kind of proportion of variance accounted for. This does not depend on sample size, which aids interpretation.
The probability under the null hypothesis [pnull] measures the probability that the effect obtained from the sample could have occurred by chance. This depends on sample size, and sometimes there is strong evidence [low pnull] because the sample size is large.
BOTH are important for interpretation. They are different ways of presenting the results of the same inferential test, e.g. an F-test. They are both valuable, and ditching pnull just because effect sizes are a good thing is in my view insane. Small but reliable effect sizes may be very important in large populations. Large but unreliable effect sizes are only useful in suggesting replications, not in drawing inferences.
See
for conversion between inferential measures.
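As one illustration of such a conversion for the two-group case in this thread, d can be recovered from the F-value and the group sizes. This is only a sketch in Python, assuming the textbook identities t = sqrt(F) for a two-group one-way ANOVA and d = t * sqrt(1/n1 + 1/n2); the helper name `d_from_f` is hypothetical.

```python
import math

def d_from_f(f_value, n1, n2):
    """Approximate |d| from a two-group F-value (assumes exactly 1 numerator df)."""
    t = math.sqrt(f_value)  # F = t squared when there are only two groups
    # F carries no sign, so this returns the magnitude of d only
    return t * math.sqrt(1 / n1 + 1 / n2)

# F and group sizes from the ANOVA output in the original post
d_est = d_from_f(24.252, 125, 149)
print(round(d_est, 2))  # about 0.6
```

The direction of the difference has to be read from the group means, since squaring t discards it.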
Bayes Factor [BF] and associated credibility intervals are also valuable [to frequentists as well as Bayesians] because they provide evidence for the null as well as the alternative hypothesis. Exact values of BF require specialist software. However, estimates can be obtained from F-values; see
Wetzels, R., Matzke, D., Lee, M. D., Rouder, J. N., Iverson, G. J., & Wagenmakers, E.-J. (2011). Statistical Evidence in Experimental Psychology.
Perspectives on Psychological Science, 6(3), 291-298. doi:10.1177/1745691611406923
Best
Diana
