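Rejecting the null hypothesis when it is true (a Type I error) is an incorrect decision.
Conversely, rejecting the null hypothesis when it is false (that is, supporting the research hypothesis when it is correct) is a correct decision.
The probability of rejecting the null hypothesis when it is false is called statistical power (or simply power).
Statistical power
When power is high, the probability of failing to reject a false null hypothesis (a Type II error) is small.
Power and the probability of a Type II error therefore move in opposite directions. Because the probability of a Type II error is denoted β, power equals 1 - β.
Power is determined by α, the sample size, and the effect size.
Specifically, power increases as α, the sample size, or the effect size increases.
This is because a larger α makes the null hypothesis easier to reject,
and a larger sample size or a larger effect size tends to produce a larger test statistic (in absolute value).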
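The relationship between power and α, sample size, and effect size can be sketched numerically. The example below is a minimal sketch, not part of the original notes: it assumes a two-sided two-sample z-test with equal group sizes and unit variance, and the function name `power_two_sample_z` and its parameters are hypothetical.

```python
from statistics import NormalDist


def power_two_sample_z(d, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided two-sample z-test.

    Assumptions (illustrative only): equal group sizes, known unit
    variance, and a true standardized mean difference of d.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    # Expected value of the test statistic under the alternative.
    shift = d * (n_per_group / 2) ** 0.5
    # Probability that the statistic lands in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


if __name__ == "__main__":
    base = power_two_sample_z(d=0.5, n_per_group=64, alpha=0.05)
    print("baseline power:", round(base, 3))
    print("larger alpha  :", round(power_two_sample_z(0.5, 64, alpha=0.10), 3))
    print("larger sample :", round(power_two_sample_z(0.5, 128, alpha=0.05), 3))
    print("larger effect :", round(power_two_sample_z(0.8, 64, alpha=0.05), 3))
```

Under these assumptions, each of the three changes raises power above the baseline, and with an effect size of zero the rejection probability reduces to α itself.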
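Effect size
Effect size is the degree to which the phenomenon being studied exists in the population.
For example, in a test of the difference between two group means, the effect size is a standardized measure of the between-group difference (Cohen's d): the difference between the group means divided by the standard deviation.
For ANOVA, η², and for regression analysis, R², serve as the effect size.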
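As an illustration of the ANOVA case, η² for a one-way design is the between-group sum of squares divided by the total sum of squares. The sketch below assumes a simple one-way layout; the function name `eta_squared` and the sample data are made up for the example.

```python
def eta_squared(groups):
    """Eta squared for a one-way design: SS_between / SS_total.

    groups: list of lists, one list of observations per group.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Total variability of all observations around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Variability of the group means around the grand mean,
    # weighted by group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


if __name__ == "__main__":
    # Hypothetical data: two groups of three observations each.
    print(eta_squared([[1, 2, 3], [4, 5, 6]]))
```

A value near 1 means most of the variance lies between the groups; a value of 0 means the group means are identical.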
In statistics, an effect size is a measure of the strength of a phenomenon[1] (for example, the relationship between two variables in a statistical population) or a sample-based estimate of that quantity.
An effect size calculated from data is a descriptive statistic that conveys the estimated magnitude of a relationship without making any statement about whether the apparent relationship in the data reflects a true relationship in the population.
In that way, effect sizes complement inferential statistics such as p-values. Among other uses, effect size measures play an important role in meta-analysis studies that summarize findings from a specific area of research, and in statistical power analyses.
The concept of effect size already appears in everyday language.
For example, a weight loss program may boast that it leads to an average weight loss of 30 pounds. In this case, 30 pounds is an indicator of the claimed effect size.
Another example is that a tutoring program may claim that it raises school performance by one letter grade. This grade increase is the claimed effect size of the program.
These are both examples of "absolute effect sizes", meaning that they convey the average difference between two groups without any discussion of the variability within the groups. For example, if the weight loss program results in an average loss of 30 pounds, it is possible that every participant loses exactly 30 pounds, or half the participants lose 60 pounds and half lose no weight at all.
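The two weight-loss scenarios in the passage can be checked with a few lines of code. This is only an illustrative sketch: the group of ten participants is an assumption, and the numbers come from the example in the text, not from real data.

```python
from statistics import mean, pstdev

# Scenario A: every participant loses exactly 30 pounds.
uniform_loss = [30] * 10
# Scenario B: half lose 60 pounds, half lose nothing.
split_loss = [60] * 5 + [0] * 5

for label, losses in [("uniform", uniform_loss), ("split", split_loss)]:
    # Same average loss, very different spread within the group.
    print(label, "mean =", mean(losses), "sd =", pstdev(losses))
```

Both scenarios report the same absolute effect size (a mean loss of 30 pounds), but the spread differs completely, which is the information an absolute measure leaves out.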
Reporting effect sizes is considered good practice when presenting empirical research findings in many fields.[2][3] The reporting of effect sizes facilitates the interpretation of the substantive, as opposed to the statistical, significance of a research result.[4] Effect sizes are particularly prominent in social and medical research. Relative and absolute measures of effect size convey different information, and can be used complementarily. A prominent task force in the psychology research community expressed the following recommendation:
Always present effect sizes for primary outcomes...If the units of measurement are meaningful on a practical level (e.g., number of cigarettes smoked per day), then we usually prefer an unstandardized measure (regression coefficient or mean difference) to a standardized measure (r or d).
— L. Wilkinson and APA Task Force on Statistical Inference (1999, p. 599)
(Source: http://en.wikipedia.org/wiki/Effect_size)
Cohen's d is an effect size used to indicate the standardised difference between two means. It can be used, for example, to accompany reporting of t-test and ANOVA results. It is also widely used in meta-analysis.
Cohen's d is an appropriate effect size for the comparison between two means. APA style strongly recommends reporting effect sizes. Partial eta-squared indicates how much variance in a dependent variable (DV) is explained by an independent variable (IV), but an IV may have multiple levels, so partial eta-squared does not convey the size of each pairwise mean difference.
Cohen's d can be calculated as the difference between the means divided by the pooled SD: d = (M1 - M2) / SD_pooled.
Cohen's d and similar measures are not available in PASW (SPSS), so use a separate calculator, such as those listed in the external links of the source page.
In an ANOVA, you need to be clear about which two means you want to compare. You will most likely be interested in several d's, e.g., to compare marginal totals (for main effects) or cells (for interactions). In general, it is recommended to report all relevant Cohen's d values unless you have a particular reason to focus on only one or some of the possible values. From a descriptive statistics table, calculating Cohen's d is relatively straightforward.
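The calculation from a descriptive statistics table can be sketched as follows. This is a minimal example, not the source's own procedure: the function name `cohens_d` is hypothetical, and the pooled SD is assumed to be the common version weighted by degrees of freedom (n - 1 per group).

```python
from math import sqrt


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d from each group's mean, standard deviation, and size.

    Assumes the pooled SD weights each group's variance by its
    degrees of freedom (n - 1).
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical descriptive statistics for two groups.
    print(cohens_d(mean1=105, sd1=10, n1=30, mean2=100, sd2=10, n2=30))
```

With these made-up values, the means differ by 5 and the pooled SD is 10, so d is one half of a standard deviation. For an ANOVA with several groups, the same function can be applied to each pair of means of interest.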
Calculating Cohen's d provides useful information for discussion (e.g., allows ready comparison with meta-analyses and the size of effects reported in other studies). Where you are reporting about differences between two means, then a standardised mean effect size (such as d) would be an appropriate accompaniment to inferential testing.
(Source: http://en.wikiversity.org/wiki/Cohen's_d)