Steyer, R., Smelser, N. J., & Jena, D. (2001). Classical (psychometric) test theory. International encyclopedia of the social and behavioral sciences. Logic of inquiry and research design, 1955-1962.
Classical (Psychometric) Test Theory (CTT) aims at studying the reliability of a (real-valued)
test score variable (measurement, test) that maps a crucial aspect of qualitative
or quantitative observations into the set of real numbers. Aside from determining the
reliability of a test score variable itself, CTT allows answering questions such as:
(a) How do two random variables correlate once the measurement error is filtered out
(correction for attenuation)?
(b) How dependable is a measurement in characterizing an attribute of an individual
unit, i.e., what is the confidence interval for the true score of that individual with
respect to the measurement considered?
(c) How reliable is an aggregated measurement consisting of the average (or sum) of
several measurements of the same unit or object (Spearman-Brown formula)?
(d) How reliable is a difference, e.g., between a pretest and posttest?
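Questions (a) and (c) can be illustrated with a minimal sketch. The two formulas used here are the standard textbook forms of the correction for attenuation and of the Spearman-Brown formula; they are not spelled out in this section, and the function names are hypothetical. The Spearman-Brown form assumes that the m aggregated measurements are parallel tests.

```python
import math


def disattenuated_correlation(r_xy, rel_x, rel_y):
    """Correlation of the true scores, given the observed correlation r_xy
    and the reliabilities of the two test score variables
    (textbook correction for attenuation; assumes uncorrelated errors)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


def spearman_brown(rel, m):
    """Reliability of the sum (or average) of m parallel measurements,
    each with reliability rel (textbook Spearman-Brown formula)."""
    if m <= 0:
        raise ValueError("m must be positive")
    return m * rel / (1 + (m - 1) * rel)


# Doubling a test with reliability 0.5 gives reliability 2/3.
doubled = spearman_brown(0.5, 2)

# An observed correlation of 0.4 between tests with reliabilities 0.8 and 0.5
# corresponds to a larger correlation between the true scores.
true_score_corr = disattenuated_correlation(0.4, 0.8, 0.5)
```

Both functions only rearrange variances and covariances; no distributional assumption is involved.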
Decomposition of the Variables
    Yi = τi + εi                                               (1)
Decomposition of the Variances
    Var(Yi) = Var(τi) + Var(εi)                                (2)
Other Properties of True Score and Error Variables implied by their definition
    Cov(τi, εj) = 0                                            (3)
    E(εi) = 0                                                  (4)
    E(εi | U) = 0                                              (5)
    E[εi | f(U)] = 0 for each (measurable) mapping f of U      (6)
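The decomposition of the variables and of the variances can be checked numerically. The following is a minimal simulation sketch, not part of the original text: it draws true scores and errors independently from normal distributions, which is a stronger assumption than CTT requires (in CTT the properties follow from the definition of the true score). All names and parameter values are illustrative.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance (divisor n); cov(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def simulate(n=100_000, sd_true=2.0, sd_error=1.0, seed=0):
    """Draw n units with true score tau, error eps and test score y = tau + eps."""
    rng = random.Random(seed)
    tau = [rng.gauss(10.0, sd_true) for _ in range(n)]
    eps = [rng.gauss(0.0, sd_error) for _ in range(n)]
    y = [t + e for t, e in zip(tau, eps)]
    return tau, eps, y


tau, eps, y = simulate()
var_y = cov(y, y)
var_tau = cov(tau, tau)
var_eps = cov(eps, eps)
cov_tau_eps = cov(tau, eps)  # close to zero, cf. equation (3)
mean_eps = mean(eps)         # close to zero, cf. equation (4)

# In any sample, var_y = var_tau + var_eps + 2 * cov_tau_eps holds exactly;
# equation (2) results because the covariance term vanishes in the population.
```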
A normed parameter of unreliability is Var(εi) / Var(Yi), the proportion of
the variance of Yi due to measurement error. Its counterpart is 1 − Var(εi) / Var(Yi), i.e.,
Rel(Yi) := Var(τi) / Var(Yi), (1)
the reliability of Yi. This coefficient varies between zero and one. In fact, most theorems
and most empirical research deal with this coefficient of reliability. The reliability
coefficient conveniently summarizes the dependability of the measurement in a
single number.
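The claim that the coefficient lies between zero and one follows from the decomposition of the variances, as the following short derivation sketches. It assumes Var(Yi) > 0; the numerical values in the last line are purely illustrative.

```latex
\begin{align*}
\frac{\operatorname{Var}(\varepsilon_i)}{\operatorname{Var}(Y_i)}
  &= \frac{\operatorname{Var}(Y_i) - \operatorname{Var}(\tau_i)}{\operatorname{Var}(Y_i)}
   = 1 - \operatorname{Rel}(Y_i), \\
0 \le \operatorname{Var}(\tau_i) \le \operatorname{Var}(Y_i)
  &\;\Longrightarrow\; 0 \le \operatorname{Rel}(Y_i) \le 1, \\
\operatorname{Var}(\tau_i) = 4,\ \operatorname{Var}(\varepsilon_i) = 1
  &\;\Longrightarrow\; \operatorname{Rel}(Y_i) = \tfrac{4}{4 + 1} = 0.8 .
\end{align*}
```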
In early papers on CTT, the reliability of a test was defined by its correlation with itself
(see, e.g., Thurstone, 1931, p. 3). However, this definition is only metaphoric, because a
variable always correlates perfectly with itself. What is meant is to define reliability by
the correlation of “parallel tests” (see below). The assumptions defining parallel tests in
fact imply that the correlation between two parallel test score variables is their reliability.
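Why the correlation of parallel tests equals the reliability can be sketched in three steps. The derivation uses only the decomposition Yi = τi + εi, the zero covariance between true score and error variables, and the assumptions defining parallel tests (identical true score variables, uncorrelated errors, equal error variances).

```latex
\begin{align*}
\operatorname{Cov}(Y_i, Y_j)
  &= \operatorname{Cov}(\tau_i, \tau_j) + \operatorname{Cov}(\tau_i, \varepsilon_j)
   + \operatorname{Cov}(\varepsilon_i, \tau_j) + \operatorname{Cov}(\varepsilon_i, \varepsilon_j)
   = \operatorname{Var}(\tau_i), \\
\operatorname{Var}(Y_j)
  &= \operatorname{Var}(\tau_j) + \operatorname{Var}(\varepsilon_j)
   = \operatorname{Var}(\tau_i) + \operatorname{Var}(\varepsilon_i)
   = \operatorname{Var}(Y_i), \\
\operatorname{Corr}(Y_i, Y_j)
  &= \frac{\operatorname{Cov}(Y_i, Y_j)}{\sqrt{\operatorname{Var}(Y_i)\,\operatorname{Var}(Y_j)}}
   = \frac{\operatorname{Var}(\tau_i)}{\operatorname{Var}(Y_i)}
   = \operatorname{Rel}(Y_i).
\end{align*}
```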
Assumptions of CTT
The assumptions (a1) to (a3) specify in different ways the assumption that two tests Yi
and Yj measure the same attribute. Such an assumption is crucial for inferring the degree
of reliability from the discrepancy between two measurements of the same attribute of
the same person. Perfect identity or “τ-equivalence” of the two true score variables is
assumed with (a1). With (a2) this assumption is relaxed: the two true score variables may
differ by an additive constant. Two balances, for instance, will follow this assumption if
one of them yields a weight that is always one pound larger than the weight
indicated by the other balance, irrespective of the object to be weighed. According to
Assumption (a3), the two tests measure the same attribute in the sense that their true
score variables are linear functions of each other.
The other two assumptions deal with properties of the measurement errors. With (b) one
assumes measurement errors pertaining to different test score variables to be uncorrelated.
In (c) equal error variances are assumed, i.e., these tests are assumed to
measure equally well.
Assumptions used to define some models of CTT
(a1) τ-equivalence              τi = τj
(a2) essential τ-equivalence    τi = τj + λij,  λij ∈ ℝ
(a3) τ-congenerity              τi = λij0 + λij1 τj,  λij0, λij1 ∈ ℝ,  λij1 > 0
(b)  uncorrelated errors        Cov(εi, εj) = 0,  i ≠ j
(c)  equal error variances      Var(εi) = Var(εj)
Models defined by combining these assumptions
Parallel tests are defined by Assumptions (a1), (b) and (c).
Essentially τ-equivalent tests are defined by Assumptions (a2) and (b).
Congeneric tests are defined by Assumptions (a3) and (b).
Note: The equations refer to each pair of tests Yi and Yj of a set of tests Y1, ..., Ym, their
true score variables, and their error variables, respectively.
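The three models can be made concrete with a small simulation. This is a minimal sketch under stated assumptions, not part of the original text: true scores and errors are drawn independently from normal distributions, and all constants (means, standard deviations, the shift 3.0, the slope 0.5) are hypothetical choices for illustration.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance (divisor n)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def corr(xs, ys):
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))


rng = random.Random(1)
n = 100_000
tau = [rng.gauss(50.0, 2.0) for _ in range(n)]  # Var(tau) = 4

# Parallel tests: same true score, uncorrelated errors, equal error variances.
y1 = [t + rng.gauss(0.0, 1.0) for t in tau]
y2 = [t + rng.gauss(0.0, 1.0) for t in tau]

# Essentially tau-equivalent to y1: true score shifted by a constant;
# the error variance is allowed to differ.
y3 = [t + 3.0 + rng.gauss(0.0, 1.5) for t in tau]

# Congeneric with y1: true score is a linear function with positive slope.
y4 = [5.0 + 0.5 * t + rng.gauss(0.0, 1.0) for t in tau]

r_parallel = corr(y1, y2)  # theoretical value: Rel = 4 / (4 + 1) = 0.8
cov_13 = cov(y1, y3)       # theoretical value: Var(tau) = 4
cov_14 = cov(y1, y4)       # theoretical value: 0.5 * Var(tau) = 2
```

Only for the parallel pair does the observed correlation estimate the reliability directly. For the other two models the covariances still carry information about the true score variance, but the error variances (and, for congeneric tests, the slope) have to be taken into account as well.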