#### Steyer, R., Smelser, N. J., & Jena, D. (2001). Classical (psychometric) test theory. International encyclopedia of the social and behavioral sciences. Logic of inquiry and research design, 1955-1962

CTT

Classical (Psychometric) Test Theory (CTT) aims at studying the reliability of a (real-valued) test score variable (measurement, test) that maps a crucial aspect of qualitative or quantitative observations into the set of real numbers. Aside from determining the reliability of a test score variable itself, CTT allows answering questions such as:

(a) How do two random variables correlate once the measurement error is filtered out (correction for attenuation)?

(b) How dependable is a measurement in characterizing an attribute of an individual unit, i.e., what is the confidence interval for the true score of that individual with respect to the measurement considered?

(c) How reliable is an aggregated measurement consisting of the average (or sum) of several measurements of the same unit or object (Spearman-Brown formula for test length)?

(d) How reliable is a difference, e.g., between a pretest and a posttest?
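The four questions above correspond to standard textbook formulas that this excerpt names but does not spell out. The following Python sketch states them under the usual CTT assumptions (uncorrelated errors; parallel tests for the Spearman-Brown formula). The function names are illustrative and do not come from the source.

```python
import math


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """(a) Estimated correlation of the true scores, given the observed
    correlation r_xy and the reliabilities of both measures."""
    return r_xy / math.sqrt(rel_x * rel_y)


def standard_error_of_measurement(sd_y, rel):
    """(b) Standard deviation of the error variable; a common building
    block for a confidence interval around an individual's score."""
    return sd_y * math.sqrt(1.0 - rel)


def spearman_brown(rel, k):
    """(c) Reliability of the sum (or mean) of k parallel tests,
    each with reliability rel."""
    return k * rel / (1.0 + (k - 1.0) * rel)


def difference_reliability(var_1, var_2, rel_1, rel_2, cov_12):
    """(d) Reliability of the difference Y1 - Y2, assuming the two
    error variables are uncorrelated."""
    true_var = rel_1 * var_1 + rel_2 * var_2 - 2.0 * cov_12
    observed_var = var_1 + var_2 - 2.0 * cov_12
    return true_var / observed_var
```

For instance, doubling the length of a test with reliability 0.5 gives `spearman_brown(0.5, 2)`, i.e. 2/3. Note that `difference_reliability` shrinks as the covariance between pretest and posttest grows, which is why difference scores are often less reliable than their components.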

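| Property | Equation | No. |
|---|---|---|
| Decomposition of the variables | $Y_i = \tau_i + \varepsilon_i$ | (1) |
| Decomposition of the variances | $\mathrm{Var}(Y_i) = \mathrm{Var}(\tau_i) + \mathrm{Var}(\varepsilon_i)$ | (2) |
| Other properties of true score and error variables implied by their definition | $\mathrm{Cov}(\tau_i, \varepsilon_j) = 0$ | (3) |
| | $E(\varepsilon_i) = 0$ | (4) |
| | $E(\varepsilon_i \mid U) = 0$ | (5) |
| | for each (measurable) mapping $f$ of $U$: $E[\varepsilon_i \mid f(U)] = 0$ | (6) |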
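The variance decomposition (2) is not an extra assumption: it follows from the decomposition of the variables (1) together with the zero covariance of true scores and errors (3). A short derivation, using only the properties listed above, might read:

```latex
\begin{aligned}
\mathrm{Var}(Y_i)
  &= \mathrm{Var}(\tau_i + \varepsilon_i)
     && \text{by (1)} \\
  &= \mathrm{Var}(\tau_i) + \mathrm{Var}(\varepsilon_i)
     + 2\,\mathrm{Cov}(\tau_i, \varepsilon_i) \\
  &= \mathrm{Var}(\tau_i) + \mathrm{Var}(\varepsilon_i)
     && \text{by (3) with } j = i .
\end{aligned}
```

Similarly, (4) follows from (5) by the law of iterated expectations, since $E(\varepsilon_i) = E[E(\varepsilon_i \mid U)] = 0$.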

A normed parameter of unreliability is $\mathrm{Var}(\varepsilon_i) / \mathrm{Var}(Y_i)$, the proportion of the variance of $Y_i$ due to measurement error. Its counterpart is $1 - \mathrm{Var}(\varepsilon_i) / \mathrm{Var}(Y_i)$, i.e.,

$$\mathrm{Rel}(Y_i) := \frac{\mathrm{Var}(\tau_i)}{\mathrm{Var}(Y_i)}, \qquad (1)$$

the reliability of $Y_i$. This coefficient varies between zero and one. In fact, most theorems and most empirical research deal with this coefficient of reliability. The reliability coefficient conveniently summarizes the dependability of the measurement in a single number.
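To make the definition concrete, the following Python sketch simulates true scores and errors and computes the ratio of true-score variance to observed variance. The normal distributions, the parameter values, and the function names are assumptions chosen for illustration. In real data the true scores are unobserved, so reliability has to be estimated indirectly.

```python
import random
import statistics


def simulate_scores(n, true_mean, true_sd, error_sd, seed=0):
    """Draw n true scores and independent zero-mean errors (both normal,
    an assumption of this sketch) and return (true_scores, observed)."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    errors = [rng.gauss(0.0, error_sd) for _ in range(n)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, observed


def reliability(true_scores, observed):
    """Rel(Y) = Var(tau) / Var(Y), computed from simulated data."""
    return statistics.pvariance(true_scores) / statistics.pvariance(observed)


true_scores, observed = simulate_scores(
    n=20000, true_mean=100.0, true_sd=15.0, error_sd=5.0
)
rel_hat = reliability(true_scores, observed)

# Theoretical value: 15^2 / (15^2 + 5^2) = 0.9
rel_theory = 15.0 ** 2 / (15.0 ** 2 + 5.0 ** 2)
print(f"simulated: {rel_hat:.3f}, theoretical: {rel_theory:.3f}")
```

With these settings the simulated ratio should land close to the theoretical 0.9, differing only by sampling noise.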

In early papers on CTT, the reliability of a test was defined by its correlation with itself (see, e.g., Thurstone, 1931, p. 3). However, this definition is only metaphoric, because a variable always correlates perfectly with itself. What is meant is to define reliability by the correlation of “parallel tests” (see below). The assumptions defining parallel tests in fact imply that the correlation between two parallel test score variables is the reliability.
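The last claim can be checked in a few lines. The sketch below assumes two parallel tests in the sense defined further below (identical true score variables $\tau_i = \tau_j = \tau$, uncorrelated errors, equal error variances) and uses the fact that true scores and errors are uncorrelated:

```latex
\begin{aligned}
\mathrm{Cov}(Y_i, Y_j)
  &= \mathrm{Cov}(\tau + \varepsilon_i,\; \tau + \varepsilon_j) \\
  &= \mathrm{Var}(\tau) + \mathrm{Cov}(\tau, \varepsilon_j)
     + \mathrm{Cov}(\varepsilon_i, \tau)
     + \mathrm{Cov}(\varepsilon_i, \varepsilon_j)
   = \mathrm{Var}(\tau), \\[4pt]
\mathrm{Var}(Y_i)
  &= \mathrm{Var}(\tau) + \mathrm{Var}(\varepsilon_i)
   = \mathrm{Var}(\tau) + \mathrm{Var}(\varepsilon_j)
   = \mathrm{Var}(Y_j), \\[4pt]
\mathrm{Corr}(Y_i, Y_j)
  &= \frac{\mathrm{Cov}(Y_i, Y_j)}
          {\sqrt{\mathrm{Var}(Y_i)\,\mathrm{Var}(Y_j)}}
   = \frac{\mathrm{Var}(\tau)}{\mathrm{Var}(Y_i)}
   = \mathrm{Rel}(Y_i).
\end{aligned}
```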


Assumptions of CTT

Assumptions (a1) to (a3) specify in different ways the assumption that two tests $Y_i$ and $Y_j$ measure the same attribute. Such an assumption is crucial for inferring the degree of reliability from the discrepancy between two measurements of the same attribute of the same person. Perfect identity or “τ-equivalence” of the two true score variables is assumed with (a1). With (a2) this assumption is relaxed: the two true score variables may differ by an additive constant. Two balances, for instance, satisfy this assumption if one of them always indicates a weight that is one pound larger than the weight indicated by the other, irrespective of the object to be weighed. According to Assumption (a3), the two tests measure the same attribute in the sense that their true score variables are linear functions of each other.

The other two assumptions deal with properties of the measurement errors. With (b) one assumes measurement errors pertaining to different test score variables to be uncorrelated. In (c) equal error variances are assumed, i.e., these tests are assumed to measure equally well.

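Assumptions used to define some models of CTT

| Label | Assumption | Formal statement |
|---|---|---|
| (a1) | τ-equivalence | $\tau_i = \tau_j$ |
| (a2) | essential τ-equivalence | $\tau_i = \tau_j + \lambda_{ij}$, $\lambda_{ij} \in \mathbb{R}$ |
| (a3) | τ-congenerity | $\tau_i = \lambda_{ij0} + \lambda_{ij1}\,\tau_j$, $\lambda_{ij0}, \lambda_{ij1} \in \mathbb{R}$, $\lambda_{ij1} > 0$ |
| (b) | uncorrelated errors | $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$, $i \neq j$ |
| (c) | equal error variances | $\mathrm{Var}(\varepsilon_i) = \mathrm{Var}(\varepsilon_j)$ |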

Models defined by combining these assumptions

Parallel tests are defined by Assumptions (a1), (b) and (c).

Essentially τ-equivalent tests are defined by Assumptions (a2) and (b).

Congeneric tests are defined by Assumptions (a3) and (b).

Note: The equations refer to each pair of tests $Y_i$ and $Y_j$ of a set of tests $Y_1, \ldots, Y_m$, their true score variables, and their error variables, respectively.
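The practical difference between these models can be illustrated with a small simulation. The Python sketch below generates one pair of parallel tests and one pair of essentially τ-equivalent tests with unequal error variances, then compares their correlations. The normal distributions, parameter values, and function names are assumptions made for this illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_pair(n, true_sd, error_sd_i, error_sd_j, shift=0.0, seed=1):
    """Two tests sharing one true score variable up to an additive
    constant `shift`, with independent (hence uncorrelated) errors."""
    rng = random.Random(seed)
    tau = [rng.gauss(50.0, true_sd) for _ in range(n)]
    y_i = [t + rng.gauss(0.0, error_sd_i) for t in tau]
    y_j = [t + shift + rng.gauss(0.0, error_sd_j) for t in tau]
    return y_i, y_j


# Parallel tests: (a1), (b), (c). Both reliabilities are 100 / 125 = 0.8.
y_i, y_j = simulate_pair(20000, true_sd=10.0, error_sd_i=5.0, error_sd_j=5.0)
r_parallel = pearson(y_i, y_j)

# Essentially tau-equivalent tests: (a2), (b), unequal error variances.
# Reliabilities are 100 / 125 = 0.8 and 100 / 200 = 0.5.
y_i, y_j = simulate_pair(
    20000, true_sd=10.0, error_sd_i=5.0, error_sd_j=10.0, shift=3.0
)
r_essential = pearson(y_i, y_j)

print(f"parallel: {r_parallel:.3f}, essentially tau-equivalent: {r_essential:.3f}")
```

For the parallel pair the correlation should be close to the common reliability of 0.8. For the second pair it should be close to the geometric mean of the two reliabilities, sqrt(0.8 × 0.5) ≈ 0.63, so the correlation no longer equals the reliability of either test once assumption (c) is dropped.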