In science, law, and life one often wants to know the relationship
between one thing and another.
In the 2 x 2 tables discussed previously, we were interested in, for
example, the relation between trial mode (judge or jury) and case outcome
(win or lose). The measure of relation for such nominal data is necessarily
limited by their categorical nature.
For quantitative data, measures of relation seem even more natural.
Does the number of hours that one studies help explain the grades obtained?
The term correlation is often used to denote some
form of association. Perhaps we think there is a correlation between
number of hours studied and grade point average (GPA), or between SAT
scores (or high-school GPA) and college grades.
Correlation in statistics often refers to a measure of linear association
between two quantitative variables. Under a linear relation, a unit
increase in one variable is associated, on average, with a fixed increase
or decrease in the other variable. For
example, 50 hours of additional study in a semester may, on average,
be associated with a GPA that is 0.1 higher; or
a 50-point higher score on the SAT may correlate with a GPA in college
that is 0.1 higher.
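As a rough sketch of how such a measure of linear association can be computed, the Python snippet below calculates the Pearson correlation coefficient, one common choice of measure. The text above does not single out a specific coefficient, so that choice is an assumption, and the study hours and GPA figures are invented purely for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: semester study hours and GPA for five students.
hours = [100, 150, 200, 250, 300]
gpa = [2.8, 3.0, 3.0, 3.3, 3.4]

r = pearson_r(hours, gpa)
print(round(r, 3))
```

A value near +1 or -1 indicates that the points lie close to a straight line; a value near 0 indicates little linear association, although a nonlinear relation may still be present.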
Accordingly, a widely used technique is simple linear regression,
which estimates the best straight line to summarize the relation between
two quantitative variables. See a simple illustration.
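To make the idea of a "best straight line" concrete, here is a minimal Python sketch that fits a line by ordinary least squares, the usual criterion for simple linear regression. It assumes that "best" means minimizing the sum of squared vertical distances from the points to the line. The function name fit_line and the data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: semester study hours and GPA for five students.
hours = [100, 150, 200, 250, 300]
gpa = [2.8, 3.0, 3.0, 3.3, 3.4]

slope, intercept = fit_line(hours, gpa)
# Estimated average change in GPA associated with 50 additional hours of study.
print(round(slope * 50, 3))
```

With these invented numbers the slope comes out near 0.003 GPA points per hour of study, or about 0.15 points per 50 additional hours. The slope describes an average association in the data and does not by itself show that additional study causes higher grades.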