
The Correlation Coefficient: Definition
Bruce Ratner, Ph.D.
The correlation coefficient, denoted by r, is a measure of the strength of the straight-line, or linear, relationship between two variables. The correlation coefficient takes on values ranging between +1 and -1. The following points are the accepted guidelines for interpreting the correlation coefficient:
The calculation of the correlation coefficient for two variables, say X and Y, is simple to understand. Let zX and zY be the standardized versions of X and Y, respectively. That is, X and Y are each re-expressed so that zX and zY have means equal to zero and standard deviations (std) equal to one. The re-expressions used to obtain the standardized scores are in equations (3.1) and (3.2):
zXi = [Xi - mean(X)]/std(X)    (3.1)
zYi = [Yi - mean(Y)]/std(Y)    (3.2)
The correlation coefficient is defined as the mean product of the paired standardized scores (zXi, zYi), as expressed in equation (3.3).
rX,Y = sum of [zXi * zYi]/(n - 1), where n is the sample size    (3.3)
For a simple illustration of the calculation, consider the sample of five observations in Table 1. Columns zX and zY contain the standardized scores of X and Y, respectively. The last column is the product of the paired standardized scores. The sum of these products is 1.83. The mean of these products (using the adjusted divisor n - 1, not n) is 0.46. Thus, rX,Y = 0.46.
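The calculation in equations (3.1) through (3.3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names standardize and correlation are hypothetical, the five data points are made up for the example and are not the values from Table 1, and std is taken to be the sample standard deviation with divisor n - 1, which is what makes the n - 1 divisor in (3.3) produce the usual correlation coefficient.

```python
import math


def standardize(values):
    """Return standardized scores per (3.1)/(3.2): (value - mean) / std.

    Assumption: std is the sample standard deviation (divisor n - 1).
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if std == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(v - mean) / std for v in values]


def correlation(x, y):
    """Return r per (3.3): sum of paired z-score products divided by n - 1."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    zx = standardize(x)
    zy = standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical sample of five observations (NOT the data in Table 1).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

print(round(correlation(x, y), 2))  # prints 0.8
```

For these made-up values, the deviations from the means have a product sum of 8 and each variable has a sum of squared deviations of 10, so r = 8/10 = 0.8. A perfectly straight-line increasing relationship would give +1, and a perfectly straight-line decreasing one would give -1.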