Coefficient of Correlation

• Calculate the value of correlation coefficient and interpret it.

Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations.
$r = \dfrac{\Sigma X Y}{n\sigma_x \sigma_y}$ or, $r = \dfrac{\Sigma X Y}{\sqrt{\Sigma X^2 \Sigma Y^2}}$ where,
X = deviation of x from its mean = $x - \bar{x}$
Y = deviation of y from its mean = $y - \bar{y}$
$\sigma_x$, $\sigma_y$ = standard deviations of the x and y series
n = number of paired observations (x, y)
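The two forms of the formula can be checked against each other numerically. The following is a minimal sketch, not a reference implementation: the function name `pearson_r` and the sample data are illustrative assumptions, and the standard deviations are taken as population values (dividing by n), which is what the first form of the formula requires.

```python
import math

def pearson_r(xs, ys):
    """Return Pearson's r computed with both forms of the formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # X = x - x_bar
    dy = [y - mean_y for y in ys]  # Y = y - y_bar
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when either series is constant")
    # Population standard deviations (divide by n, not n - 1)
    sigma_x = math.sqrt(sum_xx / n)
    sigma_y = math.sqrt(sum_yy / n)
    r_cov_form = sum_xy / (n * sigma_x * sigma_y)     # first form
    r_sum_form = sum_xy / math.sqrt(sum_xx * sum_yy)  # second form
    return r_cov_form, r_sum_form

# Illustrative data (an assumption, not taken from the text)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r1, r2 = pearson_r(xs, ys)
print(r1, r2)  # both forms agree up to floating-point rounding
```

The second form is usually more convenient by hand because it avoids computing the standard deviations separately.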

The correlation coefficient ranges from -1 to 1.

$\large r^{2}= b_{xy}* b_{yx}$
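
One way to see why this identity holds, assuming the usual least-squares definitions of the two regression coefficients ($b_{yx}$ for the regression of y on x, $b_{xy}$ for the regression of x on y) and with X and Y denoting deviations from the respective means:

$b_{yx} = \dfrac{\Sigma X Y}{\Sigma X^2}$, $b_{xy} = \dfrac{\Sigma X Y}{\Sigma Y^2}$

$b_{xy} * b_{yx} = \dfrac{(\Sigma X Y)^2}{\Sigma X^2 \Sigma Y^2} = \left(\dfrac{\Sigma X Y}{\sqrt{\Sigma X^2 \Sigma Y^2}}\right)^2 = r^2$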

• If the points are perfectly aligned in a straight line, then r = 1 (for positive slope)

• If the points are perfectly aligned in a straight line, then r = -1 (for negative slope)

• For random points which are scattered without any pattern, r is close to 0

where r = Karl Pearson's coefficient of correlation
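
The limiting values can be illustrated with a few small data sets. This is a rough sketch under stated assumptions: the helper `pearson_r` and all three data sets are made up for illustration. The third set is chosen so that the products of deviations cancel exactly; it follows a curve rather than a line, which shows that r reflects only the linear part of a relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (hypothetical helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

xs = [0, 1, 2, 3, 4]
uphill = [1 + 2 * x for x in xs]     # exact line with positive slope
downhill = [10 - 3 * x for x in xs]  # exact line with negative slope
symmetric = [4, 1, 0, 1, 4]          # (x - 2)^2: no linear trend at all

r_up = pearson_r(xs, uphill)
r_down = pearson_r(xs, downhill)
r_sym = pearson_r(xs, symmetric)
print(r_up, r_down, r_sym)
```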

Solved Example:

10-2-01

In a simple linear regression problem, r and b,

Solution:
In a simple linear regression problem, the equation of the best-fit straight line is: $y = a + bx$
Here, b represents the slope of the straight line, which can be positive or negative. If the slope is positive, the line runs 'uphill' and the coefficient of correlation is also positive. The same holds if b is negative: in that case r is negative as well. Hence, r and b must have the same sign.
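
The same conclusion can be reached algebraically. As a sketch, assuming b is the least-squares slope and writing X, Y for the deviations of x and y from their means (so that $\Sigma X^2 = n\sigma_x^2$):

$b = \dfrac{\Sigma X Y}{\Sigma X^2} = \dfrac{\Sigma X Y}{n\sigma_x^2} = \dfrac{\Sigma X Y}{n\sigma_x \sigma_y} * \dfrac{\sigma_y}{\sigma_x} = r\dfrac{\sigma_y}{\sigma_x}$

Since $\sigma_x$ and $\sigma_y$ are both positive, b and r always carry the same sign.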

Solved Example:

10-2-02

If the correlation coefficient r = 1, then

Solved Example:

10-2-03

The strength of the linear relationship between two numerical variables may be measured by the:

Solution:
The coefficient of correlation gives the strength of the linear relationship between two numerical variables.

Solved Example:

10-2-04

A regression model is used to express a variable Y as a function of another variable X. This implies that:

Solution:
If the scatter diagram indicates some relationship between two variables X and Y, then the dots of the scatter diagram will be concentrated round a curve. This curve is called the curve of regression. Regression analysis is used for estimating the unknown values of one variable corresponding to the known value of another variable.

Solved Example:

10-2-05

Calculate Pearson's coefficient for the following data:

Solution:

$\bar{x} = 4$, $\bar{y} = 6$

$\Sigma (x-\bar{x})(y-\bar{y}) = 19$,
$\Sigma (x-\bar{x})^2 = 14$,
$\Sigma (y-\bar{y})^2 = 26$

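Let $\hat{X} = (x-\bar{x})^2$ and $\hat{Y} = (y-\bar{y})^2$, so that $\Sigma \hat{X} = 14$ and $\Sigma \hat{Y} = 26$.

$r = \dfrac{\Sigma (x-\bar{x})(y-\bar{y})}{\sqrt{\Sigma \hat{X}} \sqrt{\Sigma \hat{Y}}} = \dfrac{19}{\sqrt{14} \sqrt{26}} \approx 0.9959$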
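
The final arithmetic can be checked quickly. This short sketch uses only the three sums stated in the solution; the variable names are illustrative, and the underlying data table is not reproduced here.

```python
import math

# Sums taken from the solution above
sum_xy = 19  # sum of products of deviations
sum_xx = 14  # sum of squared deviations of x
sum_yy = 26  # sum of squared deviations of y

r = sum_xy / math.sqrt(sum_xx * sum_yy)
print(round(r, 4))  # 0.9959
```

A value this close to 1 indicates a very strong positive linear relationship between x and y.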