**Scatter plots** are useful when we want to determine whether there is a relationship between two quantitative variables. Paired observations of two quantitative variables are referred to as **bivariate data**. This is usually written in the form $(x_i, y_i)$ where $i = 1, 2, 3, \dots, n$. Here x is our independent variable and y is our dependent variable.

Our bivariate data can be represented **graphically** with a scatter plot. It can also be represented **numerically** using the following numerical summaries:

- The **mean** of the **x** data set and the mean of the **y** data set give us the *centre* of the data.
- The **horizontal spread** is given by the **standard deviation** of the x data set, where most data will fall within 2 SD either side of the mean.
- The *vertical spread* is likewise given by the **standard deviation** of the y data set, where most data will fall within 2 SD either side of the mean.
- The *clustering* about the line provides us with the final piece of the puzzle: this is measured by the **correlation coefficient, r**, which tells us how closely the data follow a straight line.
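As a rough illustration, the five summaries above can be computed with a few lines of Python. This is only a sketch: the function name `five_summaries` and the sample data are made up for the example, and it assumes the population form of the standard deviation (dividing by n), which the text does not specify.

```python
from math import sqrt

def five_summaries(xs, ys):
    """Return (mean_x, mean_y, sd_x, sd_y, r) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population SD (divide by n); an assumption, not stated in the text.
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # r is the average product of the x and y deviations, each scaled by its SD.
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n * sd_x * sd_y)
    return mean_x, mean_y, sd_x, sd_y, r

# Hypothetical sample data, chosen only to keep the arithmetic small.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(five_summaries(xs, ys))
```

For this sample the centre is (3, 4) and r comes out at roughly 0.77, a fairly strong positive trend.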

The correlation coefficient ranges from -1 to 1. A positive value means that as one variable increases, the other tends to increase, whereas a negative value means that as one variable increases, the other tends to decrease. The closer the value is to 1 or -1, the stronger the linear trend, with zero meaning there is no noticeable linear trend.

To model our data we can use the SD line; however, it does not use the r value and hence has its limitations. To find the SD line, plot the point of averages (the mean of x, the mean of y) and the point (the mean of x + the standard deviation of x, the mean of y + the standard deviation of y), then draw the straight line through them. Because this model does not account for the clustering about the line, for a positive trend it tends to overestimate y on the RHS and underestimate y on the LHS.

$(\bar{x}, \bar{y})$ to $(\bar{x} + SD_x, \bar{y} + SD_y)$
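The two points that define the SD line translate directly into a slope and intercept. The sketch below assumes a positive trend and the population standard deviation (`statistics.pstdev`); the name `sd_line` and the sample data are illustrative only.

```python
from statistics import mean, pstdev

def sd_line(xs, ys):
    """Slope and intercept of the line through the point of averages
    and the point one SD to the right and one SD up."""
    slope = pstdev(ys) / pstdev(xs)          # rise of SD_y over a run of SD_x
    intercept = mean(ys) - slope * mean(xs)  # forces the line through (mean_x, mean_y)
    return slope, intercept

# Hypothetical sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = sd_line(xs, ys)
print(slope, intercept)
```

Note that r never appears in this calculation, which is exactly the limitation described above.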

A better way of modelling the data is to use the regression line, which accounts for the r value:

$(\bar{x}, \bar{y})$ to $(\bar{x} + SD_x, \bar{y} + r \, SD_y)$
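Reading the slope off these two points gives a rise of r × SD_y for a run of SD_x. A minimal sketch, again assuming the population standard deviation and using made-up names and data, compares this with the SD line slope:

```python
from statistics import mean, pstdev

def regression_line(xs, ys):
    """Slope and intercept of the regression line for predicting y from x."""
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    # Correlation coefficient: average product of deviations, scaled by both SDs.
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) * sd_x * sd_y)
    slope = r * sd_y / sd_x                # rise of r * SD_y over a run of SD_x
    intercept = mean_y - slope * mean_x    # line passes through the point of averages
    return slope, intercept

# Hypothetical sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = regression_line(xs, ys)
sd_slope = pstdev(ys) / pstdev(xs)
print(m, b, sd_slope)
```

For this sample the regression slope is about 0.6 while the SD line slope is about 0.77, so the regression line is the flatter of the two. This holds whenever r lies strictly between 0 and 1.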

This straight line has the form y = mx + b. However, it should only be used to predict y from a given x value; it cannot simply be used in reverse to predict x from y.
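One way to see why the line cannot be reversed is to compare two answers for the same question: rearranging y = mx + b to solve for x, versus fitting a separate regression line that predicts x from y. The sketch below uses hypothetical data and helper names and assumes the population standard deviation.

```python
from statistics import mean, pstdev

def regression_line(us, vs):
    """Slope and intercept of the regression line for predicting vs from us."""
    mean_u, mean_v = mean(us), mean(vs)
    sd_u, sd_v = pstdev(us), pstdev(vs)
    r = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs)) / (len(us) * sd_u * sd_v)
    slope = r * sd_v / sd_u
    return slope, mean_v - slope * mean_u

# Hypothetical sample data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = regression_line(xs, ys)          # line for predicting y from x
m_rev, b_rev = regression_line(ys, xs)  # separate line for predicting x from y

y_target = 5
x_by_inverting = (y_target - b) / m         # rearranging y = mx + b
x_by_regression = m_rev * y_target + b_rev  # using the x-from-y line
print(x_by_inverting, x_by_regression)
```

With this sample, rearranging gives x of roughly 4.67, while the x-from-y regression line gives 4.0. The two only agree when r is exactly 1 or -1.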
