**Scatter plots** are useful when we want to determine whether there is a relationship between two quantitative variables. Paired observations of two quantitative variables are referred to as **bivariate data**. They are usually written in the form (x_{i}, y_{i}), where i = 1, 2, 3, …, n. Here x is our independent variable and y is our dependent variable.

Our bivariate data can be represented **graphically** with a scatter plot. It can also be represented **numerically** using the following summaries:

- The **mean** of the **x** data set and the mean of the **y** data set give us the *centre* of the data.
- The *horizontal spread* is given by the **standard deviation** of the x data set, where most data will fall within 2 SD either way of the mean.
- The *vertical spread* is likewise given by the **standard deviation** of the y data set, where most data will fall within 2 SD either way of the mean.
- The *clustering* about the line provides the final piece of the puzzle: this is measured by the **correlation coefficient, r**, which tells us how close the data is to a straight line.
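The five summaries above can be computed in a few lines. The following is a minimal sketch, not a definitive implementation: the function name `summarize` and the sample data are made up for illustration, and it assumes the population standard deviation (`statistics.pstdev`) is the SD intended here.

```python
from statistics import mean, pstdev

def summarize(xs, ys):
    """Return the five summaries: mean of x, mean of y, SD of x, SD of y, r."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    # Assumption: population SD (divide by n), not sample SD (divide by n - 1).
    sdx, sdy = pstdev(xs), pstdev(ys)
    # r is the average product of x and y after converting each to standard units.
    r = sum((x - mx) / sdx * (y - my) / sdy for x, y in zip(xs, ys)) / n
    return mx, my, sdx, sdy, r

# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mx, my, sdx, sdy, r = summarize(xs, ys)
print(f"centre = ({mx}, {my}), SDx = {sdx:.3f}, SDy = {sdy:.3f}, r = {r:.3f}")
```

For this small data set the centre is (3, 4) and r comes out at roughly 0.77, a fairly strong positive association.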

The correlation coefficient ranges from -1 to 1. A positive value means that as one variable increases, the other tends to increase as well, whereas a negative value means that as one variable increases, the other tends to decrease. The closer the value is to 1 or -1, the stronger the linear trend; a value of zero means there is no noticeable linear trend.
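To make the sign and size of r concrete, here is a small sketch on three made-up data sets. The helper name `correlation` is hypothetical, and it again assumes the population SD.

```python
from statistics import mean, pstdev

def correlation(xs, ys):
    """Correlation coefficient r, using population SDs (an assumption)."""
    mx, my = mean(xs), mean(ys)
    sdx, sdy = pstdev(xs), pstdev(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) * sdx * sdy)

xs = [1, 2, 3, 4, 5]
r_up = correlation(xs, [2, 4, 6, 8, 10])    # points on a rising straight line
r_down = correlation(xs, [10, 8, 6, 4, 2])  # points on a falling straight line
r_flat = correlation(xs, [4, 1, 0, 1, 4])   # symmetric U-shape

print(round(r_up, 6), round(r_down, 6), round(r_flat, 6))
```

The first two give r of 1 and -1 (up to floating-point rounding). The U-shaped data gives r of 0 even though y clearly depends on x, which is a reminder that r only measures clustering about a straight line.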

To model our data we can use the SD line. However, it does not use the r value, so it has its limitations. To find the SD line, plot the point given by the mean of x and the mean of y, as well as the point given by the mean of x plus the standard deviation of x and the mean of y plus the standard deviation of y, then draw the straight line through both. Because this model does not account for the clustering about the line, it overestimates y on the right-hand side of the plot and underestimates y on the left-hand side.

$(\bar{x}, \bar{y})$ to $(\bar{x} + SD_x,\ \bar{y} + SD_y)$
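Since the SD line passes through those two points, its slope is SDy divided by SDx. A minimal sketch follows, assuming a positive association (for a negative association the SD line slopes downward instead), the population SD, and a hypothetical helper name `sd_line`:

```python
from statistics import mean, pstdev

def sd_line(xs, ys):
    """Slope and intercept of the line through (mx, my) and (mx + SDx, my + SDy)."""
    mx, my = mean(xs), mean(ys)
    # Rise of one SD of y for every run of one SD of x (positive association assumed).
    slope = pstdev(ys) / pstdev(xs)
    # Intercept chosen so that the line passes through the point of means.
    intercept = my - slope * mx
    return slope, intercept

# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = sd_line(xs, ys)
print(f"SD line: y = {slope:.3f}x + {intercept:.3f}")
```

Note that r appears nowhere in the calculation, which is exactly the limitation described above.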

A better way of modelling the data is to use the regression line, which accounts for the r value:

$(\bar{x}, \bar{y})$ to $(\bar{x} + SD_x,\ \bar{y} + r \cdot SD_y)$
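The two points give the regression line a slope of r times SDy divided by SDx. The sketch below computes it for made-up data; the name `regression_line` is hypothetical and the population SD is assumed (the choice of SD cancels out of the slope as long as it is used consistently).

```python
from statistics import mean, pstdev

def regression_line(xs, ys):
    """Slope and intercept of the line through (mx, my) and (mx + SDx, my + r*SDy)."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sdx, sdy = pstdev(xs), pstdev(ys)
    r = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n * sdx * sdy)
    # One SD step in x corresponds to only r SDs in y.
    slope = r * sdy / sdx
    intercept = my - slope * mx
    return slope, intercept

# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = regression_line(xs, ys)
print(f"regression line: y = {slope:.3f}x + {intercept:.3f}")
```

For this data the regression slope is about 0.6, flatter than the SD line's slope of about 0.77, because r is less than 1.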

This straight line has the form y = mx + b. However, it can only be used to predict y from a given x value; it cannot be used in reverse to predict x from a given y value.
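One way to see why the line cannot simply be run in reverse is to compare it with a second regression fitted with the roles of x and y swapped. This is only a sketch under stated assumptions: the helper `fit`, the data, and the use of the population SD are all illustrative choices.

```python
from statistics import mean, pstdev

def fit(us, vs):
    """Regression of v on u: (slope, intercept) for predicting v from u."""
    mu, mv = mean(us), mean(vs)
    sdu, sdv = pstdev(us), pstdev(vs)
    r = sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / (len(us) * sdu * sdv)
    slope = r * sdv / sdu
    return slope, mv - slope * mu

# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m_yx, b_yx = fit(xs, ys)  # line for predicting y from x
m_xy, b_xy = fit(ys, xs)  # separate line for predicting x from y

y_hat = m_yx * 5 + b_yx                 # predicted y at x = 5
x_from_inverse = (y_hat - b_yx) / m_yx  # algebraically undoing the first line
x_hat = m_xy * y_hat + b_xy             # regression prediction of x at that y

print(f"y_hat = {y_hat:.2f}, inverted line gives x = {x_from_inverse:.2f}, "
      f"x-on-y regression gives x = {x_hat:.2f}")
```

Here the y-on-x line predicts y of about 5.2 at x = 5, but the x-on-y regression predicts x of about 4.2 for that y, not 5. The two lines coincide only when r is exactly 1 or -1.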