Artificial Intelligence for Big Data

Linear regression

With linear regression, we model the relationship between a dependent variable, y, and one or more explanatory (independent) variables, x. When there is one independent variable, the model is called simple linear regression; with multiple independent variables, it is called multiple linear regression. In simple linear regression, the predictor function is a straight line (refer to figure 4 for an illustration). The regression line defines the relationship between x and y. When y increases as x increases, there is a positive relationship between x and y. Similarly, when y decreases as x increases, there is a negative relationship between x and y. The line is positioned in the x-y plane so as to minimize the difference between the predicted values and the actual values, called the prediction error.
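To make the idea of prediction error concrete, the following minimal sketch compares two candidate straight lines on a small data set. The data values, the function name sum_squared_error, and the choice to total the errors as a sum of squares are illustrative assumptions, not taken from the text.

```python
# Illustrative data (made up for this sketch): y tends to rise as x rises.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]


def sum_squared_error(intercept, slope, xs, ys):
    """Sum of squared differences between actual and predicted y values."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = intercept + slope * x  # point on the candidate line
        total += (y - predicted) ** 2      # squared prediction error
    return total


# Two hypothetical candidate lines for the same data.
error_close = sum_squared_error(0.0, 2.0, xs, ys)  # line through (0, 0), slope 2
error_far = sum_squared_error(1.0, 1.0, xs, ys)    # line through (0, 1), slope 1

print("closer line:", error_close)
print("farther line:", error_far)
```

The line with the smaller total follows the data more closely; a regression method searches for the line that makes this kind of total as small as possible.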

In its simplest form, the linear regression equation is:

y = a + bx

This is the equation of a straight line, where y is the value of the dependent variable, x is the value of the independent variable, a is the y-intercept (the value of y where the regression line meets the y-axis), and b is the slope of the line. Let's consider the least squares method, with which we can derive the regression line with minimum prediction error.
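As a preview of the least squares method, here is a minimal sketch that estimates the intercept a and the slope b for a single independent variable. It assumes the standard closed-form least squares estimates (the slope is the sum of the products of the x and y deviations from their means, divided by the sum of the squared x deviations; the intercept then follows from the means). The function name fit_simple_linear_regression and the sample data are hypothetical, chosen only for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (a, b) for the least squares line y = a + b * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")

    x_mean = sum(xs) / n
    y_mean = sum(ys) / n

    # Sum of products of deviations, and sum of squared x deviations.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = s_xy / s_xx            # slope
    a = y_mean - b * x_mean    # y-intercept
    return a, b


# Made-up data with a positive relationship between x and y.
a_pos, b_pos = fit_simple_linear_regression([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])

# Made-up data with a negative relationship between x and y.
a_neg, b_neg = fit_simple_linear_regression([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])

print("positive relationship:", a_pos, b_pos)
print("negative relationship:", a_neg, b_neg)
```

The sign of the fitted slope b reflects the direction of the relationship: positive when y rises with x, negative when y falls as x rises.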