Understanding Linear Regression

Among the various linear models used in Machine Learning, Linear Regression is one of the most popular and basic models that every Data Scientist or Machine Learning Engineer reads about. Before diving into Linear Regression, let's first understand what exactly a linear model is.

Linear Model

In simple words, a linear model is one that assumes a linear relationship between the independent variable(s) and the dependent variable. Mathematically, a linear model is an equation that describes a relationship between two quantities with a constant rate of change. Graphically, a linear relationship is represented by a straight line.
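
For example, y = 2x + 3 is a linear model: every unit increase in x changes y by the constant amount 2, so its graph is a straight line with slope 2 and intercept 3.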

Now that we have a flavor of the linear model, let's get back to Linear Regression.

Linear Regression

It is a statistical tool used to find the linear relationship between a dependent variable and one or more independent variables. When there is one independent variable, it is called Simple Linear Regression; when there are two or more independent variables, it is known as Multiple Linear Regression. Linear Regression is a Supervised Learning model (the data is labeled).
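
For instance, with k independent variables the multiple form simply extends the simple equation with one term per predictor:

Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon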

Mathematically, the Simple Linear Regression equation is given as:

Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i

where Y_i is the dependent variable, X_i is the independent variable, \beta_0 is the intercept, \beta_1 is the slope, and \varepsilon_i is the random error.
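
As a quick illustration, here is a minimal Python sketch that generates data from this model; the intercept 2, slope 3, and noise scale are arbitrary values chosen for the example:

import numpy as np

rng = np.random.default_rng(0)

beta_0, beta_1 = 2.0, 3.0           # hypothetical true parameters
x = rng.uniform(0, 10, size=50)     # independent variable X_i
eps = rng.normal(0, 1, size=50)     # random error e_i
y = beta_0 + beta_1 * x + eps       # dependent variable Y_i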

IDEA: The main idea is to find the line that best fits the data. The best-fit line is the one for which the total prediction error (over all the data points) is as small as possible.

Consider a line fitted to the given data points.

The error is the vertical distance between an actual data point and the fitted line. Mathematically,

e_i = Y_i - \widehat{Y_i}, \qquad S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( Y_i - \widehat{Y_i} \right)^2

Note: The errors are squared because, without squaring, points with positive errors and points with negative errors may cancel each other out.
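
For a quick check, consider two points with errors +2 and -2: the raw errors sum to 0, wrongly suggesting a perfect fit, while the squared errors sum to 4 + 4 = 8 and correctly register the misfit.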


Least Squares Estimation

The parameters \beta_0 and \beta_1 are unknown and are estimated from the sample data. We estimate \beta_0 and \beta_1 so that the sum of squares of the differences between the observations Y_i and the fitted line is minimum, i.e. the error is minimum.

\min_{\beta_0,\, \beta_1} \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 X_i \right)^2

The least-squares estimates of \beta_0 and \beta_1 (i.e. \widehat{\beta_0} and \widehat{\beta_1}) must satisfy the following two equations:

  1. The partial derivative of the error with respect to \beta_0 should be zero.

\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n} \left( Y_i - \widehat{\beta_0} - \widehat{\beta_1} X_i \right) = 0

  2. The partial derivative of the error with respect to \beta_1 should be zero.

\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} \left( Y_i - \widehat{\beta_0} - \widehat{\beta_1} X_i \right) X_i = 0

Both of the above equations are called normal equations. There are two parameters, \beta_0 and \beta_1, so we have two equations; if we had 'k' such parameters, we would get 'k' normal equations.

Solving equations 1 and 2, we get:

\widehat{\beta_1} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad \widehat{\beta_0} = \bar{Y} - \widehat{\beta_1} \bar{X}
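
Here is a minimal NumPy sketch of these closed-form estimates (the function name and variable names are my own; x and y are arrays of observed data, as in the simulation above):

import numpy as np

def fit_simple_linear_regression(x, y):
    x_bar, y_bar = x.mean(), y.mean()
    # beta1_hat = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
    beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # beta0_hat = Ybar - beta1_hat * Xbar
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat

Run on the simulated data above, fit_simple_linear_regression(x, y) should return estimates close to the true values 2 and 3.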

Now, putting the values of \widehat{\beta_0} and \widehat{\beta_1} into the original equation, we get the line fitted to the given data:

\widehat{Y_i} = \widehat{\beta_0} + \widehat{\beta_1} X_i

Important points about parameters

  1. If \widehat{\beta_1} > 0, X and Y have a positive relationship: an increase in X increases Y.
  2. If \widehat{\beta_1} < 0, X and Y have a negative relationship: an increase in X decreases Y.
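
For example, a fitted slope of \widehat{\beta_1} = 3 means that each unit increase in X raises the predicted Y by 3 units.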

Important points about Linear Regression

  1. To build a Linear Regression model, there must be a linear relationship between the independent and dependent variables.
  2. Linear Regression is very sensitive to outliers, as the sketch below illustrates.
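
To see the outlier sensitivity concretely, here is a small sketch (the data values are invented for illustration) that fits the same line with and without a single extreme point, using NumPy's np.polyfit for the least-squares fit:

import numpy as np

# Clean data lying exactly on y = 2 + 3x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 + 3 * x

slope_clean, intercept_clean = np.polyfit(x, y, deg=1)

# Add a single outlier far above the line (2 + 3*6 = 20) and refit
x_out = np.append(x, 6.0)
y_out = np.append(y, 60.0)

slope_out, intercept_out = np.polyfit(x_out, y_out, deg=1)

print(slope_clean, intercept_clean)  # ~3.0, ~2.0
print(slope_out, intercept_out)      # noticeably different fit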

