In this tutorial we discuss the structure of linear regression and how a linear regression equation is constructed for a two-variable model.

Please go through the tutorial on the Concept of Linearity to understand linearity, the basic requirement of linear regression.


 

Let's consider a very simple data set in which demand is modelled as a function of price:

Demand  = f ( Price )

Price  Demand   |   Price  Demand
1      48       |   3      44
1      49       |   4      35
1      50       |   4      38
1      51       |   4      42
2      44       |   5      36
2      45       |   5      39
2      46       |   5      40
2      47       |   6      32
2      48       |   6      35
3      40       |   6      37
3      42       |   6      36
3 42 6 36

Using an Excel scatter plot, we plot the points and then add a linear trendline, which is simply the line given by the best-fit linear regression equation.

 

 

[Figure: Demand-regression. Scatter plot of demand against price with the linear trendline.]

All the points in the scatter plot are actual observed values of demand at a given price. When we fitted the linear regression line with equation y = -2.852x + 51.59, we essentially created a prediction of demand at each price point, and all our predictions lie on the line represented by that equation. Note that our regression equation is of the form

Ý = b1 + b2X

where Ý is the predicted demand, X is the price, b1 is the intercept (51.59 here) and b2 is the slope (-2.852 here).

For example, our prediction for price 4 is about 40 (51.59 - 2.852 × 4 ≈ 40.2), whereas the three observations at price 4 have actual demands of 35, 38 and 42. This means that for every observed point, we incur an error when we generate a prediction.
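The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the coefficients are the rounded trendline values quoted above, and the names `predict_demand` and `observed_at_4` are made up for this sketch.

```python
# Rounded trendline coefficients quoted in the text: y = -2.852x + 51.59
B1 = 51.59    # intercept
B2 = -2.852   # slope


def predict_demand(price):
    """Predicted demand on the trendline for a given price."""
    return B1 + B2 * price


# The three demands actually observed at price 4 (from the data table)
observed_at_4 = [35, 38, 42]

prediction = predict_demand(4)

# Error for each observation = actual demand - predicted demand
errors = [actual - prediction for actual in observed_at_4]

print(f"predicted demand at price 4: {prediction:.3f}")
for actual, err in zip(observed_at_4, errors):
    print(f"actual {actual}: error {err:+.3f}")
```

None of the three errors is zero, which is the point of the paragraph above: the line gives one prediction per price, while the observations scatter around it.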

For the sample:

Let us represent this error term by ei. Let Yi be the actual demand for each observation i and Ýi the predicted demand for that observation. We can then write our actual values as

Yi = Ýi + ei

which can also be written as

Yi = b1 + b2Xi + ei

or

ei = Yi – b1 – b2 Xi                 ( I )

 

This is based on a limited, finite sample, so the key question is: can we find b1 and b2 such that our overall error is minimized? The technique for doing this is called Ordinary Least Squares (OLS).

So here is what we want to do:

Minimize  ∑ei²  =  ∑ ( Yi  –  Ýi )²          ( II )

where Yi = actual Y value for the ith item

Ýi = predicted Y value for the ith item

From ( I ) above we know that ei = Yi – b1 – b2 Xi. Substituting this into ( II ) gives

∑ei²  =  ∑ ( Yi – b1 – b2 Xi  )²           ( III )

Hence ∑ei² = f(b1,b2)

So for a given set of data, different values of b1 and b2 give rise to different ei values and thus a different ∑ei².
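To make the idea of ∑ei² as a function of b1 and b2 concrete, here is a minimal Python sketch that evaluates it on the data from the table for a few candidate pairs. The function name `sum_squared_errors` and the alternative candidate values are arbitrary choices for illustration; only the first pair comes from the trendline in the text.

```python
# (price, demand) pairs from the data table
DATA = [
    (1, 48), (1, 49), (1, 50), (1, 51),
    (2, 44), (2, 45), (2, 46), (2, 47), (2, 48),
    (3, 40), (3, 42), (3, 44),
    (4, 35), (4, 38), (4, 42),
    (5, 36), (5, 39), (5, 40),
    (6, 32), (6, 35), (6, 37), (6, 36),
]


def sum_squared_errors(b1, b2, data=DATA):
    """Sum of ei^2 where ei = Yi - b1 - b2*Xi."""
    return sum((y - b1 - b2 * x) ** 2 for x, y in data)


# Candidate (b1, b2) pairs: the trendline from the text plus two
# arbitrary alternatives chosen only for comparison.
candidates = [
    (51.59, -2.852),  # trendline coefficients quoted in the text
    (50.0, -2.5),     # hypothetical alternative
    (52.0, -3.0),     # hypothetical alternative
]

for b1, b2 in candidates:
    print(f"b1={b1}, b2={b2}: sum of squared errors = "
          f"{sum_squared_errors(b1, b2):.2f}")
```

Each pair produces a different total, and the trendline pair gives the smallest of the three, which is what we would expect if the trendline is the least squares line.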

The OLS method chooses b1 and b2 so that ∑ei² is as small as possible. It uses differential calculus to do so: setting the partial derivatives of ∑ei² with respect to b1 and b2 to zero shows that the values of b1 and b2 that minimize ∑ei² are obtained by solving the following two simultaneous equations:

∑Yi  = nb1 + b2 ∑Xi    and

∑YiXi  =  b1 ∑Xi  +  b2 ∑Xi ²

These are called the least squares normal equations. Solving them for b1 and b2 we get

b2 = ( n ∑YiXi – ∑Xi ∑Yi ) / ( n ∑Xi² – (∑Xi)² )

b1 = ( ∑Yi – b2 ∑Xi ) / n
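As a rough check, the normal equations can be solved for the data in the table with plain Python. This is a sketch, not the method Excel uses internally; the function name `ols_two_variable` is an assumption made for this example.

```python
# (price, demand) pairs from the data table
DATA = [
    (1, 48), (1, 49), (1, 50), (1, 51),
    (2, 44), (2, 45), (2, 46), (2, 47), (2, 48),
    (3, 40), (3, 42), (3, 44),
    (4, 35), (4, 38), (4, 42),
    (5, 36), (5, 39), (5, 40),
    (6, 32), (6, 35), (6, 37), (6, 36),
]


def ols_two_variable(data):
    """Solve the two normal equations for (b1, b2)."""
    n = len(data)
    sum_x = sum(x for x, _ in data)
    sum_y = sum(y for _, y in data)
    sum_xy = sum(x * y for x, y in data)
    sum_x2 = sum(x * x for x, _ in data)

    # Eliminating b1 from the two normal equations gives b2 ...
    b2 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # ... and the first normal equation then gives b1.
    b1 = (sum_y - b2 * sum_x) / n
    return b1, b2


b1, b2 = ols_two_variable(DATA)
print(f"b1 (intercept) = {b1:.4f}")
print(f"b2 (slope)     = {b2:.4f}")

# Check that the solution satisfies both normal equations.
n = len(DATA)
sum_x = sum(x for x, _ in DATA)
sum_y = sum(y for _, y in DATA)
sum_xy = sum(x * y for x, y in DATA)
sum_x2 = sum(x * x for x, _ in DATA)
residual_eq1 = sum_y - (n * b1 + b2 * sum_x)
residual_eq2 = sum_xy - (b1 * sum_x + b2 * sum_x2)
```

The values should land close to the Excel trendline y = -2.852x + 51.59, up to rounding in the displayed coefficients.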

Properties of OLS Estimators b1 and b2

  1. Linearity – OLS estimators are linear functions of the observed values of the dependent variable Y, with the X values treated as given.
  2. Unbiasedness – The expected value of each estimator (its average over repeated samples) is equal to the true population parameter.
  3. Minimum variance – Each estimator has the minimum variance in the class of all linear unbiased estimators.
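Unbiasedness can be illustrated with a small simulation. The sketch below is an assumption-laden toy: the "true" parameters `TRUE_B1`, `TRUE_B2` and the noise level `SIGMA` are invented for the example, the errors are drawn as independent normal values, and the prices are the X values from the data table. It draws many samples, fits each by least squares, and averages the estimates.

```python
import random

# Hypothetical "true" population parameters, chosen only for this demo
TRUE_B1 = 50.0   # intercept
TRUE_B2 = -3.0   # slope
SIGMA = 2.0      # standard deviation of the random error term

# X values: the same prices as in the data table
PRICES = [1] * 4 + [2] * 5 + [3] * 3 + [4] * 3 + [5] * 3 + [6] * 4


def fit(xs, ys):
    """Least squares estimates (b1, b2) from the normal equations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b2 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b1 = (sum_y - b2 * sum_x) / n
    return b1, b2


rng = random.Random(0)   # fixed seed so the run is repeatable
estimates = []
for _ in range(2000):
    # Generate one sample: Yi = b1 + b2*Xi + random error
    ys = [TRUE_B1 + TRUE_B2 * x + rng.gauss(0, SIGMA) for x in PRICES]
    estimates.append(fit(PRICES, ys))

mean_b1 = sum(b1 for b1, _ in estimates) / len(estimates)
mean_b2 = sum(b2 for _, b2 in estimates) / len(estimates)

print(f"average b1 over samples: {mean_b1:.3f} (true value {TRUE_B1})")
print(f"average b2 over samples: {mean_b2:.3f} (true value {TRUE_B2})")
```

Any single sample gives estimates that miss the true values, but the averages over many samples sit close to them, which is what unbiasedness means.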

Assumptions for OLS

  1. The regression model is linear in the parameters.
  2. The ei do not systematically affect Yi, i.e. there is no pattern in the ei and they are random with an average of zero.
  3. The variance of ei is the same for all observations, whatever the value of Xi (homoscedasticity).
  4. There is no autocorrelation between the error terms.
  5. There is no correlation between e and the explanatory variable X.
  6. The number of observations is greater than the number of parameters to be estimated.
  7. The X values must not all be the same (variation in X is required).
  8. Input variables should not have exact linear relationships with each other (no multicollinearity). This matters once the model has more than one input variable.
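Assumptions 6 and 7 can be seen directly in the formula for b2: its denominator n ∑Xi² – (∑Xi)² becomes zero when every X is the same, and with only two observations the line passes through both points and leaves no error to minimize. A minimal sketch follows; the function names and the tiny example inputs are invented for illustration.

```python
def ols_two_variable(xs, ys):
    """Least squares fit that refuses inputs violating assumptions 6 and 7."""
    n = len(xs)
    if n <= 2:
        # Two parameters (b1, b2) need more than two observations
        raise ValueError("need more observations than parameters")

    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Exact for integer inputs; with floats a tolerance would be safer
        raise ValueError("all X values are identical")

    b2 = (n * sum_xy - sum_x * sum_y) / denominator
    b1 = (sum_y - b2 * sum_x) / n
    return b1, b2


def try_fit(xs, ys):
    """Return the fit, or the error message if the fit is impossible."""
    try:
        return ols_two_variable(xs, ys)
    except ValueError as exc:
        return str(exc)


varied_x = try_fit([1, 2, 3, 4], [48, 44, 40, 35])      # fine
identical_x = try_fit([3, 3, 3, 3], [40, 42, 44, 41])   # violates 7
too_few = try_fit([1, 2], [48, 44])                     # violates 6

print(varied_x)
print(identical_x)
print(too_few)
```

The other assumptions concern the behaviour of the error term and cannot be checked from the formula alone; they are usually examined by looking at the residuals after fitting.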

Next in the series :

R Tutorial : Basic 2 variable Linear Regression

R Tutorial : Multiple Linear Regression

R Tutorial : Residual Analysis for Regression

R Tutorial : How to use Diagnostic Plots for Regression Models


Reference : Based on Lectures by Dr. Manish Sinha. ( Associate Prof. SCMHRD )


2 thoughts on “Tutorial : Linear Regression Construct”

  1. There is a quotation from Tagore (I might have mentioned it in the past) which is relevant here. In these methods we acknowledge the presence or inevitability of error and think about how to minimize the impact of error. "If you shut the door on error, truth, also, will be shut out." … Ravindranath Tagore. Or, in popular terms, one should not throw out the baby with the bath water. The concept of linearity is subtle: y = m x + c is not a linear function of x even though the graph is a straight line! I have not read all the posts, but intend to do so. Bye, Sudir


    1. Agreeing completely here.
      A beautiful book called “Fooled by Randomness” by Nassim Taleb underscores the inevitability of randomness in life, and how and why one should not try to find meaning in such randomness.
