Linear Regression Intro
Objective
Learn Linear Regression
Essential Reading
Basics of Linear Regression
Understanding Errors and Residuals
Implementing Linear Regression in Scikit-Learn
Checklist
- What do the following mean?
- coefficients
- slope
- intercept
- Understand different error metrics
- Sum of squared errors (SSE)
- Mean squared error (MSE)
- Mean absolute error (MAE)
- Root mean squared error (RMSE)
- Regression evaluation metrics
- r (Pearson’s correlation coefficient)
- R2 (coefficient of determination)
- Generating synthetic dataset for regression (sklearn.datasets.make_regression)
- Be familiar with sklearn.linear_model.LinearRegression
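The checklist terms can be made concrete with a small worked example. The sketch below uses only NumPy and a four-point dataset made up for illustration (it is not from the course notebooks). It fits a line with np.polyfit, then computes each error metric from its definition.

```python
import numpy as np

# Made-up illustration data: one input feature x and a label y.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 8.0, 9.0])

# Fit y = slope * x + intercept by least squares.
# For a degree-1 fit, polyfit returns [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)

# Residual = actual value minus predicted value.
y_pred = slope * x + intercept
residuals = y - y_pred

sse = np.sum(residuals ** 2)        # sum of squared errors
mse = sse / len(y)                  # mean squared error
mae = np.mean(np.abs(residuals))    # mean absolute error
rmse = np.sqrt(mse)                 # root mean squared error

# Pearson's r measures the linear correlation between x and y.
r = np.corrcoef(x, y)[0, 1]

# R2 = 1 - SSE / SST, where SST is the total variation of y around its mean.
sst = np.sum((y - y.mean()) ** 2)
r2 = 1 - sse / sst

print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"SSE={sse:.3f} MSE={mse:.3f} MAE={mae:.3f} RMSE={rmse:.3f}")
print(f"r={r:.4f} R2={r2:.4f}")
```

With a single input feature and an intercept, R2 equals r squared, which is a handy sanity check. With several features the fitted model has one coefficient per feature plus the intercept, and that equality no longer applies to any single feature.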
Exercises
Difficulty Level
★☆☆ - Easy
★★☆ - Medium
★★★ - Challenging
★★★★ - Bonus
Ex-1 - Practice regression with synthetic data (★☆☆)
Here we will use scikit-learn’s make_regression to generate some data and fit a linear regression to it.
Start with this notebook: lr-1__intro, and work through it.
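As a rough preview of the workflow, and not the contents of the lr-1__intro notebook, a minimal sketch might look like the following. The sample count, noise level and random seed are arbitrary choices made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Generate a synthetic dataset with one feature.
# coef=True also returns the coefficient used to generate the data.
X, y, true_coef = make_regression(
    n_samples=200,
    n_features=1,
    noise=10.0,
    coef=True,
    random_state=42,
)

# Fit ordinary least squares linear regression.
model = LinearRegression()
model.fit(X, y)

# Evaluate on the same data the model was trained on (kept simple here).
y_pred = model.predict(X)

print("true coefficient:", float(true_coef))
print("fitted slope:    ", model.coef_[0])
print("fitted intercept:", model.intercept_)
print("MSE:", mean_squared_error(y, y_pred))
print("R2: ", r2_score(y, y_pred))
```

Because the data is generated from a known coefficient, the fitted slope can be compared directly with the true one. Raising the noise value should lower R2.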
Ex-2 - Billing and tipping data (★☆☆)
Start with this notebook: lr-2__tips
Complete the TODO items
Ex-3 - House sales data (★★☆)
- Data: house-sales-full.csv
- Label: SalePrice
- Input features: start with Bedrooms, Bathrooms, SqFtTotLiving, SqFtLot
- Run linear regression
- What is the R2 you are getting?
- Can you reason why the R2 is low?
- Add more features to input and see if you can improve the score
- What is the maximum R2 you can attain? With what features?
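One possible way to compare feature sets is a small helper that fits a model on a chosen list of columns and returns R2. This is a sketch under assumptions: the helper name r2_for_features is hypothetical, and the 80/20 train/test split is a choice made here, not something the exercise prescribes.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def r2_for_features(df, features, label="SalePrice", random_state=0):
    """Fit linear regression on the given columns and return R2 on held-out rows.

    Hypothetical helper for illustration; rows with missing values are dropped.
    """
    columns = list(features) + [label]
    data = df[columns].dropna()

    # Hold out 20% of the rows so R2 is measured on data the model has not seen.
    X_train, X_test, y_train, y_test = train_test_split(
        data[list(features)],
        data[label],
        test_size=0.2,
        random_state=random_state,
    )

    model = LinearRegression().fit(X_train, y_train)

    # LinearRegression.score returns the coefficient of determination (R2).
    return model.score(X_test, y_test)
```

Assuming the CSV sits in the working directory, usage could look like `df = pd.read_csv("house-sales-full.csv")` followed by `r2_for_features(df, ["Bedrooms", "Bathrooms", "SqFtTotLiving", "SqFtLot"])`. The delimiter passed to read_csv may need adjusting for the actual file. Calling the helper with different feature lists gives a quick way to see which additions raise the score.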