ML-learning-path

Self-learning guide for machine learning

View the Project on GitHub elephantscale/ML-learning-path

Feature Engineering - Introduction


Objective

Learn feature engineering

Reference

Essential Reading

Extra Reading

Checklist

After completing the exercises below, you should be comfortable with

Exercises

Difficulty Level

★☆☆ - Easy
★★☆ - Medium
★★★ - Challenging
★★★★ - Bonus

A - Outlier Detection

A-1 : House Sales Data (★★☆)

Read the house-sales-simplified.csv file.
We are going to find outliers in the sale price.

First, describe and visualize the saleprice attribute. How can we tell whether there are outliers?
Hint: Look at the standard deviation.
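One way to start is sketched below. It is only an illustration: the small DataFrame is made-up data standing in for house-sales-simplified.csv, and the cutoff of 2 standard deviations is an arbitrary assumption, not a rule from this guide.

```python
import pandas as pd

# Made-up stand-in data; in the exercise you would load the real file with
# df = pd.read_csv("house-sales-simplified.csv")
df = pd.DataFrame({
    "saleprice": [200_000, 210_000, 220_000, 230_000, 240_000,
                  250_000, 260_000, 270_000, 280_000, 2_500_000]
})

price = df["saleprice"]

# describe() reports count, mean, std, min, quartiles and max in one call.
summary = price.describe()
print(summary)

# A mean far above the median, or a max many standard deviations away
# from the mean, suggests outliers on the high side.
z_scores = (price - price.mean()) / price.std()

# Assumed cutoff: flag rows more than 2 standard deviations from the mean.
suspects = df[z_scores.abs() > 2]
print(suspects)
```

For the visual part, a histogram or box plot of the same column (for example `df["saleprice"].plot.box()`, which requires matplotlib) tends to make the isolated high values easy to spot.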

We can eliminate the top 10% and bottom 10% of sale prices to find the middle range of prices.
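Trimming both tails can be sketched with quantiles. The data here is again made up for illustration; with the real file you would load the CSV instead.

```python
import pandas as pd

# Made-up stand-in data for house-sales-simplified.csv.
df = pd.DataFrame({
    "saleprice": [200_000, 210_000, 220_000, 230_000, 240_000,
                  250_000, 260_000, 270_000, 280_000, 2_500_000]
})

# 10th and 90th percentiles of the sale price.
low, high = df["saleprice"].quantile([0.10, 0.90])

# Keep only the rows whose price lies between the two cutoffs (inclusive).
middle = df[df["saleprice"].between(low, high)]
print(middle["saleprice"].describe())
```

Comparing the mean and standard deviation of `middle` with those of the full column gives a rough sense of how much the extreme values were distorting the summary.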

As a next step, we segment house prices by number of bedrooms.

We also need to take zipcode into account when determining prices.

For our final assessment, we need to calculate prices per bedroom, per zipcode.
Come up with your own assessment of outlier detection.
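As one possible starting point (not the only valid answer), the sketch below groups sales by zipcode and bedroom count and flags prices outside 1.5 times the interquartile range within each group. The column names `zipcode` and `bedrooms` are assumptions about the file's schema, the data is made up, and the IQR rule is just one common convention chosen here for illustration.

```python
import pandas as pd

# Made-up stand-in data; column names other than saleprice are assumed.
df = pd.DataFrame({
    "zipcode":   ["98101"] * 5 + ["98052"] * 5,
    "bedrooms":  [2] * 5 + [4] * 5,
    "saleprice": [300_000, 310_000, 320_000, 330_000, 900_000,
                  800_000, 820_000, 840_000, 860_000, 880_000],
})

# Compare each sale only with sales in the same zipcode and bedroom count.
grouped = df.groupby(["zipcode", "bedrooms"])["saleprice"]

# transform() broadcasts each group's quartile back onto its rows.
q1 = grouped.transform(lambda s: s.quantile(0.25))
q3 = grouped.transform(lambda s: s.quantile(0.75))
iqr = q3 - q1

# Flag prices beyond 1.5 * IQR from the group's quartiles.
df["is_outlier"] = (df["saleprice"] < q1 - 1.5 * iqr) | (df["saleprice"] > q3 + 1.5 * iqr)
print(df[df["is_outlier"]])
```

In this toy data, 900,000 is flagged for a 2-bedroom house even though similar prices are ordinary for the 4-bedroom group, which is the reason for segmenting before looking for outliers. Groups with very few sales give unreliable quartiles, so a real solution would likely need a minimum group size.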

More Exercises

