Self-learning guide for machine learning
Learn clustering with k-means
★☆☆ - Easy
★★☆ - Medium
★★★ - Challenging
★★★★ - Bonus
Use scikit-learn's make_blobs to generate some data
Cluster it using KMeans
Start with this notebook: kmeans-1-intro
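The two steps above can be sketched as follows. This is a minimal example, not the notebook's solution: the sample count, number of centers, spread, and random seed are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Generate 300 two-dimensional points around 3 centers
# (all of these values are illustrative assumptions).
X, y_true = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)

# Fit k-means with the same number of clusters used to generate the data.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("Cluster centers:")
print(kmeans.cluster_centers_)
print("Inertia (within-cluster sum of squared distances):", kmeans.inertia_)
```

Try changing n_clusters to a value other than 3 and watch how the inertia changes; that is a common way to get a feel for choosing k.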
We are going to cluster the cars dataset.
Here is the cars dataset
The data looks like this:
model           mpg   cyl  disp   hp   drat  wt     qsec   vs  am  gear
Ford Pantera L  15.8  8    351.0  264  4.22  3.170  14.50  0   1   5
Merc 280C       17.8  6    167.6  123  3.92  3.440  18.90  1   0   4
Volvo 142E      21.4  4    121.0  109  4.11  2.780  18.60  1   1   4
Merc 230        22.8  4    140.8  95   3.92  3.150  22.90  1   0   4
Use only the mpg and cyl columns to cluster the cars.
You can start with this notebook: kmeans-2-mtcars
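Here is a rough sketch of clustering on mpg and cyl only. It uses just the four sample rows listed in this lab as a stand-in for the full dataset, so the result is illustrative only. The choice of 2 clusters and the scaling step are assumptions, not requirements of the lab.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# The four sample rows from this lab; the full dataset has more cars.
cars = pd.DataFrame({
    "model": ["Ford Pantera L", "Merc 280C", "Volvo 142E", "Merc 230"],
    "mpg": [15.8, 17.8, 21.4, 22.8],
    "cyl": [8, 6, 4, 4],
})

# Keep only the two columns the lab asks for.
features = cars[["mpg", "cyl"]]

# Assumption: standardize so mpg and cyl contribute on a comparable scale.
scaled = StandardScaler().fit_transform(features)

# Assumption: 2 clusters, chosen only because the sample is tiny.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cars["cluster"] = kmeans.fit_predict(scaled)

print(cars)
```

With the full dataset, load the file into a DataFrame instead of typing rows by hand, and experiment with larger values of n_clusters.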
This is a fun lab. We will cluster Uber pickup locations and figure out where the demand hot spots are.
Here is the Uber dataset
You can start with this notebook: kmeans-3-uber-pickups
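One way to find a hot spot is to cluster the pickup coordinates and pick the cluster with the most points. The sketch below uses synthetic latitude/longitude pairs, not the real Uber data. The two center locations, the point counts, and the choice of 2 clusters are all made-up assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for pickup coordinates (latitude, longitude).
# These are NOT real Uber pickups; centers and counts are invented.
busy = rng.normal(loc=[40.75, -73.98], scale=0.01, size=(500, 2))
quiet = rng.normal(loc=[40.65, -73.78], scale=0.01, size=(100, 2))
pickups = np.vstack([busy, quiet])

# Cluster the coordinates; 2 clusters is an assumption for this toy data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pickups)

# Count pickups per cluster; the largest cluster marks the demand hot spot.
counts = np.bincount(kmeans.labels_, minlength=2)
hotspot = kmeans.cluster_centers_[counts.argmax()]

print("Pickups per cluster:", counts)
print("Hot-spot center (lat, lon):", hotspot)
```

For the real dataset, replace the synthetic array with the latitude and longitude columns from the file and try more clusters, since a city usually has several areas of high demand.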