Self-learning guide for machine learning
Learn Pandas
After completing the exercises below, you should be comfortable with
★☆☆  - Easy
★★☆  - Medium
★★★  - Challenging
★★★★ - Bonus
A1 - Import Pandas and print out the current version (★☆☆)
A2 - Create an empty dataframe (★☆☆)
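One possible approach to A1 and A2, as a minimal sketch. The variable name `empty` is only illustrative.

```python
import pandas as pd

# The installed version is exposed as a string attribute of the module.
print(pd.__version__)

# Calling the constructor with no arguments gives a frame with no rows or columns.
empty = pd.DataFrame()
print(empty.shape)
```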
A3 - Create a dataframe from a dictionary
First create a dictionary like the one below, then convert it into a dataframe.
d = {'a': [1, 2],
     'b': [3, 4] }
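A minimal sketch of one way to do the conversion: passing the dictionary to the `pd.DataFrame` constructor turns each key into a column and each list into that column's values.

```python
import pandas as pd

d = {'a': [1, 2],
     'b': [3, 4]}

# Each dictionary key becomes a column name; the list items become the rows.
df = pd.DataFrame(d)
print(df)
```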
A4 - Create the following dataframe (★☆☆)
Expected output:
            city  population  rainfall
0       San Jose          10      15.5
1  San Francisco           5      10.2
2    Los Angeles          30       5.5
3        Seattle           7      50.5
A5 - Print the shape of above dataframe (★☆☆)
Hint: df.shape (an attribute of the dataframe, not of the pd module)
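The frame from A4 and its shape from A5 could be produced along these lines. This is a sketch, not the only way; the variable name `cities` follows the hint used later in section D.

```python
import pandas as pd

# Build the frame column by column from a dictionary of lists.
cities = pd.DataFrame({
    'city': ['San Jose', 'San Francisco', 'Los Angeles', 'Seattle'],
    'population': [10, 5, 30, 7],
    'rainfall': [15.5, 10.2, 5.5, 50.5],
})
print(cities)

# shape is a (rows, columns) tuple.
print(cities.shape)
```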
B1 - In the above dataframe, print out population column (★☆☆)
Expected output:
0    10
1     5
2    30
3     7
B2 - Print out column 2 (zero-based position, i.e. the rainfall column)
Expected output:
0    15.5
1    10.2
2     5.5
3    50.5
B3 - In the previous example, just print out the city names, without index (★☆☆)
Hint : values
Expected output:   ['San Jose', 'San Francisco', 'Los Angeles', 'Seattle']
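Exercises B1 to B3 all select a single column. A hedged sketch of three common ways to do that, rebuilding the A4 frame so the block runs on its own:

```python
import pandas as pd

cities = pd.DataFrame({
    'city': ['San Jose', 'San Francisco', 'Los Angeles', 'Seattle'],
    'population': [10, 5, 30, 7],
    'rainfall': [15.5, 10.2, 5.5, 50.5],
})

# Select a column by name; the result is a Series that keeps the index.
population = cities['population']
print(population)

# Select a column by zero-based position: all rows, column 2.
third_column = cities.iloc[:, 2]
print(third_column)

# Drop the index by taking the underlying values and converting to a plain list.
city_names = cities['city'].values.tolist()
print(city_names)
```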
B4 - Print out the row for San Francisco (★☆☆)
Hint: iloc
Expected output:
city          San Francisco
population                5
rainfall               10.2
B5 - Print out the rainfall number for Seattle (★☆☆)
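For B4 and B5, one possible sketch uses `iloc` to pick a row by position and a boolean mask with `loc` to look up a single value by city name. The names `row` and `seattle_rain` are illustrative.

```python
import pandas as pd

cities = pd.DataFrame({
    'city': ['San Jose', 'San Francisco', 'Los Angeles', 'Seattle'],
    'population': [10, 5, 30, 7],
    'rainfall': [15.5, 10.2, 5.5, 50.5],
})

# San Francisco sits at row position 1; iloc returns that row as a Series.
row = cities.iloc[1]
print(row)

# Filter rows by city name, keep only the rainfall column,
# then take the first (and only) matching value.
seattle_rain = cities.loc[cities['city'] == 'Seattle', 'rainfall'].iloc[0]
print(seattle_rain)
```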
C1 - Set population for San Francisco to 12 (★☆☆)
Hint: iloc
Expected output:
            city  population  rainfall
0       San Jose          10      15.5
1  San Francisco          12      10.2
2    Los Angeles          30       5.5
3        Seattle           7      50.5
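A sketch of the C1 update using `iloc`, which addresses cells by row and column position. Looking up the column position with `columns.get_loc` avoids hard-coding it.

```python
import pandas as pd

cities = pd.DataFrame({
    'city': ['San Jose', 'San Francisco', 'Los Angeles', 'Seattle'],
    'population': [10, 5, 30, 7],
    'rainfall': [15.5, 10.2, 5.5, 50.5],
})

# Row position 1 is San Francisco; find the column position by name.
col = cities.columns.get_loc('population')
cities.iloc[1, col] = 12
print(cities)
```

A label-based alternative is `cities.loc[cities['city'] == 'San Francisco', 'population'] = 12`, which does not depend on the row order.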
C2 - Add a new row as follows (★☆☆)
city: 'San Diego', population: 8, rainfall: 7.5
Hint: pd.concat (DataFrame.append was removed in pandas 2.0)
Expected output:
            city  population  rainfall
0       San Jose          10      15.5
1  San Francisco          12      10.2
2    Los Angeles          30       5.5
3        Seattle           7      50.5
0      San Diego           8       7.5
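One way to add the row, sketched with `pd.concat`. The new row is wrapped in its own one-row frame (here called `new_row`, an illustrative name), which gets index 0 by default; that is why the expected output ends with a second index 0.

```python
import pandas as pd

cities = pd.DataFrame({
    'city': ['San Jose', 'San Francisco', 'Los Angeles', 'Seattle'],
    'population': [10, 12, 30, 7],
    'rainfall': [15.5, 10.2, 5.5, 50.5],
})

# A one-row frame with the same columns as the original.
new_row = pd.DataFrame({'city': ['San Diego'],
                        'population': [8],
                        'rainfall': [7.5]})

# concat stacks the frames vertically and keeps each frame's own index.
cities = pd.concat([cities, new_row])
print(cities)
```

Passing `ignore_index=True` to `pd.concat` would renumber the rows 0 to 4 instead.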
D1 - Print rows where population > 10 (★★☆)
Hint: build a boolean mask with cities['population'] > 10, then pass it to cities.loc
Expected output:
            city  population  rainfall
1  San Francisco          12      10.2
2    Los Angeles          30       5.5
D2 - Print rows where population > 10 and rainfall < 10 (★★☆)
Expected output:
            city  population  rainfall
2    Los Angeles          30       5.5
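The filters in D1 and D2 can be sketched with boolean masks. The frame below assumes the state after C1 (San Francisco set to 12); the names `big` and `big_and_dry` are illustrative.

```python
import pandas as pd

cities = pd.DataFrame({
    'city': ['San Jose', 'San Francisco', 'Los Angeles', 'Seattle'],
    'population': [10, 12, 30, 7],
    'rainfall': [15.5, 10.2, 5.5, 50.5],
})

# A comparison on a column yields a boolean Series; loc keeps the True rows.
big = cities.loc[cities['population'] > 10]
print(big)

# Combine conditions with & (not the keyword "and"),
# and wrap each condition in parentheses.
big_and_dry = cities.loc[(cities['population'] > 10) & (cities['rainfall'] < 10)]
print(big_and_dry)
```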
E1 - Read a CSV file (★☆☆)
Read house-sales-sample.csv into a dataframe.
Hint: pd.read_csv
E2 - Print out the column types of the above dataframe (★☆☆)
Hint: df.dtypes
E3 - Print out the information about the above dataframe.  Note the datatypes and memory usage (★☆☆)
Hint: df.info()
E4 - Count the sales where Bedrooms = 4 (★☆☆)
Hint: query, boolean indexing, or size
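The steps in section E can be sketched as follows. Because the contents of house-sales-sample.csv are not shown here, the block reads a small in-memory stand-in instead. The `SalePrice` column and all values are hypothetical; only the `Bedrooms` column name comes from the exercise.

```python
import io
import pandas as pd

# Hypothetical stand-in for house-sales-sample.csv; the real columns may differ.
csv_text = """SalePrice,Bedrooms
300000,3
450000,4
520000,4
"""

# read_csv accepts a file path or any file-like object.
sales = pd.read_csv(io.StringIO(csv_text))

# Column types, then a summary including non-null counts and memory usage.
print(sales.dtypes)
sales.info()

# Count rows with four bedrooms using boolean indexing.
four_bed_count = len(sales[sales['Bedrooms'] == 4])
print(four_bed_count)

# The same count using query.
four_bed_count_query = len(sales.query('Bedrooms == 4'))
```

With the real file, the call would instead be `pd.read_csv('house-sales-sample.csv')`, assuming the file sits in the working directory.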