Inspecting DataFrames

pandas provides a few simple methods for inspecting a dataframe:

  • [your_dataframe_name].head(5) will provide the first 5 rows

  • [your_dataframe_name].tail(10) will provide the last 10 rows

  • [your_dataframe_name].describe() is a quick way to get summary statistics on a per-column basis

You can find more useful pandas functions [here]
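As a minimal, self-contained sketch of the three methods listed above, the snippet below builds a small toy dataframe and calls each one. The name `df` and its columns `x` and `y` are made up for illustration; substitute your own dataframe.

```python
import pandas as pd

# Hypothetical toy data: 12 rows, two numeric columns.
df = pd.DataFrame({
    "x": [float(i) for i in range(12)],
    "y": [i * 2 for i in range(12)],
})

first_rows = df.head()    # first 5 rows (the default when no number is given)
last_rows = df.tail(3)    # last 3 rows
summary = df.describe()   # count, mean, std, min, quartiles and max per column

print(first_rows)
print(last_rows)
print(summary)
```

Each call returns a new dataframe, so the results can be stored in variables and inspected further.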

ms.head()  # returns the first 5 rows by default; pass a number, e.g. ms.head(20), to get that many rows
ms.tail(10)  # returns the last 10 rows
ms.describe()  # a quick way to get summary statistics on a per-column basis
# What do you notice about the number of columns returned by describe() compared with the number in the whole dataframe?
ms.shape

(216, 183)

ms.columns
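# Collect the columns that describe() leaves out of its summary
missing = []
des_cols = ms.describe().columns
for col in ms.columns:
    if col in des_cols:
        print(f'found: {col}')  # f-string also works when a column name is not a string
    else:
        missing.append(col)
missing  # by default describe() only summarizes numeric columns, so these are the non-numeric ones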
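The same check as the loop above can be sketched without an explicit loop. This example uses a small made-up dataframe (`df`, with hypothetical columns `name`, `height` and `count`) rather than `ms`, so that it runs on its own. It relies on `Index.difference`, `select_dtypes` and the `include` parameter of `describe`.

```python
import pandas as pd

# Hypothetical toy dataframe: one text column and two numeric columns.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "height": [1.0, 2.0, 3.0],
    "count": [1, 2, 3],
})

described = df.describe().columns

# Columns present in df but absent from describe()'s output.
missing = df.columns.difference(described).tolist()
print(missing)

# select_dtypes reaches the same answer by asking for non-numeric columns directly.
non_numeric = df.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)

# include="all" asks describe() to summarize every column, numeric or not.
all_described = df.describe(include="all").columns.tolist()
print(all_described)
```

Note that `Index.difference` returns its result sorted, so the order may differ from the column order in the dataframe.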
pd.set_option('display.max_rows', 50)  # maximum number of rows displayed when you inspect a dataframe in the Jupyter notebook
pd.set_option('display.max_columns', 200)  # maximum number of columns displayed when you inspect a dataframe in the Jupyter notebook
ms.describe()  # notice the difference in the number of columns you can see
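`pd.set_option` changes the display setting for the rest of the session. If you only want a wider view for a single inspection, one option is `pd.option_context`, which applies a setting inside a `with` block and restores the previous value afterwards. The sketch below assumes a made-up wide dataframe `df`; the variable names are for illustration only.

```python
import pandas as pd

# Hypothetical wide dataframe: 3 rows, 30 columns.
df = pd.DataFrame({f"col{i}": [i, i + 1, i + 2] for i in range(30)})

default_max = pd.get_option("display.max_columns")

# Inside the with-block up to 200 columns are shown.
with pd.option_context("display.max_columns", 200):
    inside_max = pd.get_option("display.max_columns")
    print(df.describe())

# Outside the block the earlier setting is back in place.
after_max = pd.get_option("display.max_columns")

# reset_option restores the pandas default for a single option.
pd.reset_option("display.max_rows")
```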
