Cheatography
https://cheatography.com
A very quick and simple reference for the Python Pandas Module.
Loading Pandas
Import the pandas module with the alias pd import pandas as pd
|
Creating Dataframes
From a csv file df = pd.read_csv('file.csv')
|
From a Python dictionary df = pd.DataFrame.from_dict(<dict>)
|
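A minimal sketch of both constructors. The sample data is hypothetical, and `io.StringIO` stands in for a real file such as 'file.csv' so the example runs without touching disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file such as 'file.csv'.
csv_text = "name,age\nAda,36\nLinus,54\n"
df_from_csv = pd.read_csv(io.StringIO(csv_text))

# With the default orient='columns', each dictionary key becomes a column.
data = {"name": ["Ada", "Linus"], "age": [36, 54]}
df_from_dict = pd.DataFrame.from_dict(data)

print(df_from_csv)
print(df_from_dict)
```

Both calls should produce a dataframe with the same two columns and two rows.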
Displaying Dataframe Info
Display the first five rows of the dataframe df.head()
|
Display the last five rows of the dataframe df.tail()
|
Show all column names df.columns
|
Show the data type of each column df.dtypes
|
Show statistics for all numeric (int and float) columns df.describe()
|
Show statistics for 'object' type columns df.describe(include='object')
|
Show the number of rows and columns df.shape
|
Display True for each NaN value, False otherwise df.isnull()
|
Display a table with the number of NaN values for each column df.isnull().sum()
|
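The inspection calls above can be tried on a small dataframe. This is only a sketch: the column names and values are made up, and one value is deliberately missing so that the NaN count is non-zero.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value in 'score'.
df = pd.DataFrame({
    "name": ["Ada", "Linus", "Grace"],
    "score": [90.0, np.nan, 85.0],
})

print(df.head())      # up to five rows; this frame only has three
print(df.shape)       # tuple of (rows, columns)
print(df.dtypes)      # data type of each column

stats = df.describe()           # by default covers numeric columns only
nan_counts = df.isnull().sum()  # number of NaN values per column
print(stats)
print(nan_counts)
```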
Updating
Delete all rows containing NaN values in the df Dataframe df.dropna(inplace=True)
|
Delete the 'col_name' column in place df.drop('col_name', axis=1, inplace=True)
|
Example of a calculated column df['new_col'] = df['col_1'] + df['col_2']
|
Update the entire column to value <value> df['new_col'] = <value>
|
Update the cell at row position a, column position b to <value> df.iloc[a, b] = <value>
|
Update (or create) 'col_a' with the result of a lambda function applied to 'col_b' df['col_a'] = df['col_b'].apply(<lambda function>)
|
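A short sketch that chains the update operations from this section on hypothetical data. The column names 'unused' and the lambda are illustrative only. Note that `drop` returns a new dataframe unless `inplace=True` is passed or the result is assigned back.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the last row has a NaN in 'col_1'.
df = pd.DataFrame({
    "col_1": [1.0, 2.0, np.nan],
    "col_2": [10.0, 20.0, 30.0],
    "unused": ["x", "y", "z"],
})

df.dropna(inplace=True)                    # removes the row containing NaN
df.drop("unused", axis=1, inplace=True)    # removes the column in place
df["new_col"] = df["col_1"] + df["col_2"]  # calculated column
df.iloc[0, 0] = 5.0                        # first row, first column
df["col_a"] = df["col_2"].apply(lambda x: x * 2)

print(df)
```

The calculated column is not recomputed when 'col_1' changes afterwards; it keeps the values from the moment it was assigned.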
Filtering Columns
Display an entire column as a series df['column_name']
|
Display all columns in the given list df[['col_1', 'col_2', ..., 'col_n']]
|
Show all unique elements in 'column_name' df['column_name'].unique()
|
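As a rough illustration of the difference between single and double brackets, assuming a made-up dataframe with 'city', 'temp' and 'rain' columns:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "city": ["Oslo", "Rome", "Oslo"],
    "temp": [3, 18, 5],
    "rain": [True, False, True],
})

city_series = df["city"]             # single brackets return a Series
subset = df[["city", "temp"]]        # a list of names returns a DataFrame
unique_cities = df["city"].unique()  # distinct values, in order of first appearance

print(type(city_series).__name__)
print(type(subset).__name__)
print(list(unique_cities))
```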
Filtering Rows
Display all rows satisfying <condition> df[<condition>]
|
Display all rows where df['col_name'] == <value>
df[df['col_name'] == <value>]
|
Show all rows satisfying both conditions df[(<condition_1>) & (<condition_2>)]
|
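A small sketch of boolean filtering on hypothetical data. Each condition is wrapped in parentheses because `&` binds more tightly than comparison operators such as `==` and `>`.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "city": ["Oslo", "Rome", "Oslo"],
    "temp": [3, 18, 5],
})

# Rows where a single condition holds.
oslo = df[df["city"] == "Oslo"]

# Rows where both conditions hold.
warm_oslo = df[(df["city"] == "Oslo") & (df["temp"] > 4)]

print(oslo)
print(warm_oslo)
```

The filtered frames keep the original index labels of the rows that matched.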
Indexing with iloc
Display the entire row at position n df.iloc[n]
|
Display the element in row n and column m df.iloc[n, m]
|
Display a slice of rows: from row a up to (but not including) row b df.iloc[a:b]
|
Display rows a to b (b excluded) only in the columns c to d (d excluded) df.iloc[a:b, c:d]
|
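The position-based lookups can be sketched as follows, assuming a made-up four-row dataframe. Positions start at 0, and the end of each slice is excluded, as with ordinary Python slicing.

```python
import pandas as pd

# Hypothetical sample data with the default integer index.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [10, 20, 30, 40],
    "c": [100, 200, 300, 400],
})

row = df.iloc[1]           # second row, returned as a Series
cell = df.iloc[1, 2]       # second row, third column
rows = df.iloc[1:3]        # rows at positions 1 and 2
block = df.iloc[1:3, 0:2]  # the same rows, restricted to columns 'a' and 'b'

print(row.tolist(), cell)
print(block)
```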
Indexing with loc
Show all rows whose index label is <ind>
df.loc[<ind>]
|
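Unlike `iloc`, `loc` looks rows up by index label rather than by position. A minimal sketch, assuming a hypothetical dataframe whose index contains a repeated label:

```python
import pandas as pd

# Hypothetical sample data; the label 'Oslo' appears twice in the index.
df = pd.DataFrame(
    {"temp": [3, 18, 5]},
    index=["Oslo", "Rome", "Oslo"],
)

rome = df.loc["Rome"]  # a label that occurs once returns a Series
oslo = df.loc["Oslo"]  # a repeated label returns a DataFrame with every match

print(rome)
print(oslo)
```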
Manipulating Dataframes
Create a copy of the dataframe new_df = df.copy()
|
Set 'column_name' as the index df.set_index('column_name', inplace=True)
|
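One reason to copy before modifying is to keep the original dataframe intact. The sketch below uses hypothetical 'id' and 'value' columns to show that changes to the copy do not reach the original.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"id": ["a", "b"], "value": [1, 2]})

new_df = df.copy()                     # independent copy of the data
new_df.set_index("id", inplace=True)   # 'id' moves from the columns to the index
new_df.loc["a", "value"] = 99          # only the copy is modified

print(df)
print(new_df)
```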
Output
Output to csv file df.to_csv('output.csv')
|
Output to json file df.to_json('output.json')
|
Output to html file df.to_html('output.html')
|
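A brief sketch of the output methods on hypothetical data. When a path is given, the method writes a file; when `to_json` or `to_html` is called with no path, it returns the result as a string instead. A temporary directory is used here only so the example leaves no files behind.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"name": ["Ada", "Linus"], "age": [36, 54]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "output.csv"
    df.to_csv(csv_path, index=False)   # index=False omits the row labels
    csv_lines = csv_path.read_text().splitlines()

json_text = df.to_json()   # no path given: returns a JSON string
html_text = df.to_html()   # no path given: returns an HTML table string
parsed = json.loads(json_text)

print(csv_lines)
print(parsed)
```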