Cheatography https://cheatography.com

Download This Cheat Sheet (PDF)

Comments
Rating: ()

Kaggle Data Science Cheat Sheet (DRAFT) by blueberryshortcake

for kaggle data science basics

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Creating, Reading, Writing

df = pd.DataFrame({"col0": [val0, val1], "col1": [val0, val1]}, index=[0, 1])	Create a dataframe
series = pd.Series(["val0", "val1", "val2"], index=[0, 1, 2, 3], name="name")	Create a series
read = pd.read_csv("../folder/folder/file.csv", index_col=0)	Read a csv
save.to_csv("file.csv")	Save an existing dataframe as a csv

Indexing, Selecting, Assigning

table.head	Show first 5 rows of a dataframe
table["col"]	Select the col from table
table.col.iloc[0]	Select 1^st value of a col from table
table.iloc[0]	Select 1^st row of data from table
table.col.iloc[:10]	Select 1^st 10 values from col in table (index-based select)
table.col.loc[:10]	Select 1^st 10 values from col in table (label-based select)
table.loc[indices, cols]	Select certain rows from certain cols
table[table.col == 'val']	Select cols have a certain val (conditional select)
table.col.isin(['val1,' 'val2'])	Select cols have certain vals (conditional select)

Summary Functions & Maps

table.col.describe()	Get high-lvl summary of given col's attributes
table.col.mean()	Get mean of a col with numerical vals
table.col.unique()	Get each unique val of a col w/ no dupes
table.col.value_counts()	Get frequency of each val in col
table.col.map(lambda p: p - s)	Map function to remap a Series of point vals (p) by using a transformation (p-s) -> returns new Series
table.apply(func, axis='columns')	Apply function to transform entire df by calling custom method (func taking a row) on each row

Grouping & Sorting

table.groupby('col').col.count()	Group data w/ same vals in the given col -> count frequency of given col (same as value_counts())
table.groupby('col').size()	Same as above
table.groupby('col').apply(lambda df: df.title.iloc[0])	Select name (title) of the 1^st thing in col
table.col.idxmax()	Get index of max val in col
table.groupby(['col0']).col1.agg([f1, f2, f3])	agg() runs diff. funcs. simultaneously on a df
table.groupby(['col0', 'col1']).col2.agg([len])	Multi-index output has tiered structure. Require 2 levels of labels to retrieve a val
df.reset_index()	Muti-index method used to converting back to regular index
df.sort_values(by='col')	Sort rows of data by vals in col (ascending)
df.sort_values(by='col', ascending=False)	Sort rows of data by vals in col (descending)
df.sort_values(by=['col0', 'col1'])	Sort rows by more than 1 col at a time
df.sort_index()	Sort rows by index (default order; ascending)

Data Types & Missing Values

table.col.dtype	Get data type of a col
table.dtypes	Get data types of each col in table
table.col.astype('datatype')	Convert col to datatype if allowed (e,g, int64 -> float64)
table.index.dtype	Number indices are int64
table[pd.isnull(table.col)]	Select NaN entries in a col
table.col.fillna("filler")	Replace all NaN vals in a col with a sentinel val ("Unknown", "Undisclosed", "Invalid") or non-null val
table.col.replace("init_val", "new_val")	Replace, in col, all existing vals with new_vals

Renaming & Combining

table.rename(columns={'init': 'new'})	Rename col or index col names
table.rename(index={0: 'firstEntry', 1: 'secondEntry})	Rename index or col vals by specifying an index or col param
table.rename_axis("name", axis='rows').rename_axis("name1", axis='columns')	Rename row index &/or col index
pd.concat(list, of, els)	Smush together the list of elements along an axis
left.join(right, lsuffix='strL', rsuffix='strR')	Combine diff df objects that have an index in common. left and right are df.s defined beforehand

Download the Kaggle Data Science Cheat Sheet

2 Pages

PDF (recommended)

PDF (2 pages)

Alternative Downloads

Latest Cheat Sheet

7 Pages

(0)

Python Beginner to Advanced Cheat Sheet

A detailed Python cheat sheet covering beginner to advanced topics. Python is a popular programming language that can be used on a server to create web applications and this cheat sheet will cover all essential concepts.

3 Aug 25

python, programming, flask, leetcode, w3school, hackerrank

Random Cheat Sheet

1 Page

(6)

Awesome Window Manager 3.x Cheat Sheet

Default shortcuts for awesome window manager (http://awesome.naquadah.org). The list is copied from awesome man-page.

7 Apr 15, updated 10 May 16

linux, shortcuts, unix, awesome, wm, keybindings

About Cheatography

Cheatography is a collection of 6773 cheat sheets and quick references in 25 languages for everything from business to maths!

Behind the Scenes

If you have any problems, or just want to say hi, you can find us right here:

Recent Cheat Sheet Activity

musmankkh updated Python Beginner to Advanced.
1 day 20 hours ago

linux_china updated dotenvx.
6 days 4 hours ago

AnaPLopes updated Síndromes Glomerulares - Nefrologia.
1 week 2 days ago

Oems updated Git.
1 week 3 days ago

CROSSANT updated Trigonometric Properties and Identities.
1 week 4 days ago

© 2011 - 2025 Cheatography.com | CC License | Terms | Privacy

Latest Cheat Sheets RSS Feed