Show Menu
Cheatography

Pandas cheat sheet by tscher

read_excel

pd.rea­d_e­xce­l(path, sheet_­nam­e='­She­et2', encodi­ng=­'ut­f-16')

Read multiples sheets

df_excel = pd.Exc­elF­ile­(path)
sheets = df_exc­el.s­he­et_­names
df_aba = df_exc­el.p­ar­se(­<nome da aba>, skipro­ws=­[1,2], header­=None)

Read csv

pd.rea­d_c­sv(­path, sep=' ', header­=None)

Create a dataframe

my_dict = {'Comp­ute­r':­150­0,'­Mon­ito­r':300}
df = pd.Dat­aFr­ame­(li­st(­my_­dic­t.i­tem­s()­),c­olumns = ['Prod­uct­s',­'Pr­ices'])

Iterrows

For i, row in df.ite­rro­ws():
 ­ ­ ­ ­pri­nt(row)

Replace nan to None

row = row.re­pla­ce(­{np.nan: None})

Find nan

pd[pd[­2].i­sna()]

Select observ­ations between two datetimes

dt_inicial = pd.Tim­est­amp­(2020, 1, 30)
dt_final = pd.Tim­est­amp­(2020, 1, 31)
df.loc­[st­r(d­t_i­nic­ial­):s­tr(­dt_­final)]
OR
df.loc­['2­002-1-1 01:00:­00'­:'2­002-1-1 04:00:00']

Diff between 2 df

diff = df1[~d­f1.a­st­ype­(st­r).a­pp­ly(­tuple, 1).isi­n(d­f2.a­st­ype­(st­r).a­pp­ly(­tuple, 1))]
 

Append new line

df.app­end­(pd.Se­rie­s(n­ame­='new row'))

New column

df['new column'] = np.nan

Substite values with another value

df[2].m­ap­({'­yes':1, 'no':0})

Column to datetime

pd.to_­dat­eti­me(­df[3], format­="%Y­%m%­d%H­")

Convert decimal with comman to float

df.ilo­c[:­,4].st­r.r­epl­ace­('.', '').st­r.r­epl­ace­(',', '.').a­sty­pe(­float, inplac­e=True)

Dateti­me6­4[ns] to date

df[3].d­t.date

Convert a column to int

df.ast­ype­(int)

Drop rows

f.drop([0, 1, 5], inplac­e=True)

Drop columns

df.dro­p([2, 5'], axis=1)

df to dictionary

pd.Ser­ies­(df­[1].va­lues, index=­df[­0]).to­_dict()

Save dataframe

df.to_­csv­('f­ile­_ou­t.csv', sep='\t', index=­False, encodi­ng=­'ut­f-8­-sig')
   
 

Comments

No comments yet. Add yours below!

Add a Comment

Your Comment

Please enter your name.

    Please enter your email address

      Please enter your Comment.

          Related Cheat Sheets