Multiple Conditional Filtering
mask1 = df['col1'] == 'value'
- Returns True/False boolean series
mask2 = df['col2'] <= 'value'
mask3 = df['col3'] >= 'value'
df[(mask1 & mask2) | mask3]
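A minimal sketch of combining masks with & (AND) and | (OR). The DataFrame, its column names (city, age, score) and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names and values are illustrative only.
df = pd.DataFrame({
    "city": ["Paris", "Lima", "Paris", "Oslo"],
    "age": [25, 40, 35, 60],
    "score": [70, 85, 90, 95],
})

mask1 = df["city"] == "Paris"   # boolean Series, one entry per row
mask2 = df["age"] <= 30
mask3 = df["score"] >= 95

# Rows where (city is Paris AND age <= 30) OR score >= 95.
result = df[(mask1 & mask2) | mask3]

# Written inline, each comparison needs its own parentheses,
# because & and | bind tighter than ==, <= and >=.
same = df[((df["city"] == "Paris") & (df["age"] <= 30)) | (df["score"] >= 95)]
```

Use & and | rather than the keywords and / or, which do not work element-wise on a Series.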
Inclusion Check
mask = df['col'].isin(['val1','val2','val3'])
- Check for inclusion using the isin() method.
- isin() is equivalent to three equality checks on the same column combined with | (OR):
mask1 = df['col'] == 'val1'
mask2 = df['col'] == 'val2'
mask3 = df['col'] == 'val3'
mask = mask1 | mask2 | mask3
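The equivalence between isin() and separate equality checks can be verified with a small sketch. The sample values here are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical sample column.
df = pd.DataFrame({"col": ["val1", "val4", "val3", "val2", "val5"]})

via_isin = df["col"].isin(["val1", "val2", "val3"])

# The same check written as three equality tests joined with OR.
via_or = (df["col"] == "val1") | (df["col"] == "val2") | (df["col"] == "val3")

# ~ negates a mask: rows whose value is NOT in the list.
excluded = df[~via_isin]
```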
For NULL values
mask = df['col'].isnull()
- Returns True/False boolean series. True where the value is missing (NaN/None).
mask = df['col'].notnull()
- Returns True/False boolean series. True where the value is present (the inverse of isnull()).
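A short sketch of the two null checks on a made-up column. The data is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two missing entries; None is stored as NaN here.
df = pd.DataFrame({"col": [1.0, np.nan, 3.0, None]})

null_mask = df["col"].isnull()      # True where the value is missing
present_mask = df["col"].notnull()  # True where the value is present

# Keep only the rows that have a value in 'col'.
rows_with_values = df[present_mask]
```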
Inclusion Check within a range
mask = df['col'].between(val1, val2)
- Returns True/False boolean series. True for values within the range; both val1 and val2 are included by default.
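A minimal sketch showing that between() with default settings matches an explicit >= / <= pair. The numbers are illustrative assumptions.

```python
import pandas as pd

# Hypothetical numeric column.
df = pd.DataFrame({"col": [5, 10, 15, 20, 25]})

mask = df["col"].between(10, 20)  # endpoints 10 and 20 are included by default

# The same range check written as two comparisons.
explicit = (df["col"] >= 10) & (df["col"] <= 20)

in_range = df[mask]
```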
Duplicate Values
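mask = df['col'].duplicated(keep='first')  # keep: 'first' (default), 'last', or False
- Returns boolean series, True for duplicates. 'first' leaves the first occurrence unmarked, 'last' leaves the last occurrence unmarked, False marks every occurrence.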
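A small sketch of how the three keep options mark the same column differently. The sample values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical column: 'a' appears three times, 'b' once.
df = pd.DataFrame({"col": ["a", "a", "b", "a"]})

first = df["col"].duplicated(keep="first")  # first 'a' is not marked
last = df["col"].duplicated(keep="last")    # last 'a' is not marked
every = df["col"].duplicated(keep=False)    # all repeated values are marked

# Rows whose value occurs exactly once in the column.
unique_rows = df[~every]
```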
df.drop_duplicates()
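- Returns a new df with duplicate rows removed; the original df is unchanged unless inplace=True is passed. Applied to the whole df, a row is dropped only if every column matches an earlier row.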
df.drop_duplicates(subset = ['col1','col2'])
- Drops a row if its combination of col1 and col2 matches an earlier row. Other columns are ignored.
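The difference between whole-row and subset de-duplication can be sketched as follows. The col3 column and all values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 are fully identical;
# row 3 matches them on col1 and col2 only.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "a"],
    "col2": [1, 1, 2, 1],
    "col3": ["x", "x", "y", "z"],
})

# Whole-row comparison: only row 1 duplicates an earlier row.
full = df.drop_duplicates()

# Compare on col1 and col2 only: rows 1 and 3 both repeat row 0.
by_subset = df.drop_duplicates(subset=["col1", "col2"])

# drop_duplicates returns a new DataFrame; df still has all four rows.
original_length = len(df)
```

The surviving rows keep their original index labels; call reset_index(drop=True) on the result if a fresh 0..n-1 index is wanted.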