| List Comprehension
                        
                                    
                        | List comprehension offers a shorter syntax when you want to create a new list based on the values of an existing list. 
Example:
 
Based on a list of fruits, you want a new list, containing only the fruits with the letter "a" in the name.
 
Without list comprehension you will have to write a for statement with a conditional test inside:
 fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
 newlist = []
 for x in fruits:
   if "a" in x:
     newlist.append(x)
 print(newlist)
 
With list comprehension you can do all that with only one line of code:
 fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
 newlist = [x for x in fruits if "a" in x]
 print(newlist)
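The expression before `for` does not have to be the item itself. As a small sketch (the variable name `upper_a` is only illustrative), the same filter can be combined with a transformation of each kept item:

```python
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

# Keep only fruits containing "a", and upper-case each one that is kept
upper_a = [x.upper() for x in fruits if "a" in x]
print(upper_a)
```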
 |  Imputation
                        
                                    
                        | In statistics, imputation is the process of replacing missing data with substituted values.  
When substituting for a data point, it is known as "unit imputation";  
when substituting for a component of a data point, it is known as "item imputation". 
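As a minimal sketch of item imputation in pandas, the example below fills a missing value in one column with that column's mean. The DataFrame, its column names, and the choice of mean imputation are illustrative assumptions; other strategies (median, a constant, forward fill) may suit the data better.

```python
import pandas as pd

# Hypothetical data: one age is missing (None becomes NaN in a float column)
df = pd.DataFrame({"name": ["tom", "nick", "juli"], "age": [10.0, None, 14.0]})

# mean() skips NaN by default, so this averages the two known ages
age_mean = df["age"].mean()

# Item imputation: replace the missing component with the column mean
df_imputed = df.assign(age=df["age"].fillna(age_mean))
print(df_imputed)
```

`assign` returns a new DataFrame, so the original `df` keeps its missing value for comparison.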
Pandas Imputation Article |  Aggregate Functions
                        
                                                                                    
 | sum() | Sums the values of an object |
 | count() | Counts the non-null values |
 | median() | Returns the mathematical median |
 | quantile([0.25, 0.75]) | Quantiles of an object |
 | min() | Lowest value in an object |
 | max() | Highest value in an object |
 | mean() | Returns the mathematical mean |
 | var() | Returns the mathematical variance |
 | std() | Returns the standard deviation |
 | df.groupby(by="col") | Groups data by the values of the specified column (similar to SQL) |
 | pd.merge(adf, bdf, how='left', on='col') | Merges two DataFrames into one based on a common column |
 Aggregate functions are a way of summarizing or reshaping data. |  Shape of a Dataframe
                        
                                    
| Return a tuple representing the dimensionality of the DataFrame.
 >>> df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
 >>> df.shape
 (2, 2)
 |  Mean
                        
                                    
| Return the mean of the values over the requested axis.
 DataFrame.mean(axis=None, skipna=None, level=None, numeric_only=None)
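A short sketch of what "over the requested axis" means in practice, using a small made-up DataFrame: `axis=0` gives one mean per column, `axis=1` gives one mean per row.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

col_means = df.mean(axis=0)  # one value per column
row_means = df.mean(axis=1)  # one value per row

print(col_means)
print(row_means)
```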
 |  Median
                        
                                    
| Return the median of the values over the requested axis: the middle value of the sorted data, or the average of the two middle values when the count is even.
 DataFrame.median(axis=None, skipna=None, level=None, numeric_only=None)
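A small sketch of how the median behaves when each column holds an even number of values: there is no single middle value, so pandas averages the two middle ones. The column names and numbers are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [7, 1, 3, 5],        # sorted: 1, 3, 5, 7 -> middle pair is 3 and 5
    "b": [10, 40, 20, 30],    # sorted: 10, 20, 30, 40 -> middle pair is 20 and 30
})

# With the default axis, the median is computed separately for each column
medians = df.median()
print(medians)
```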
 |  Creating a Dataframe from Scratch
                        
                                    
                        | # Import pandas library
import pandas as pd
  
# initialize list of lists
data = [['tom', 10], ['nick', 15], ['juli', 14]]
  
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Age'])
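A DataFrame can also be built by hand from a dictionary that maps column names to lists of values. The sketch below reuses the same made-up names and ages and checks that both approaches give the same result; the variable names are illustrative.

```python
import pandas as pd

# Row-oriented: a list of rows plus explicit column names
df_from_rows = pd.DataFrame(
    [["tom", 10], ["nick", 15], ["juli", 14]], columns=["Name", "Age"]
)

# Column-oriented: one list of values per column
df_from_dict = pd.DataFrame({"Name": ["tom", "nick", "juli"], "Age": [10, 15, 14]})

print(df_from_dict.equals(df_from_rows))
```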
 |  "From scratch" means creating the data by hand. |  Categorical Variable
                        
                                    
| A categorical variable is data that is limited to a fixed set or range of values.
Categorical variables are best visualised using bar plots or balloon plots.
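As a rough sketch, pandas can store such data with the `category` dtype. The survey answers below are hypothetical; the per-category counts are the numbers a bar plot of this variable would display.

```python
import pandas as pd

# Hypothetical answers limited to three possible sizes
sizes = pd.Series(["small", "large", "medium", "small", "small"], dtype="category")

# The fixed set of allowed values
categories = sizes.cat.categories.tolist()

# How often each category occurs
counts = sizes.value_counts()

print(categories)
print(counts)
```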
Example Article |  Quartiles vs Quantiles
                        
                                    
| Quartiles split the data at the 25th, 50th and 75th percentiles,
 whereas quantiles can be taken at any custom percentile.
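A minimal sketch of the difference using `quantile()` on made-up values: passing 0.25, 0.5 and 0.75 yields the quartiles, while any other fraction (0.9 here) is a custom quantile. pandas interpolates linearly between data points by default.

```python
import pandas as pd

values = pd.Series([1, 2, 3, 4, 5])

# Quartiles are the quantiles at 25%, 50% and 75%
quartiles = values.quantile([0.25, 0.5, 0.75])

# A custom quantile: the 90th percentile
p90 = values.quantile(0.9)

print(quartiles)
print(p90)
```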
 |  Correlation
                        
                                    
| Correlation describes how strongly two variables are related.
 Example:
 If the square footage of an apartment increases, the price of the apartment tends to increase as well.
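The apartment example can be sketched with `Series.corr`, which computes the Pearson correlation by default. The sizes and prices are invented for illustration; a result close to 1 indicates a strong positive relationship.

```python
import pandas as pd

# Hypothetical apartments: price generally rises with size
df = pd.DataFrame({
    "sqft": [500, 750, 1000, 1250],
    "price": [100000, 160000, 190000, 260000],
})

# Pearson correlation coefficient between the two columns
r = df["sqft"].corr(df["price"])
print(r)
```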
 |  Scatterplot
                        
                                    
| A scatterplot plots pairs of values as points on an x-y grid |  Histogram
                        
                                    
| A histogram groups data into bins along an axis, with the count in each bin represented by the bar height |  | 
            
More Cheat Sheets by CodingJinxx