\documentclass[10pt,a4paper]{article}

% Packages
\usepackage{fancyhdr}           % For header and footer
\usepackage{multicol}           % Allows multicols in tables
\usepackage{tabularx}           % Intelligent column widths
\usepackage{tabulary}           % Used in header and footer
\usepackage{hhline}             % Border under tables
\usepackage{graphicx}           % For images
\usepackage{xcolor}             % For hex colours
%\usepackage[utf8x]{inputenc}   % For unicode character support
\usepackage[T1]{fontenc}        % Without this we get weird character replacements
\usepackage{colortbl}           % For coloured tables
\usepackage{setspace}           % For line height
\usepackage{lastpage}           % Needed for total page number
\usepackage{seqsplit}           % Splits long words.
%\usepackage{opensans}          % Can't make this work so far. Shame. Would be lovely.
\usepackage[normalem]{ulem}     % For underlining links
% Most of the following are not required for the majority
% of cheat sheets but are needed for some symbol support.
\usepackage{amsmath}            % Symbols
\usepackage{MnSymbol}           % Symbols
\usepackage{wasysym}            % Symbols
%\usepackage[english,german,french,spanish,italian]{babel} % Languages

% Document Info
\author{sree017}
\pdfinfo{
  /Title (iqs.pdf)
  /Creator (Cheatography)
  /Author (sree017)
  /Subject (IQs Cheat Sheet)
}

% Lengths and widths
\addtolength{\textwidth}{6cm}
\addtolength{\textheight}{-1cm}
\addtolength{\hoffset}{-3cm}
\addtolength{\voffset}{-2cm}
\setlength{\tabcolsep}{0.2cm}  % Space between columns
\setlength{\headsep}{-12pt}    % Reduce space between header and content
\setlength{\headheight}{85pt}  % If less, LaTeX automatically increases it
\renewcommand{\footrulewidth}{0pt} % Remove footer line
\renewcommand{\headrulewidth}{0pt} % Remove header line
\renewcommand{\seqinsert}{\ifmmode\allowbreak\else\-\fi} % Hyphens in seqsplit
% These two commands together give roughly
% the right line height in the tables
\renewcommand{\arraystretch}{1.3}
\onehalfspacing

% Commands
\newcommand{\SetRowColor}[1]{\noalign{\gdef\RowColorName{#1}}\rowcolor{\RowColorName}} % Shortcut for row colour
\newcommand{\mymulticolumn}[3]{\multicolumn{#1}{>{\columncolor{\RowColorName}}#2}{#3}} % For coloured multi-cols
\newcolumntype{x}[1]{>{\raggedright}p{#1}} % New column types for ragged-right paragraph columns
\newcommand{\tn}{\tabularnewline} % Required as custom column type in use

% Font and Colours
\definecolor{HeadBackground}{HTML}{333333}
\definecolor{FootBackground}{HTML}{666666}
\definecolor{TextColor}{HTML}{333333}
\definecolor{DarkBackground}{HTML}{A3A3A3}
\definecolor{LightBackground}{HTML}{F3F3F3}
\renewcommand{\familydefault}{\sfdefault}
\color{TextColor}

% Header and Footer
\pagestyle{fancy}
\fancyhead{} % Set header to blank
\fancyfoot{} % Set footer to blank
\fancyhead[L]{
\noindent \begin{multicols}{3}
\begin{tabulary}{5.8cm}{C}
    \SetRowColor{DarkBackground}
    \vspace{-7pt}
    {\parbox{\dimexpr\textwidth-2\fboxsep\relax}{\noindent
        \hspace*{-6pt}\includegraphics[width=5.8cm]{/web/www.cheatography.com/public/images/cheatography_logo.pdf}}
    }
\end{tabulary}
\columnbreak
\begin{tabulary}{11cm}{L}
    \vspace{-2pt}\large{\bf{\textcolor{DarkBackground}{\textrm{IQs Cheat Sheet}}}} \\
    \normalsize{by \textcolor{DarkBackground}{sree017} via \textcolor{DarkBackground}{\uline{cheatography.com/126402/cs/24610/}}}
\end{tabulary}
\end{multicols}}
\fancyfoot[L]{
\footnotesize
\noindent \begin{multicols}{3}
\begin{tabulary}{5.8cm}{LL}
  \SetRowColor{FootBackground}
  \mymulticolumn{2}{p{5.377cm}}{\bf\textcolor{white}{Cheatographer}} \\
  \vspace{-2pt}sree017 \\
  \uline{cheatography.com/sree017} \\
\end{tabulary}
\vfill
\columnbreak
\begin{tabulary}{5.8cm}{L}
  \SetRowColor{FootBackground}
  \mymulticolumn{1}{p{5.377cm}}{\bf\textcolor{white}{Cheat Sheet}} \\
  \vspace{-2pt}Published 3rd October, 2020.\\
  Updated 3rd October, 2020.\\
  Page {\thepage} of \pageref{LastPage}.
\end{tabulary}
\vfill
\columnbreak
\begin{tabulary}{5.8cm}{L}
  \SetRowColor{FootBackground}
  \mymulticolumn{1}{p{5.377cm}}{\bf\textcolor{white}{Sponsor}} \\
  \SetRowColor{white}
  \vspace{-5pt}
  %\includegraphics[width=48px,height=48px]{dave.jpeg}
  Measure your website readability!\\
  www.readability-score.com
\end{tabulary}
\end{multicols}}

\begin{document}
\raggedright
\raggedcolumns

% Set font size to small. Switch to any value
% from this page to resize cheat sheet text:
% www.emerson.emory.edu/services/latex/latex_169.html
\footnotesize % Small font.

\begin{multicols*}{3}

\begin{tabularx}{5.377cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{5.377cm}}{\bf\textcolor{white}{AI ML DS DL}} \tn
\SetRowColor{LightBackground}
\mymulticolumn{1}{x{5.377cm}}{AI: Programs that can sense, reason, act and adapt. \newline General AI - planning, decision making, identifying objects, recognising sounds, social \& business transactions. \newline Applied AI - e.g., driverless/autonomous cars or machines that trade stocks smartly. \newline \newline ML: Instead of engineers "teaching" or programming computers with everything they need to carry out tasks, computers teach themselves - they learn something without being explicitly programmed to do so. ML is a form of AI in which systems, given more data, can change their actions and responses, making them more efficient, adaptable and scalable, e.g., navigation apps and recommendation engines. \newline \newline DS: Data science has many tools, techniques and algorithms drawn from these fields, plus others, to handle big data. The goal of data science, somewhat similar to machine learning, is to make accurate predictions and to automate and perform transactions in real time, such as purchasing internet traffic or automatically generating content. \newline \newline DL: A ML technique that uses large neural networks; it teaches computers to do what comes naturally to humans.} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

\begin{tabularx}{5.377cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{5.377cm}}{\bf\textcolor{white}{SL USL RL}} \tn
\SetRowColor{LightBackground}
\mymulticolumn{1}{x{5.377cm}}{Supervised learning: \newline In a supervised learning model, the algorithm learns on a labeled dataset in order to generate reasonable predictions for the response to new data (forecasting the outcome of new data). \newline • Regression \newline • Classification \newline \newline Unsupervised learning: \newline An unsupervised model, in contrast, works on unlabeled data that the algorithm tries to make sense of by extracting features, co-occurrences and underlying patterns on its own. We use unsupervised learning for \newline • Clustering \newline • Anomaly detection \newline • Association \newline • Autoencoders \newline \newline Reinforcement learning: \newline Reinforcement learning is less supervised and depends on the learning agent to determine the output solutions by arriving at different possible ways to achieve the best possible solution.
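\newline \newline A minimal sketch contrasting the supervised and unsupervised settings above (illustrative only; assumes scikit-learn and its bundled iris dataset): \newline from sklearn.datasets import load\_iris \newline from sklearn.svm import SVC \newline from sklearn.cluster import KMeans \newline X, y = load\_iris(return\_X\_y=True) \newline clf = SVC().fit(X, y)  \# supervised: needs features X and labels y \newline km = KMeans(n\_clusters=3, n\_init=10).fit(X)  \# unsupervised: features only \newline print(clf.predict(X[:3]), km.labels\_[:3])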
\newline \newline Architecture of ML: \newline Business understanding: understand the given use case; it is also good to know more about the domain for which the use case is built. \newline Data acquisition and understanding: gather data from different sources and understand the data - cleaning the data, handling missing data if any, data wrangling and EDA (exploratory data analysis). \newline Modeling: \newline Feature engineering - scaling the data; feature selection - not all features are important, so we use the backward elimination method, correlation factors, PCA and domain knowledge to select features. \newline Model training - based on trial and error or on experience, we select an algorithm and train it with the selected features. \newline Model evaluation - accuracy of the model, confusion matrix and cross-validation. If accuracy is not high, we tune the model to achieve higher accuracy, either by changing the algorithm, by reselecting features or by gathering more data, etc. \newline Deployment - once the model has good accuracy, we deploy it and then monitor its performance. If performance is good, we go live with the model; otherwise we reiterate the whole process until performance is good. Even then it is not done: if, after a few days, the model performs badly because of new data, we run the whole process again, collecting new data, and redeploy the model.} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

\begin{tabularx}{5.377cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{5.377cm}}{\bf\textcolor{white}{Algos}} \tn
\SetRowColor{LightBackground}
\mymulticolumn{1}{x{5.377cm}}{Linear Regression: \newline Linear regression establishes a relationship between a dependent variable (Y) and one or more independent variables (X) by finding the best fit of a straight line. \newline The equation for the linear model is Y = mX + c, where m is the slope and c is the intercept. \newline \newline OLS Stats Model (Ordinary Least Squares): \newline OLS is a stats model which helps us identify the more significant features that have an influence on the output. An OLS model in Python (statsmodels) is executed as: \newline import statsmodels.formula.api as smf \newline lm = smf.ols(formula='Sales \textasciitilde{} am + constant', data=data).fit() \newline lm.conf\_int() \newline lm.summary() \newline \newline What is Mean Squared Error? \newline The mean squared error tells you how close a regression line is to a set of points. It does this by taking the distances from the points to the regression line (these distances are the "errors") and squaring them. \newline \newline Why Support Vector Regression? What is the difference between SVR and a simple regression model? \newline In simple linear regression, we try to minimise the error rate, whereas in SVR we try to fit the error within a certain threshold. \newline \newline Logistic Regression: \newline The logistic regression technique involves a dependent variable that can be represented as binary (0 or 1, true or false, yes or no) values, meaning the outcome can only take one of two forms. For example, it can be used when we need to find the probability of an event succeeding or failing.
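\newline For instance, a minimal logistic regression sketch (illustrative; assumes scikit-learn and its bundled binary-outcome breast cancer dataset): \newline from sklearn.datasets import load\_breast\_cancer \newline from sklearn.linear\_model import LogisticRegression \newline X, y = load\_breast\_cancer(return\_X\_y=True)  \# labels are 0 or 1 \newline model = LogisticRegression(max\_iter=5000).fit(X, y) \newline print(model.predict\_proba(X[:2]))  \# probability of each of the two outcomes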
\newline \newline Decision Tree: \newline A decision tree is a type of supervised learning algorithm that can be used for classification as well as regression problems. The input to a decision tree can be both continuous and categorical. A decision tree works on if-then statements and tries to solve a problem by using a tree representation (nodes and leaves). \newline Assumptions while creating a decision tree: 1) Initially, the whole training set is considered as the root. 2) Feature values are preferred to be categorical; if continuous, they are discretised. 3) Records are distributed recursively on the basis of attribute values. 4) Which attributes to place at the root or at internal nodes is decided using a statistical approach. \newline \newline How to handle a decision tree for numerical and categorical data? \newline If the feature is categorical, the split is done with the elements belonging to a particular class. \newline If the feature is continuous, the split is done with the elements higher than a threshold. \newline \newline Random Forest: \newline Random forest is an ensemble machine learning algorithm that follows the bagging technique. The base estimators in a random forest are decision trees. A random forest randomly selects a set of features that are used to decide the best split at each node of the decision tree. \newline \newline Variance and bias tradeoff: \newline Bias: the difference between the expected or average prediction of the model and the correct value we are trying to predict. Imagine we build several models by collecting different datasets and then evaluate their predictions: we may end up with different predictions from each model. Bias measures how far these model predictions are from the correct prediction; high bias always leads to high error on training and test data. \newline \newline Variance: the variability of a model prediction for a given data point. We can build the model multiple times, so the variance is how much the predictions for a given point vary between different realisations of the model. \newline \newline High bias, low variance - underfitting \newline High variance, low bias - overfitting \newline \newline A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known. It allows the visualisation of the performance of an algorithm. \newline \newline True Positive Rate: \newline Sensitivity (SN) is calculated as the number of correct positive predictions divided by the total number of positives. It is also called recall (REC) or true positive rate (TPR). The best sensitivity is 1.0, whereas the worst is 0.0. \newline \newline True Negative Rate: \newline Specificity (SP) is calculated as the number of correct negative predictions divided by the total number of negatives. It is also called the true negative rate (TNR). The best specificity is 1.0, whereas the worst is 0.0. \newline \newline KNN means K-Nearest Neighbours algorithm. It can be used for both classification and regression. It is also called instance-based or memory-based learning.} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

\begin{tabularx}{5.377cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{5.377cm}}{\bf\textcolor{white}{DL}} \tn
\SetRowColor{LightBackground}
\mymulticolumn{1}{x{5.377cm}}{What is a perceptron and how is it related to human neurons? \newline If we focus on the structure of a biological neuron, it has dendrites, which are used to receive inputs. These inputs are summed in the cell body and, via the axon, passed on to the next biological neuron. A perceptron is the mathematical analogue: it receives weighted inputs, sums them and passes the sum through an activation function to produce an output for the next layer.
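\newline A toy sketch of this analogy (illustrative NumPy only): the inputs play the role of the dendrites, the weighted sum the cell body, and a step activation decides whether the "axon" fires: \newline import numpy as np \newline def step(z): return 1 if z \textgreater{} 0 else 0  \# activation: fire or not \newline def perceptron(x, w, b): return step(np.dot(w, x) + b)  \# weighted sum, then activation \newline print(perceptron(np.array([1.0, 0.5]), np.array([0.4, 0.6]), -0.5))  \# prints 1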
\newline \newline What kind of problems can be solved by using deep learning? \newline Image recognition, object detection, natural language processing (translation, sentence formation, text to speech, speech to text) and understanding the semantics of actions. \newline \newline \newline Forward propagation: The inputs are provided with weights to the hidden layer. At each hidden layer, we calculate the output of the activation at each node, and this propagates further to the next layer until the final output layer is reached. Since we move from the inputs towards the final output layer, it is called forward propagation. \newline \newline Backpropagation: We minimise the cost function by understanding how it changes as the weights and biases in the neural network change. This change is obtained by calculating the gradient at each hidden layer (using the chain rule). Since we start from the final cost function and go back through each hidden layer, we move backwards, and thus it is called backpropagation. \newline • Backpropagation is fast, simple and easy to program. \newline • It has no parameters to tune apart from the number of inputs. \newline • It is a flexible method, as it does not require prior knowledge about the network. \newline • It is a standard method that generally works well. \newline • It does not need any special mention of the features of the function to be learned. \newline \newline Epoch - In the context of training a model, an epoch is one iteration in which the model sees the whole training set to update its weights. \newline \newline Regularization \newline • Dropout - Dropout is a technique used in neural networks to prevent overfitting the training data by dropping out neurons with probability p \textgreater{} 0. It forces the model to avoid relying too much on particular sets of features. \newline Remark: most deep learning frameworks parametrise dropout through the 'keep' parameter 1−p. \newline • Weight regularization - To make sure that the weights are not too large and that the model is not overfitting the training set, regularization techniques are usually performed on the model weights; the main ones are L1 (lasso) and L2 (ridge) regularization. \newline \newline \newline Hyperparameter tuning in deep learning: \newline Setting the hyperparameters requires expertise and extensive trial and error. There are no simple, easy ways to set hyperparameters - specifically learning rate, batch size, momentum and weight decay. \newline Approaches to searching for the best configuration: \newline • Grid search \newline • Random search \newline \newline Transfer learning - Training a deep learning model requires a lot of data and, more importantly, a lot of time. It is often useful to take advantage of weights pre-trained on huge datasets that took days/weeks to train and leverage them for our use case. Depending on how much data we have at hand, we can freeze all layers and train only the final layer, freeze most layers, or fine-tune the whole network. \newline \newline Learning rate - The learning rate, often noted α or sometimes η, indicates the pace at which the weights get updated. It can be fixed or adaptively changed. The currently most popular method is called Adam, which adapts the learning rate. \newline \newline Adaptive learning rates - Letting the learning rate vary while training a model can reduce the training time and improve the numerical solution. While the Adam optimizer is the most commonly used technique, others such as Momentum and RMSprop can also be useful.
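\newline A toy sketch of how the learning rate drives each weight update (plain Python; minimising $f(w)=(w-3)^2$, whose gradient is $2(w-3)$): \newline w, alpha = 0.0, 0.1  \# initial weight and learning rate \newline for epoch in range(100): w = w - alpha * 2 * (w - 3)  \# w := w - alpha * dL/dw \newline print(w)  \# approaches 3; too large an alpha overshoots, too small converges slowly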
} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

\begin{tabularx}{5.377cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{5.377cm}}{\bf\textcolor{white}{QS}} \tn
\SetRowColor{LightBackground}
\mymulticolumn{1}{x{5.377cm}}{Why can we not use a multilayer perceptron for text data? \newline A multilayer perceptron assumes that the input is of constant size. In natural language, the length of sentences can vary, so a multilayer perceptron is not applicable. \newline \newline Can we use a multilayer perceptron for regression purposes? How? \newline A multilayer perceptron can be used for a regression problem: remove the activation function at the output node and use a suitable cost function. \newline \newline What is the difference between Stochastic Gradient Descent and Batch Gradient Descent? \newline Stochastic gradient descent uses only one instance to compute the loss and update the parameters. As a result, it converges faster but often yields a sub-optimal solution. \newline Batch gradient descent, on the other hand, uses the whole dataset to compute the loss and update the parameters. As a result, it converges at the slowest rate but guarantees an almost optimal solution. \newline \newline What is the significance of 1 x 1 convolutions? \newline 1x1 convolutions are primarily used as a dimensionality reduction technique: they vary the number of filters in the convolution layers and can be used to either increase or decrease that number. \newline \newline What are object detection and object localisation? \newline Object detection is a process in which the model predicts whether an object is present in the image or not. \newline In object localisation, the model outputs the coordinate values of where the object sits within the image, given that the object is present in the image. \newline \newline Explain the concept of a dead unit. \newline A dead unit in a deep neural network is a neuron that is experiencing the vanishing gradient problem and covariate shift. In this state, the neuron learns extremely slowly or apparently stops learning altogether. \newline \newline What do you mean by learning rate? \newline The learning rate is the factor by which the weights of a neural network are updated in each cycle. \newline \newline What is covariate shift and how is the problem solved? \newline Covariate shift is a condition in which the neurons in a deep neural network stop learning or learn extremely slowly due to the vanishing gradient. It can be solved by (a combination of) the following: \newline • Batch normalisation (see the sketch below) \newline • Careful initialisation \newline • A slow learning rate \newline • Dropout
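\newline A minimal batch normalisation sketch (illustrative NumPy; the learnable scale gamma and shift beta are omitted for brevity): \newline import numpy as np \newline def batch\_norm(x, eps=1e-5): return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)  \# normalise each feature over the mini-batch \newline print(batch\_norm(np.array([[1.0, 2.0], [3.0, 4.0]])))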
\newline \newline While training a neural network, you notice that the loss does not decrease in the first few epochs. What could be the possible reasons? \newline There are four possible scenarios: \newline • The learning rate is very low \newline • The regularisation parameter is very high \newline • Optimisation is stuck in a local optimum \newline • Optimisation started on a plateau \newline \newline Dropout and DropConnect are both regularisation techniques for neural networks. Is there a difference between the two? How is setting dropout = 0.3 different from DropConnect = 0.3? \newline Dropout in a layer assigns a probability p (0.3 in the question) to every node in that layer, such that the node is not included in the computation during runtime with probability p. \newline \newline DropConnect in a layer instead assigns a probability p to every connection leaving the nodes of that layer, such that each connection to the consecutive layer is skipped with probability p while the node itself remains active. \newline \newline What are the factors in selecting the depth of a neural network? \newline A. Type of neural network (e.g. MLP, CNN, etc.) \newline B. Input data \newline C. Computation power, i.e. hardware and software capabilities \newline D. Learning rate \newline E. The output function to map \newline \newline What are the problems with deep networks? \newline In the case of very deep neural networks, the weights of the hidden layers may experience either the vanishing or the exploding gradient problem. Deep networks are also prone to "overfitting" the data, they are difficult to train, and they take a very long time to train. \newline \newline How are weights initialised in a neural network? \newline Weights are usually initialised to small random values, but that can lead to vanishing or exploding gradients in the case of a deep neural network. Therefore, it is good practice to initialise weights using "He initialisation" or "Xavier initialisation".} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

\begin{tabularx}{5.377cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{5.377cm}}{\bf\textcolor{white}{Tensorflow}} \tn
\SetRowColor{LightBackground}
\mymulticolumn{1}{x{5.377cm}}{At a high level, TensorFlow is a Python library that allows users to express arbitrary computation as a graph of data flows. Nodes in this graph represent mathematical operations, whereas edges represent data that is communicated from one node to another. Data in TensorFlow are represented as tensors, which are multidimensional arrays. Although this framework for thinking about computation is valuable in many different fields, TensorFlow is primarily used for deep learning in practice and research. \newline \newline How do you write code to start a session for training? \newline with tf.Session() as sess:
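\newline A fuller minimal sketch (assumes the TensorFlow 1.x API, where sessions exist; in TensorFlow 2.x, eager execution replaced them and tf.compat.v1.Session remains for compatibility): \newline import tensorflow as tf \newline a = tf.constant(2.0) \newline b = tf.constant(3.0) \newline with tf.Session() as sess: print(sess.run(a + b))  \# runs the graph and prints 5.0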
} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

\begin{tabularx}{5.377cm}{X}
\SetRowColor{DarkBackground}
\mymulticolumn{1}{x{5.377cm}}{\bf\textcolor{white}{Image Segmentation}} \tn
\SetRowColor{LightBackground}
\mymulticolumn{1}{x{5.377cm}}{Image segmentation is a further extension of object detection in which we mark the presence of an object through pixel-wise masks generated for each object in the image. This technique is more granular than bounding-box generation because it helps us determine the shape of each object present in the image. This granularity helps in various fields, such as medical image processing and satellite imaging. Many image segmentation approaches have been proposed recently; one of the most popular is Mask R-CNN. \newline \newline Instance segmentation: identifying the boundaries of each object and labelling its pixels with a different color. \newline Semantic segmentation: labelling every pixel in the image (including the background) with a color based on its category or class label.} \tn
\hhline{>{\arrayrulecolor{DarkBackground}}-}
\end{tabularx}
\par\addvspace{1.3em}

% That's all folks
\end{multicols*}
\end{document}