Cheatography

# Python For Data Science - NumPy Library Cheat Sheet (DRAFT) by padma-it


This is a draft cheat sheet. It is a work in progress and is not finished yet.

### NumPy 1D Arrays

| Code | Explanation & Output |
| --- | --- |
| `import numpy as np` | `np` is simply an alias for `numpy`. |
| `array_1d = np.array([2, 4, 5, 6, 7, 9])` | Creates a 1-D array from a list. `np.array()` takes a list or a tuple as its argument and converts it into an array. |
| `print(array_1d)` | `[2 4 5 6 7 9]` |
| `print(type(array_1d))` | `<class 'numpy.ndarray'>` |
| `np.array(iterable, dtype)` | General form; `dtype` is optional. |
| `np.array([1, 2, 3, 4], dtype='float32')` | Explicitly sets the data type: `array([1., 2., 3., 4.], dtype=float32)` |
| `array_from_list = np.array([2, 5, 6, 7])` | Converts a list to an array. |
| `array_from_tuple = np.array((4, 5, 8, 9))` | Converts a tuple to an array. `np.array(2, 5, 6, 7)` raises a `TypeError`, because the elements must be passed as a single sequence. |
| `list_1 = [3, 6, 7, 5]` | |
| `list_2 = [4, 5, 1, 7]` | |
| `array_1 = np.array(list_1)` | |
| `array_2 = np.array(list_2)` | Creates two 1-D arrays. |
| `array_3 = array_1 * array_2` | Multiplies each element of `array_1` by the corresponding element of `array_2`. Output: `[12 30  7 35]` |
| `array_4 = array_1 ** 2` | `[ 9 36 49 25]`; no loop required. |
| `np.array([3.14, 4, 2, 3])` | Unlike a Python list, a NumPy array holds elements of a single type. If the types do not match, NumPy upcasts where possible (here, int to float): `array([3.14, 4.  , 2.  , 3.  ])` |
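The "no loop required" point can be illustrated with a short comparison. This is a minimal sketch that reuses `list_1` and `list_2` from the table; the names `products_list`, `products_array` and `mixed` are illustrative and not part of the original sheet.

```python
import numpy as np

list_1 = [3, 6, 7, 5]
list_2 = [4, 5, 1, 7]

# Pure-Python version: an explicit loop (here a comprehension) is needed.
products_list = [a * b for a, b in zip(list_1, list_2)]

# NumPy version: the operator is applied element by element.
array_1 = np.array(list_1)
array_2 = np.array(list_2)
products_array = array_1 * array_2

print(products_list)    # [12, 30, 7, 35]
print(products_array)   # [12 30  7 35]

# Mixing ints and floats upcasts the whole array to float.
mixed = np.array([3.14, 4, 2, 3])
print(mixed.dtype)      # float64
```

Both versions give the same numbers; the array version simply pushes the loop into NumPy.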

### NumPy Multi-Dimensional Arrays

| Syntax and Concepts | Example Code | Explanation & Output |
| --- | --- | --- |
| In NumPy, dimensions are called axes. `axis=0` refers to the rows; `axis=1` refers to the columns. | `array_2d = np.array([[2, 3, 4], [5, 8, 7]])` | Creates a 2-D array from two lists. |
| | `print(array_2d)` | `[[2 3 4] [5 8 7]]` (one row per line). Note that, unlike lists, arrays print without commas. |
| `np.ones((row_count, column_count), dtype)`. Default dtype is float. | `np.ones((5, 3))` | Creates a 5 x 3 array of ones: `array([[1., 1., 1.], [1., 1., 1.], [1., 1., 1.], [1., 1., 1.], [1., 1., 1.]])` |
| | `np.ones((5, 3), dtype=int)` | `array([[1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1], [1, 1, 1]])` |
| `np.zeros(shape, dtype)`. Default dtype is float. | `np.zeros(4, dtype=int)` | Creates an array of zeros: `array([0, 0, 0, 0])`. The alias `np.int` was removed in NumPy 1.24; use `int` instead. |
| | `np.zeros((4, 2, 2), dtype=int)` | `array([[[0, 0], [0, 0]], [[0, 0], [0, 0]], [[0, 0], [0, 0]], [[0, 0], [0, 0]]])` |
| `np.random.random(shape)`. Returns floats in [0, 1). | `np.random.random([3, 4])` | Creates an array of random numbers (sample; values vary per run): `array([[9.53309987e-01, 7.61005241e-04, 4.11978739e-01, 4.54277232e-01], [6.87842577e-01, 9.02965509e-01, 5.32139081e-01, 5.41951709e-01], [8.58188784e-01, 1.11375267e-01, 8.11638970e-05, 7.04121020e-01]])` |
| `np.arange(start, stop, step, dtype)`. Similar to `range`; `stop` is excluded. If `dtype` is not given, it is inferred from the other arguments. | `numbers = np.arange(10, 100, 5)` | Creates an array with a fixed step size: `[10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95]` |
| `np.linspace(start, stop, number_of_elements, dtype)`. Default dtype is float; `stop` is included. | `np.linspace(10, 100, 19, dtype=int)` | Creates an array of fixed length, for when you know the length of the array but not the step size: `array([ 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100])` |
| `np.full(shape, fill_value)`. The dtype is inferred from the fill value. | `np.full((4, 3), 7)` | Creates a 4 x 3 array of 7s (int here, because 7 is an int): `array([[7, 7, 7], [7, 7, 7], [7, 7, 7], [7, 7, 7]])` |
| `np.tile(arr, repeat_count)`. Creates a new array by repeating the given array the given number of times. | `arr = [0, 1, 2]` | |
| | `np.tile(arr, 3)` | `array([0, 1, 2, 0, 1, 2, 0, 1, 2])` |
| | `np.tile(arr, (3, 2))` | `array([[0, 1, 2, 0, 1, 2], [0, 1, 2, 0, 1, 2], [0, 1, 2, 0, 1, 2]])` |
| `np.eye(n, dtype)`. Default dtype is float. | `np.eye(3, dtype=int)` | Creates a 3 x 3 identity matrix: `array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])` |
| `np.random.randint(start, stop, (dimensions))`. Integers only; `stop` is excluded. | `rand_array = np.random.randint(0, 10, (4, 4))` | Sample output (values vary per run): `array([[9, 3, 6, 0], [1, 5, 7, 5], [4, 2, 6, 4], [5, 3, 4, 6]])` |
| | `print(rand_array[1, ])` | Prints the second row: `[1 5 7 5]` |
| `np.empty(shape)` | `np.empty(3)` | Creates an uninitialized array of three floats (the default dtype is float64). `array([1., 1., 1.])` is only an example; the contents could be anything left in memory. |
| | `array_1d = np.arange(10)` | |
| | `print(array_1d)` | `[0 1 2 3 4 5 6 7 8 9]` |
| Iterating over a 1-D array is similar to iterating over a list. | `for i in array_1d: print(i**2)` | `0 1 4 9 16 25 36 49 64 81` (one value per line) |
| Iterating over a 2-D array goes along the first axis (rows); the second axis is the columns. | `for row in array_2d: print(row)` | `[2 3 4]` then `[5 8 7]` |
| 3-D array | `array_3d = np.arange(24).reshape(2, 3, 4)` | |
| | `print(array_3d)` | `[[[ 0 1 2 3] [ 4 5 6 7] [ 8 9 10 11]] [[12 13 14 15] [16 17 18 19] [20 21 22 23]]]` |
| Iterating over a 3-D array also goes along the first axis. | `for row in array_3d: print(row)` | Each iteration yields one 3 x 4 block: `[[ 0 1 2 3] [ 4 5 6 7] [ 8 9 10 11]]` then `[[12 13 14 15] [16 17 18 19] [20 21 22 23]]` |
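The difference between `np.arange` (fixed step, `stop` excluded) and `np.linspace` (fixed length, `stop` included), and the default dtypes of the constructors, can be checked directly. This is a small sketch; the variable names are illustrative, and the exact integer dtype printed for `np.full` depends on the platform.

```python
import numpy as np

by_step = np.arange(10, 100, 5)                  # stop (100) is excluded
by_count = np.linspace(10, 100, 19, dtype=int)   # stop (100) is included

print(by_step.size, by_step[-1])     # 18 95
print(by_count.size, by_count[-1])   # 19 100

# Default dtypes: ones/zeros/eye give floats, full follows its fill value.
ones_default = np.ones((5, 3))
full_int = np.full((4, 3), 7)
full_float = np.full((4, 3), 7.0)
print(ones_default.dtype)            # float64
print(full_int.dtype)                # an integer dtype (platform-dependent width)
print(full_float.dtype)              # float64

# Tiling with a tuple repeats along each axis: 3 times down, 2 times across.
tiled = np.tile([0, 1, 2], (3, 2))
print(tiled.shape)                   # (3, 6)
```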

### NumPy Array Attributes

| Syntax and Concepts | Example Code | Explanation & Output |
| --- | --- | --- |
| `array.shape` returns the shape of the array as a tuple `(n, m)`. | `print("Shape: {}".format(rand_array.shape))` | `Shape: (4, 4)` |
| `array.ndim` returns the number of dimensions (or axes). | `print("Dimensions: {}".format(rand_array.ndim))` | `Dimensions: 2` |
| `array.dtype` returns the data type (int, float, etc.). | `print("dtype: {}".format(rand_array.dtype))` | `dtype: int32` (the default integer type is platform-dependent; `int64` is common) |
| `array.size` returns the total number of elements in the array. | `print("Size: ", rand_array.size)` | `Size:  16` |
| `array.itemsize` returns the memory used by each array element, in bytes. | `print("Item size: {}".format(rand_array.itemsize))` | `Item size: 4` (8 for `int64`) |
| `array.nbytes` returns the total size of the array, in bytes. | `print("nbytes:", rand_array.nbytes, "bytes")` | `nbytes: 64 bytes` (128 for `int64`) |
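The attributes are related: `size` is the product of `shape`, and `nbytes` equals `size * itemsize`. The sketch below fixes the dtype explicitly so the numbers do not depend on the platform; `arr` and `wide` are illustrative names.

```python
import numpy as np

arr = np.zeros((4, 4), dtype=np.int32)   # dtype fixed so the sizes are predictable
print(arr.shape, arr.ndim, arr.size)     # (4, 4) 2 16
print(arr.itemsize, arr.nbytes)          # 4 64

# The same data stored as 64-bit integers uses twice the memory.
wide = arr.astype(np.int64)
print(wide.itemsize, wide.nbytes)        # 8 128
```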

### NumPy Array Indexing

| Syntax and Concepts | Example Code | Explanation & Output |
| --- | --- | --- |
| Access a single element: `x[index]`. Prints the third element (with `array_1d = np.arange(10)`). | `print(array_1d[2])` | `2` |
| Access a single element: `x[index]`. A negative index counts from the end; prints the last element. | `print(array_1d[-1])` | `9` |
| Access a single element in a 2-D array: `x[row, column]`. Prints the second row, third column. | `print(array_2d[1, 2])` | `7` |
| Access a single element in a 2-D array. Prints the second row, last column. | `print(array_2d[1, -1])` | `7` |
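A short runnable version of the indexing examples, assuming `array_1d = np.arange(10)` and the `array_2d` defined earlier in the sheet. The chained form `array_2d[1][2]` is added here only for comparison.

```python
import numpy as np

array_1d = np.arange(10)
array_2d = np.array([[2, 3, 4], [5, 8, 7]])

third = array_1d[2]          # third element (indices start at 0)
last = array_1d[-1]          # negative indices count from the end
print(third, last)           # 2 9

# Row index first, then column index.
print(array_2d[1, 2])        # 7  (second row, third column)
print(array_2d[1, -1])       # 7  (second row, last column)

# Chained indexing gives the same element, but builds an intermediate row first.
print(array_2d[1][2])        # 7
```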

### NumPy Array Slicing

| Syntax and Concepts | Example Code | Explanation & Output |
| --- | --- | --- |
| Accessing subarrays: `x[start:stop:step]`. Defaults: start = 0, stop = size of the dimension, step = 1. | `array_1d = np.arange(10)` | |
| | `print(array_1d)` | `[0 1 2 3 4 5 6 7 8 9]` |
| Return specific elements; the index has to be a list. | `print(array_1d[[2, 5, 6]])` | `[2 5 6]` |
| | `print(array_1d[2, 5, 6])` | Using `[]` instead of `[[]]` raises an `IndexError` on a 1-D array. |
| Slice from the third element onwards | `print(array_1d[2:])` | `[2 3 4 5 6 7 8 9]` |
| Slice the first three elements | `print(array_1d[:3])` | `[0 1 2]` |
| Slice the third to seventh elements | `print(array_1d[2:7])` | `[2 3 4 5 6]` |
| Subset starting at 0 in increments of 2 | `print(array_1d[0::2])` | `[0 2 4 6 8]` |
| Slice elements within a range with a step size | `print(array_1d[2:7:2])` | `[2 4 6]` |
| Slicing a 2-D array | `print(array_2d[1, :])` | `[5 8 7]` |
| Slicing a 2-D array returns an array | `print(type(array_2d[1, :]))` | `<class 'numpy.ndarray'>` |
| Slicing all rows and the third column | `print(array_2d[:, 2])` | `[4 7]` |
| Slicing all rows and the first three columns | `print(array_2d[:, :3])` | `[[2 3 4] [5 8 7]]` |
| Slicing a NumPy array does not return a new copy. It returns a view, that is, a reference to the original data (similar to a shallow copy). | `arr1 = np.array([1, 2, 3, 4])` | |
| | `print(arr1)` | `[1 2 3 4]` |
| | `arr2 = arr1[1:]` | |
| | `print(arr2)` | `[2 3 4]` |
| | `arr2[1] = 8` | The write goes through to `arr1`. |
| | `print(arr1)` | `[1 2 8 4]` |
| | `print(arr2)` | `[2 8 4]` |
| Slicing a list creates a new copy. | `list1 = [1, 2, 3, 4]` | |
| | `print(list1)` | `[1, 2, 3, 4]` |
| | `list2 = list1[1:]` | |
| | `print(list2)` | `[2, 3, 4]` |
| | `list2[1] = 8` | `list1` is not affected. |
| | `print(list1)` | `[1, 2, 3, 4]` |
| | `print(list2)` | `[2, 8, 4]` |
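When a slice must not affect the original array, an explicit `.copy()` is one common option. The sketch below contrasts a view, a copy, and list-style (fancy) indexing, which also returns a copy; `np.shares_memory` is used only to make the difference visible, and the names `view`, `independent` and `picked` are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4])

view = arr1[1:]                 # a view: shares memory with arr1
independent = arr1[1:].copy()   # an explicit copy: owns its own data

view[1] = 8                     # also changes arr1[2]
independent[0] = 100            # does not touch arr1

print(arr1)                                   # [1 2 8 4]
print(np.shares_memory(arr1, view))           # True
print(np.shares_memory(arr1, independent))    # False

# Indexing with a list (fancy indexing) returns a copy, not a view.
picked = arr1[[0, 2]]
picked[0] = -1
print(arr1)                                   # [1 2 8 4]
```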

### Reshaping of NumPy Arrays

| Syntax and Concepts | Example Code | Explanation & Output |
| --- | --- | --- |
| `array.reshape(dimensions)`. The total number of elements must stay the same. | `x = np.arange(24)` | |
| | `print(x)` | `[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]` |
| | `print(x.reshape(3, 2, 4))` | 3-D array created from a 1-D array using `reshape()`. The last axis has 4 elements and is printed left to right. The second-to-last axis has 2 and is printed top to bottom. The first axis has 3 and is printed as three separate blocks: `[[[ 0 1 2 3] [ 4 5 6 7]] [[ 8 9 10 11] [12 13 14 15]] [[16 17 18 19] [20 21 22 23]]]` |
| `array[np.newaxis, :]` creates a row vector. | `print(x[np.newaxis, :])` | `[[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]]` |
| `array[:, np.newaxis]` creates a column vector. | `print(x[:, np.newaxis])` | `[[ 0] [ 1] [ 2] .... [22] [23]]` |
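Checking `.shape` is often easier than reading the printed output. The sketch below verifies the shapes from the table; the `-1` placeholder, which asks NumPy to infer one dimension, is an extra that the sheet does not mention, and `cube` and `inferred` are illustrative names.

```python
import numpy as np

x = np.arange(24)

cube = x.reshape(3, 2, 4)       # 3 blocks, each with 2 rows of 4 elements
print(cube.shape)               # (3, 2, 4)
print(cube[2, 1, 3])            # 23  (last block, last row, last column)

# -1 lets NumPy work out the remaining dimension (24 / 4 = 6).
inferred = x.reshape(4, -1)
print(inferred.shape)           # (4, 6)

# np.newaxis inserts an axis of length 1.
print(x[np.newaxis, :].shape)   # (1, 24)  row vector
print(x[:, np.newaxis].shape)   # (24, 1)  column vector
```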

### NumPy Array Concatenation

| Syntax and Concepts | Example Code | Explanation & Output |
| --- | --- | --- |
| `np.concatenate([array1, array2, ...], axis=0)`. The default is `axis=0`. `axis=0` adds rows (row-wise); `axis=1` adds columns (column-wise). | `x = np.array([1, 2, 3])` | |
| | `y = np.array([3, 2, 1])` | |
| | `z = [99, 99, 99]` | |
| | `print(np.concatenate([x, y, z]))` | The arrays must have the same shape except along the axis being concatenated. Here `x`, `y` and `z` (a list, converted automatically) are all 1-D with 3 elements, so they are joined end to end: `[ 1 2 3 3 2 1 99 99 99]` |
| Concatenate 2-D arrays with the same dimensions | `grid = np.array([[1, 2, 3], [4, 5, 6]])` | |
| | `np.concatenate([grid, grid])` | `array([[1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 6]])` |
| | `np.concatenate([grid, grid], axis=1)` | `array([[1, 2, 3, 1, 2, 3], [4, 5, 6, 4, 5, 6]])` |
| Concatenate arrays with different dimensions: `np.vstack([arrays])`, `np.hstack([arrays])`, `np.dstack([arrays])` | | |
| `np.vstack([array1, array2])`. The arrays should have the same number of columns. | `print(np.vstack([x, grid]))` | `[[1 2 3] [1 2 3] [4 5 6]]` |
| `np.hstack([array1, array2])`. The arrays should have the same number of rows. | `print(np.hstack([x, y]))` | `[1 2 3 3 2 1]` |
| | `y = np.array([[99], [99]])` | |
| | `np.hstack([grid, y])` | `[[ 1 2 3 99] [ 4 5 6 99]]` |
| `np.dstack([arrays])` stacks arrays along the third axis. For 3-D inputs it is the same as `np.concatenate([arrays], axis=2)`; 1-D and 2-D inputs are first promoted to 3-D. | | |
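The shape rules are easiest to see by printing the result shapes. This is a minimal sketch using `x` and `grid` from the table; `column` is an illustrative name for the 2 x 1 array, and the final `try` block shows that mismatched inputs raise a `ValueError`.

```python
import numpy as np

x = np.array([1, 2, 3])
grid = np.array([[1, 2, 3], [4, 5, 6]])
column = np.array([[99], [99]])

rows_added = np.concatenate([grid, grid], axis=0)
cols_added = np.concatenate([grid, grid], axis=1)
print(rows_added.shape)                    # (4, 3)
print(cols_added.shape)                    # (2, 6)

print(np.vstack([x, grid]).shape)          # (3, 3)  x is treated as one extra row
print(np.hstack([grid, column]).shape)     # (2, 4)  one extra column
print(np.dstack([grid, grid]).shape)       # (2, 3, 2)  stacked along a new third axis

# A 1-D and a 2-D array cannot be passed to np.concatenate together.
try:
    np.concatenate([x, grid])
    mismatch_raised = False
except ValueError as exc:
    mismatch_raised = True
    print("ValueError:", exc)
```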

### NumPy Array Splitting

| Syntax and Concepts | Example Code | Explanation & Output |
| --- | --- | --- |
| `np.split(array, [split indices as list])`. Splitting is the opposite of concatenation. N split points give N+1 arrays. | `x = [1, 2, 3, 99, 99, 3, 2, 1]` | |
| | `x1, x2, x3 = np.split(x, [3, 5])` | |
| | `print(x1, x2, x3)` | `[1 2 3] [99 99] [3 2 1]` |
| `np.vsplit(array, [split indices])`. Only works on arrays of 2 or more dimensions. | `grid = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]).reshape((3, 3))` | |
| | `print(grid)` | `[[1 2 3] [4 5 6] [7 8 9]]` |
| | `upper, lower = np.vsplit(grid, [2])` | |
| | `print(upper)` | `[[1 2 3] [4 5 6]]` |
| | `print(lower)` | `[[7 8 9]]` |
| `np.hsplit(array, [split indices])` | `left, middle, right = np.hsplit(grid, [1, 2])` | |
| | `print(left)` | `[[1] [4] [7]]` |
| | `print(middle)` | `[[2] [5] [8]]` |
| | `print(right)` | `[[3] [6] [9]]` |
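Since splitting is the opposite of concatenation, joining the pieces again should recover the original array. The sketch below checks this round trip. Passing an integer instead of a list of indices, and `np.array_split` for uneven sections, are additions not covered in the sheet.

```python
import numpy as np

x = np.array([1, 2, 3, 99, 99, 3, 2, 1])

# Two split points give three pieces.
parts = np.split(x, [3, 5])
print(len(parts))                                   # 3

# Concatenating the pieces restores the original array.
print(np.array_equal(np.concatenate(parts), x))     # True

# An integer asks for that many equal sections (8 elements -> 2 x 4).
halves = np.split(x, 2)
print([h.size for h in halves])                     # [4, 4]

# np.array_split also accepts section counts that do not divide evenly.
thirds = np.array_split(x, 3)
print([t.size for t in thirds])                     # [3, 3, 2]
```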

### NumPy Aggregation Functions

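| Syntax and Concepts | Example Code | Explanation & Output |
| --- | --- | --- |
| 1-D array aggregate operations | `m = np.arange(10)` | |
| | `print(m)` | `[0 1 2 3 4 5 6 7 8 9]` |
| | `print(sum(m))` | `45` |
| | `print(np.sum(m))` | `45` |
| | `print(np.min(m))` | `0` |
| | `print(np.max(m))` | `9` |
| 2-D array: NumPy aggregates vs. built-in functions | `p = np.arange(9).reshape(3, 3)` | |
| | `print(p)` | `[[0 1 2] [3 4 5] [6 7 8]]` |
| | `print(sum(p))` | `[ 9 12 15]` (the built-in `sum` adds the rows together, giving column sums) |
| | `print(np.sum(p))` | `36` (sum of all elements) |
| | `print(p.sum())` | `36` |
| | `print(p.min())` | `0` |
| | `print(p.min(axis=0))` | `[0 1 2]` (minimum of each column) |
| | `print(p.min(axis=1))` | `[0 3 6]` (minimum of each row) |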
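The `axis` argument names the axis that is collapsed by the aggregate. A small sketch with the same `p` as in the table; `column_sums` and `row_sums` are illustrative names.

```python
import numpy as np

p = np.arange(9).reshape(3, 3)

column_sums = p.sum(axis=0)   # collapses the rows: one value per column
row_sums = p.sum(axis=1)      # collapses the columns: one value per row

print(column_sums)            # [ 9 12 15]
print(row_sums)               # [ 3 12 21]
print(p.sum())                # 36

# The built-in sum iterates over the first axis, so it matches axis=0.
print(np.array_equal(sum(p), column_sums))   # True
```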

### NumPy Inbuilt Aggregate Functions

Most aggregates have a NaN-safe counterpart that computes the result while ignoring missing values.

| Function Name | NaN-safe Version | Description |
| --- | --- | --- |
| `np.sum` | `np.nansum` | Compute sum of elements |
| `np.prod` | `np.nanprod` | Compute product of elements |
| `np.mean` | `np.nanmean` | Compute mean of elements |
| `np.std` | `np.nanstd` | Compute standard deviation |
| `np.var` | `np.nanvar` | Compute variance |
| `np.min` | `np.nanmin` | Find minimum value |
| `np.max` | `np.nanmax` | Find maximum value |
| `np.argmin` | `np.nanargmin` | Find index of minimum value |
| `np.argmax` | `np.nanargmax` | Find index of maximum value |
| `np.median` | `np.nanmedian` | Compute median of elements |
| `np.percentile` | `np.nanpercentile` | Compute rank-based statistics of elements |
| `np.any` | N/A | Evaluate whether any elements are true |
| `np.all` | N/A | Evaluate whether all elements are true |
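A brief illustration of why the NaN-safe versions matter: a single `np.nan` makes the ordinary aggregate return `nan`, while the `nan*` variant ignores it. The array `data` is made up for this example.

```python
import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0])   # one missing value

print(np.sum(data))         # nan  (the NaN propagates)
print(np.nansum(data))      # 7.0  (the NaN is ignored)
print(np.nanmean(data))     # mean of 1, 2 and 4, about 2.33
print(np.nanmedian(data))   # 2.0
print(np.nanargmax(data))   # 3    (index of the largest non-NaN value)

# np.any has no NaN-safe version, but it combines well with np.isnan.
has_missing = np.any(np.isnan(data))
print(has_missing)          # True
```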

### Mathematical Operations on NumPy Arrays

| Syntax and Concepts | Example Code | Explanation & Output |
| --- | --- | --- |
| `np.sin(array_name)`, `np.cos(array_name)`, `np.exp(array_name)`, `np.log(array_name)` | `a = np.arange(1, 5)` | |
| | `print(np.sin(a))` | `[ 0.84147098 0.90929743 0.14112001 -0.7568025 ]` |
| | `print(np.cos(a))` | `[ 0.54030231 -0.41614684 -0.9899925 -0.65364362]` |
| | `print(np.exp(a))` | `[ 2.71828183 7.3890561 20.08553692 54.59815003]` |
| | `print(np.log(a))` | `[0. 0.69314718 1.09861229 1.38629436]` |
| `np.vectorize(custom_function)` | `a = np.arange(5)` | |
| | `f = np.vectorize(lambda x: x+10)` | |
| | `print(f(a))` | `[10 11 12 13 14]` |
| Custom function on another array. Previously defined functions can be reused on other arrays, including 2-D arrays. | `b = np.linspace(1, 100, 10, dtype=int)` | |
| | `print(b)` | `[ 1 12 23 34 45 56 67 78 89 100]` |
| | `print(f(b))` | `[ 11 22 33 44 55 66 77 88 99 110]` |
| `np.linalg` provides common linear algebra operations. | | |
| `np.linalg.inv(array_name)` | | Returns the inverse of a matrix. |
| `np.linalg.det(array_name)` | | Returns the determinant of the matrix. |
| `np.linalg.eig(array_name)` | | Returns the eigenvalues and eigenvectors of the matrix. |
| `np.dot(array1, array2)` | | Returns the matrix product (for 2-D arrays). |
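The `np.linalg` entries come without examples, so here is a minimal sketch on a made-up 2 x 2 matrix `A`. It checks each result against its defining property rather than printing raw floating-point values, and also shows that the `np.vectorize` example is equivalent to plain array arithmetic.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # illustrative matrix, determinant 2*3 - 1*1 = 5

# Inverse: A times its inverse should be the identity matrix.
A_inv = np.linalg.inv(A)
print(np.allclose(np.dot(A, A_inv), np.eye(2)))          # True

# Determinant.
print(np.isclose(np.linalg.det(A), 5.0))                 # True

# Eigen-decomposition: each column v of eigenvectors satisfies A v = lambda v.
eigenvalues, eigenvectors = np.linalg.eig(A)
print(np.allclose(np.dot(A, eigenvectors), eigenvectors * eigenvalues))  # True

# np.vectorize wraps a Python function; simple arithmetic works on arrays directly.
a = np.arange(5)
add_ten = np.vectorize(lambda v: v + 10)
print(np.array_equal(add_ten(a), a + 10))                # True
```

For simple expressions like `x + 10`, writing `a + 10` directly is usually the more idiomatic choice; `np.vectorize` is mainly a convenience for functions that only accept scalars.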