Introduction
Go is the future of data science. This cheatsheet looks at two families of libraries that let you do data science in Go: gorgonia.org/tensor and gonum.org/v1/gonum/mat. The gonum library will be referred to as gonum/mat.
For this cheatsheet, assume the following import:
    import ts "gorgonia.org/tensor"
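The gonum examples additionally assume these imports (gonum.org/v1/gonum/floats is only needed for the range example further down):
    import (
        "gonum.org/v1/gonum/floats"
        "gonum.org/v1/gonum/mat"
    )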
Note on panic and error behaviour:
1. Most tensor operations return an error.
2. gonum has a well-defined policy on when errors are returned and when panics happen.
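A minimal sketch of the difference (the imports above are assumed; the shapes are deliberately mismatched):

    // tensor: a shape mismatch comes back as an error value
    a := ts.New(ts.Of(ts.Float64), ts.WithShape(2, 3))
    b := ts.New(ts.Of(ts.Float64), ts.WithShape(3, 3))
    if _, err := ts.Add(a, b); err != nil {
        // handle the shape mismatch here
    }

    // gonum/mat: the equivalent call panics instead
    p := mat.NewDense(2, 3, nil)
    q := mat.NewDense(3, 3, nil)
    var c mat.Dense
    c.Add(p, q) // panics with a shape error; recover() if you must continue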
What To Use
  I only ever want a float64 matrix or vector: use gonum/mat.
  I want to focus on statistical/scientific work: use gonum/mat.
  I want to focus on machine learning work: use gonum/mat or gorgonia.org/tensor.
  I want to focus on deep learning work: use gorgonia.org/tensor.
  I want multidimensional arrays: use gorgonia.org/tensor, or []mat.Matrix.
  I want to work with different data types: use gorgonia.org/tensor.
  I want to wrangle data like in Pandas or R, with data frames: neither of these two libraries provides data frames.
Default Values
  Numpy:     a = np.zeros((2, 3))
  gonum/mat: a := mat.NewDense(2, 3, nil)
  tensor:    a := ts.New(ts.Of(ts.Float32), ts.WithShape(2, 3))
A Range
  Numpy:     a = np.arange(0, 9).reshape(3, 3)
  gonum/mat: a := mat.NewDense(3, 3, floats.Span(make([]float64, 9), 0, 8))
  tensor:    a := ts.New(ts.WithBacking(ts.Range(ts.Int, 0, 9)), ts.WithShape(3, 3))
Identity Matrices
  Numpy:     a = np.eye(3)
  gonum/mat: a := mat.NewDiagonal(3, []float64{1, 1, 1})
Elementwise Arithmetic Operations
Addition
  Numpy:
    c = a + b
    c = np.add(a, b)
    a += b               # in-place
    np.add(a, b, out=c)  # reuse array
  gonum/mat:
    c.Add(a, b)
    a.Add(a, b)  // in-place
  tensor:
    var c *ts.Dense; c, err = a.Add(b)
    var c ts.Tensor; c, err = ts.Add(a, b)
    a.Add(b, ts.UseUnsafe())   // in-place
    a.Add(b, ts.WithReuse(c))  // reuse tensor
    ts.Add(a, b, ts.UseUnsafe())   // in-place
    ts.Add(a, b, ts.WithReuse(c))  // reuse tensor
  Note: The tensor operations all return a result and an error; the error is omitted for brevity in the in-place and reuse examples. It's good habit to check for errors.
Subtraction
  Numpy:
    c = a - b
    c = np.subtract(a, b)
  gonum/mat:
    c.Sub(a, b)
  tensor:
    c, err := a.Sub(b)
    c, err = ts.Sub(a, b)
Multiplication
  Numpy:
    c = a * b
    c = np.multiply(a, b)
  gonum/mat:
    c.MulElem(a, b)
  tensor:
    c, err := a.Mul(b)
    c, err = ts.Mul(a, b)
Division
  Numpy:
    c = a / b
    c = np.divide(a, b)
  gonum/mat:
    c.DivElem(a, b)
  tensor:
    c, err := a.Div(b)
    c, err = ts.Div(a, b)
  Note: For non-float types, division by 0 returns an error, and the offending positions are 0 in the result.
Note: All arithmetic operations follow the patterns shown in the Addition row.
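Putting those patterns together, here is a minimal, runnable sketch (the names and values are made up for illustration) that checks errors and reuses memory via the function options shown in the Addition row:

    package main

    import (
        "fmt"
        "log"

        ts "gorgonia.org/tensor"
    )

    func main() {
        a := ts.New(ts.WithBacking([]float64{1, 2, 3, 4}), ts.WithShape(2, 2))
        b := ts.New(ts.WithBacking([]float64{10, 20, 30, 40}), ts.WithShape(2, 2))

        // plain call: allocates a new tensor for the result
        c, err := a.Add(b)
        if err != nil {
            log.Fatal(err)
        }

        // reuse c's memory for the next result instead of allocating again
        if _, err := a.Mul(b, ts.WithReuse(c)); err != nil {
            log.Fatal(err)
        }

        // in-place: a is overwritten with the result of a - b
        if _, err := a.Sub(b, ts.UseUnsafe()); err != nil {
            log.Fatal(err)
        }

        fmt.Println(c) // the elementwise product, written into c
        fmt.Println(a) // a after the in-place subtraction
    }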
Note on Shapes
In all of these functions, a and b have to be of the same shape. In Numpy, operations on dissimilar shapes throw an exception. With gonum/mat they panic. With tensor, an error is returned (see the sketch under the error-behaviour note above).
Aggregation
Sum
  Numpy:
    s = a.sum()
    s = np.sum(a)
  gonum/mat:
    var s float64 = mat.Sum(a)
  tensor:
    var s *ts.Dense; s, err = a.Sum()
    var s ts.Tensor; s, err = ts.Sum(a)
  Note: The result, which is a scalar value in this case, can be retrieved by calling s.ScalarValue().
Sum Along An Axis
  Numpy:
    s = a.sum(axis=0)
    s = np.sum(a, axis=0)
  gonum/mat:
    Write a loop, with manual aid from mat.Col and mat.Row (see the sketch after this row).
    Note: There is no performance loss in writing a loop. In fact, there is arguably a cognitive gain in being aware of what one is doing.
  tensor:
    var s *ts.Dense; s, err = a.Sum(0)
    var s ts.Tensor; s, err = ts.Sum(a, 0)
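A minimal sketch of that loop, summing along axis 0 (one sum per column) for a mat.Matrix a; floats.Sum comes from gonum.org/v1/gonum/floats:

    rows, cols := a.Dims()
    sums := make([]float64, cols)
    col := make([]float64, rows)
    for j := 0; j < cols; j++ {
        mat.Col(col, j, a)        // copy column j of a into col
        sums[j] = floats.Sum(col) // sum along axis 0
    }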
Argmax/Argmin
  Numpy:
    am = a.argmax()
    am = np.argmax(a)
  gonum/mat:
    Write a loop, using mat.Col and mat.Row.
  tensor:
    var am *ts.Dense; am, err = a.Argmax(ts.AllAxes)
    var am ts.Tensor; am, err = ts.Argmax(a, ts.AllAxes)
Argmax/Argmin Along An Axis
  Numpy:
    am = a.argmax(axis=0)
    am = np.argmax(a, axis=0)
  gonum/mat:
    Write a loop, using mat.Col and mat.Row (see the sketch after this row).
  tensor:
    var am *ts.Dense; am, err = a.Argmax(0)
    var am ts.Tensor; am, err = ts.Argmax(a, 0)
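And a matching sketch for argmax along axis 0 (the row index of the largest value in each column), using floats.MaxIdx from gonum.org/v1/gonum/floats:

    rows, cols := a.Dims()
    am := make([]int, cols)
    col := make([]float64, rows)
    for j := 0; j < cols; j++ {
        mat.Col(col, j, a)
        am[j] = floats.MaxIdx(col) // index of the maximum in column j
    }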
Data Structure Creation
  Numpy:     a = np.array([1, 2, 3])
  gonum/mat: a := mat.NewDense(1, 3, []float64{1, 2, 3})
  tensor:    a := ts.New(ts.WithBacking([]int{1, 2, 3}))
Creating a float64 matrix
  Numpy:     a = np.array([[0, 1, 2], [3, 4, 5]], dtype='float64')
  gonum/mat: a := mat.NewDense(2, 3, []float64{0, 1, 2, 3, 4, 5})
  tensor:    a := ts.New(ts.WithBacking([]float64{0, 1, 2, 3, 4, 5}), ts.WithShape(2, 3))
Creating a float32 3-D array
  Numpy:  a = np.array([[[0, 1, 2], [3, 4, 5]], [[100, 101, 102], [103, 104, 105]]], dtype='float32')
  tensor: a := ts.New(ts.WithShape(2, 2, 3), ts.WithBacking([]float32{0, 1, 2, 3, 4, 5, 100, 101, 102, 103, 104, 105}))
Note: The tensor package is imported as ts. Additionally, gonum/mat offers many different data structures, each useful for a particular subset of computations. The examples in this document mainly assume a dense matrix.
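For instance, two of those structures (a minimal sketch with made-up values):

    v := mat.NewVecDense(3, []float64{1, 2, 3})    // a dense float64 vector
    s := mat.NewSymDense(2, []float64{4, 1, 1, 3}) // a dense symmetric float64 matrix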
gonum Types
  mat.Matrix: Abstract data type representing any float64 matrix.
  mat.Dense:  Data type representing a dense float64 matrix.
tensor Types
  tensor.Tensor: An abstract data type representing any kind of tensor. Package functions work on these types.
  *tensor.Dense: A representation of a densely packed multidimensional array. Methods return *tensor.Dense instead of tensor.Tensor.
  *tensor.CS:    A representation of compressed sparse row/column matrices.
  *tensor.MA:    Coming soon - a representation of a masked multidimensional array. Methods return *tensor.MA instead of tensor.Tensor.
  The package also defines utility types for: densely packed multidimensional arrays; densely packed multidimensional arrays that are masked by a slice of bool; and sparsely packed multidimensional arrays (for now: only *CS).
Metadata
  Shape:
    Numpy:  a.shape
    gonum:  r, c := a.Dims()
    tensor: a.Shape()
  Strides:
    Numpy:  a.strides
    gonum:  a.RawMatrix().Stride
    tensor: a.Strides()
  Dims:
    Numpy:  a.ndim
    gonum:  always 2 (mat only deals with matrices)
    tensor: a.Dims()
Tensor Manipulation
Zero-op Transpose
  Numpy:     aT = a.T
  gonum/mat: aT := a.T()
  tensor:    a.T()
Transpose With Data Movement
  Numpy:     aT = np.transpose(a)
  gonum/mat: b := a.T(); aT := mat.DenseCopyOf(b)
  tensor:    aT, err := ts.Transpose(a)
             or: a.T(); err := a.Transpose()
Reshape
  Numpy:     b = a.reshape(2, 3)
  gonum/mat: b := mat.NewDense(2, 3, a.RawMatrix().Data)
  tensor:    err := a.Reshape(2, 3)
Note on reshaping when using gonum: the matrix a mustn't be a view.
Linear Algebra
Inner Product of Vectors
  Numpy:
    c = np.inner(a, b)
  gonum/mat:
    var c float64 = mat.Dot(a, b)
  tensor:
    var c interface{}; c, err = ts.Inner(a, b)
    var c interface{}; c, err = a.Inner(b)
  Note: The tensor package comes with specialized execution engines for float64 and float32, which return a float64 or float32 directly rather than an interface{}.
Matrix-Vector Multiplication
  Numpy:
    mv = np.dot(m, v)
    mv = np.matmul(m, v)
    mv = m @ v
    mv = m.dot(v)
  gonum/mat:
    var mv mat.VecDense; mv.MulVec(m, v)
  tensor:
    var mv ts.Tensor; mv, _ = ts.MatVecMul(m, v)
    var mv *ts.Dense; mv, _ = m.MatVecMul(v)
Matrix-Matrix Multiplication
  Numpy:
    mm = np.dot(m1, m2)
    mm = np.matmul(m1, m2)
    mm = m1 @ m2
    mm = m1.dot(m2)
  gonum/mat:
    var mm mat.Dense; mm.Mul(m1, m2)
  tensor:
    var mm ts.Tensor; mm, _ = ts.MatMul(m1, m2)
    var mm *ts.Dense; mm, _ = m1.MatMul(m2)
Magic
  Numpy:
    c = np.dot(a, b)
    c = a.dot(b)
  tensor:
    var c ts.Tensor; c, _ = ts.Dot(a, b)
    var c *ts.Dense; c, _ = a.Dot(b)
  Note: The Dot function and method in package tensor work similarly to dot in Numpy - depending on the number of dimensions of the inputs, different functions will be called. You should treat it as a "magic" function that does products of two multi-dimensional arrays.
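For instance (a minimal sketch with made-up inputs m and v):

    m := ts.New(ts.WithBacking([]float64{1, 2, 3, 4}), ts.WithShape(2, 2))
    v := ts.New(ts.WithBacking([]float64{5, 6}), ts.WithShape(2))

    mv, _ := ts.Dot(m, v) // dispatches to a matrix-vector product (shape (2))
    mm, _ := ts.Dot(m, m) // dispatches to a matrix-matrix product (shape (2, 2))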
gonum has a whole suite of linear-algebra functions and structures, too many to enumerate here. You should check it out too.
Combinations
Concatenation
  Numpy:
    c = np.concatenate((a, b), axis=0)
  tensor:
    var c ts.Tensor; c, err = ts.Concat(0, a, b)
    var c *ts.Dense; c, err = a.Concat(0, b)
Vstack
  Numpy:     c = np.vstack((a, b))
  gonum/mat: c.Stack(a, b)
  tensor:    var c *ts.Dense; c, err = a.Vstack(b)
Hstack
  Numpy:     c = np.hstack((a, b))
  gonum/mat: c.Augment(a, b)
  tensor:    var c *ts.Dense; c, err = a.Hstack(b)
Stack onto a New Axis
  Numpy:     c = np.stack((a, b))
  gonum/mat: var stacked []mat.Matrix; stacked = append(stacked, a, b)
  tensor:
    var c ts.Tensor; c, _ = ts.Stack(0, a, b)
    var c *ts.Dense; c, _ = a.Stack(0, b)
  Note: Unlike in Numpy, Stack in tensor is a little more strict about the axis: it has to be specified.
Repeats
  Numpy:
    c = np.repeat(a, 2)          # returns a flat array
    c = np.repeat(a, 2, axis=0)  # repeats along axis 0
    c = np.repeat(a, 2, axis=1)  # repeats along axis 1
  gonum/mat:
    Unsupported for now.
  tensor:
    var c ts.Tensor; c, _ = ts.Repeat(a, ts.AllAxes, 2)  // returns a flat array
    c, _ = ts.Repeat(a, 0, 2)  // repeats along axis 0
    c, _ = ts.Repeat(a, 1, 2)  // repeats along axis 1
Data Access
Value At (Assuming Matrices)
  Numpy:     va = a[0, 0]
  gonum/mat: var va float64 = a.At(0, 0)
  tensor:    var val interface{}; val, _ = a.At(0, 0)
Slice Row or Column (Assuming Matrices)
  Numpy:
    row = a[0]
    col = a[:, 0]
  gonum/mat:
    var row mat.Vector = a.RowView(0)
    var col mat.Vector = a.ColView(0)
  tensor:
    var row ts.View; row, _ = a.Slice(s(0))
    var col ts.View; col, _ = a.Slice(nil, s(0))
Advanced Slicing (Assuming 9x9 Matrices)
  Numpy:     b = a[1:4, 3:6]
  gonum/mat: var b mat.Matrix = a.Slice(1, 4, 3, 6)
  tensor:    var b ts.View; b, _ = a.Slice(rs(1, 4), rs(3, 6))
Advanced Slicing With Steps
  Numpy:     b = a[1:4:1, 3:6:2]
  gonum/mat: Unsupported
  tensor:    var b ts.View; b, _ = a.Slice(rs(1, 4, 1), rs(3, 6, 2))
Getting Underlying Data
  gonum/mat: var b []float64 = a.RawMatrix().Data
  tensor:    var b interface{} = a.Data()
Setting One Value (Assuming Matrices)
  Numpy:     a[r, c] = 100
  gonum/mat: a.Set(r, c, 100)
  tensor:    a.SetAt(100, r, c)
Setting Row/Col (Assuming 3x3 Matrix)
  Numpy:
    a[r] = [1, 2, 3]
    a[:, c] = [1, 2, 3]
  gonum/mat:
    a.SetRow(r, []float64{1, 2, 3})
    a.SetCol(c, []float64{1, 2, 3})
  tensor:
    No simple method - it requires Iterators and multiple lines of code (one workaround is sketched below).
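One possible workaround that avoids iterators, at the cost of a loop (a minimal sketch; a is assumed to be a 3x3 float64 *ts.Dense and r is the row index, as in the gonum example):

    row := []float64{1, 2, 3}
    for j, v := range row {
        if err := a.SetAt(v, r, j); err != nil { // a[r, j] = v
            // handle the error
        }
    }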
Note: in the tensor examples, the a.Slice method takes a list of tensor.Slice values, an interface defined in the tensor package. s and rs in the examples simply represent types that implement the tensor.Slice interface. A nil is treated as a : in Python. There are no default tensor.Slice types provided, and it is up to the user to define their own (a sketch follows).
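A minimal sketch of such user-defined types, matching the s and rs names used above (assuming the tensor.Slice interface's Start, End and Step methods):

    // sl is a half-open range [start, end) with a step; it implements tensor.Slice.
    type sl struct{ start, end, step int }

    func (v sl) Start() int { return v.start }
    func (v sl) End() int   { return v.end }
    func (v sl) Step() int  { return v.step }

    // s selects the single index i, like a[i] in Python.
    func s(i int) sl { return sl{start: i, end: i + 1, step: 1} }

    // rs selects the range [start, end) with an optional step, like a[start:end:step] in Python.
    func rs(start, end int, step ...int) sl {
        st := 1
        if len(step) > 0 {
            st = step[0]
        }
        return sl{start: start, end: end, step: st}
    }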