Data Analysis in Genomics and Transcriptomics Cheat Sheet

Self-Organizing Map (SOM)

- create a 1D/2D lattice of artificial neurons
- define the number of neurons in each dimension
- assign each neuron random weights with the same dimensionality as the input
- select random data point from input
- choose winning neuron based on similarity with input
- update weights of winning and neighboring neurons (based on learning rate & neighborhood function)
- repeat for all data points until convergence
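
The steps above can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function names (train_som, quantization_error), the Gaussian neighborhood function, the linear decay of the learning rate and neighborhood width, and the fixed number of epochs used in place of a convergence test are all assumptions that the cheat sheet does not specify.

```python
import numpy as np

def train_som(data, grid_shape=(5, 5), n_epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a 2D SOM; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    # Random initial weights with the same dimensionality as the input.
    weights = rng.random((rows, cols, data.shape[1]))
    # Lattice coordinates (row, col) of every neuron.
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        # Visit the data points in random order.
        for idx in rng.permutation(len(data)):
            x = data[idx]
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                    # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)    # assumed floor of 0.5
            # Winning neuron: the one whose weights are closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighborhood measured on the lattice, not in data space.
            lattice_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-lattice_d2 / (2.0 * sigma ** 2))
            # Pull the winner and its neighbors toward x.
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights

def quantization_error(data, weights):
    """Mean distance from each data point to its closest neuron."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return float(d.min(axis=1).mean())

# Toy data (made up for illustration): two clusters in 3 dimensions.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.2, 0.05, (50, 3)), rng.normal(0.8, 0.05, (50, 3))])
untrained = train_som(data, n_epochs=0)   # random initial weights only
trained = train_som(data, n_epochs=20)
print(quantization_error(data, untrained), quantization_error(data, trained))
```

On this toy data the quantization error of the trained map should be lower than that of the random initial map, which is one simple way to check that the updates are moving the neurons toward the data.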

Principal Component Analysis (PCA)

Algorithm
- obtain the distance matrix
- construct the metric (Gram) matrix
- compute the eigenvalues and eigenvectors of the metric matrix
- compute Cartesian coordinates
PCA maximizes the variance captured by each component, so it preserves large-scale structure but can fail to resolve clusters that lie close together.
A distance matrix alone doesn't reveal the dimensionality of the space in which the points lie.
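
The distance-matrix route listed above can be sketched as follows. Starting from distances rather than raw features is the formulation usually called classical multidimensional scaling (principal coordinate analysis); for Euclidean distances it recovers the same coordinates as PCA on centered data, up to rotation and reflection. The function name classical_mds and the double-centering construction of the metric matrix are assumptions chosen for this illustration.

```python
import numpy as np

def classical_mds(D, n_components=2):
    """Recover coordinates from a matrix of pairwise Euclidean distances."""
    n = D.shape[0]
    # Metric (Gram) matrix via double centering of the squared distances.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # Eigendecomposition of the symmetric metric matrix.
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Cartesian coordinates: eigenvectors scaled by sqrt(eigenvalue).
    top = np.clip(eigvals[:n_components], 0.0, None)
    coords = eigvecs[:, :n_components] * np.sqrt(top)
    return coords, eigvals

# Toy example (made up): six points in 2D, then pretend only D is known.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 2))
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords, eigvals = classical_mds(D, n_components=2)
D_rec = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.round(eigvals, 6))
```

In this sketch the recovered coordinates reproduce the original distances, and the number of eigenvalues that are clearly above zero matches the dimensionality of the toy points. That count is what the distance matrix by itself does not show directly.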

t-distributed Stochastic Neighbor Embedding (tSNE)

Algorithm
- compute similarity scores for all points against all points
- place the points randomly in 2- or 3-dimensional space
- using an optimization method, move the points step by step based on the similarity scores until convergence is achieved
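Retains resolution for nearby clusters while compressing the distances between far-apart clusters so that everything fits in one plot; distances between well-separated clusters are therefore not directly interpretable.
Often used in combination with PCA, which first reduces the data to a smaller number of dimensions before t-SNE is applied.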
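
A heavily simplified sketch of the three steps is given below. It is meant only to show the mechanics: it uses a single fixed Gaussian bandwidth (sigma) instead of the per-point perplexity calibration of full t-SNE, and plain gradient descent without momentum or early exaggeration. The function names, parameter values, and toy data are assumptions made for this example.

```python
import numpy as np

def pairwise_sq_dists(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def input_similarities(X, sigma=1.0):
    """Gaussian similarity scores for all points against all points."""
    P = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    P /= P.sum()
    return np.maximum(P, 1e-12)   # avoid log(0) later

def embedding_similarities(Y):
    """Student-t (1 degree of freedom) similarities in the embedding."""
    W = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(W, 0.0)
    Q = np.maximum(W / W.sum(), 1e-12)
    return Q, W

def kl_divergence(P, Q):
    return float(np.sum(P * np.log(P / Q)))

def tsne_sketch(X, n_components=2, n_iter=500, learning_rate=1.0, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    P = input_similarities(X, sigma)
    # Random initial placement in the low-dimensional space.
    Y = rng.normal(scale=1e-2, size=(X.shape[0], n_components))
    history = []
    for _ in range(n_iter):
        Q, W = embedding_similarities(Y)
        history.append(kl_divergence(P, Q))
        # Gradient of KL(P || Q): 4 * sum_j (p_ij - q_ij) w_ij (y_i - y_j).
        coeff = (P - Q) * W
        grad = 4.0 * (coeff.sum(axis=1)[:, None] * Y - coeff @ Y)
        Y -= learning_rate * grad
    return Y, history

# Toy data (made up): two clusters in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (10, 5)), rng.normal(4.0, 0.5, (10, 5))])
Y, history = tsne_sketch(X)
print(history[0], history[-1])
```

The recorded KL divergence between the input and embedding similarities should fall over the iterations, which is the sense in which the points are "moved step by step until convergence". For real data, an established implementation with perplexity tuning is the safer choice.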

K-Means Clustering

- choose the number of clusters (k)
- randomly initialise k cluster centroids
- calculate distance between each data point and each centroid (euclidean or manhattan distance)
- assign data point to cluster whose centroid is closest
- recalculate centroid by taking mean of all data points assigned to the cluster
- iterate until stopping criteria met:
1. max no of iterations reached
2. centroids no longer change significantly
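
The loop above can be sketched in NumPy as follows. The function name kmeans, the choice of Euclidean distance, the initialization from k randomly chosen data points, and the tolerance value are assumptions for illustration; the cheat sheet also allows Manhattan distance and does not fix an initialization scheme.

```python
import numpy as np

def assign_labels(X, centroids):
    """Index of the closest centroid (Euclidean distance) for each point."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
    return dists.argmin(axis=1)

def kmeans(X, k, max_iter=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    # Randomly initialise k centroids by picking k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iter):              # stopping criterion 1: max iterations
        labels = assign_labels(X, centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:           # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:                    # stopping criterion 2: centroids stable
            break
    return centroids, assign_labels(X, centroids)

# Toy data (made up): two well-separated clusters in 2D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(10.0, 0.5, (30, 2))])
centroids, labels = kmeans(X, k=2)
print(centroids)
```

Because the result depends on the random initialization, it is common to run the procedure several times with different seeds and keep the solution with the smallest total within-cluster distance.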

Data Analysis in Genomics and Transcriptomics Cheat Sheet (DRAFT) by lemonbuzz
