Cheatography

# Basic statistics with R Cheat Sheet (DRAFT) by xeonkai

This is a draft cheat sheet. It is a work in progress and is not finished yet.

### Descriptive statistics

| Source | Functions |
| --- | --- |
| Base installation | ``summary()``, ``mean()``, ``sd()``, ``var()``, ``min()``, ``max()``, ``median()``, ``length()``, ``range()``, ``quantile()``, ``fivenum()`` |
| ``Hmisc`` package | ``describe()`` |
| ``pastecs`` package | ``stat.desc(x, basic=TRUE, desc=TRUE, norm=FALSE, p=0.95)`` |
| | ``basic=TRUE``: number of values, null values, missing values, min, max, range, sum |
| | ``desc=TRUE``: median, mean, standard error of the mean, 95% CI for the mean (level set by ``p``), variance, standard deviation, coefficient of variation |
| | ``norm=TRUE``: skewness, kurtosis, Shapiro-Wilk test of normality |
| ``psych`` package | ``describe()`` |

Both ``Hmisc`` and ``psych`` define ``describe()``, so the package loaded last masks the other. To call a function that has been masked, qualify it with its package name: ``Hmisc::describe(x)``.
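As a minimal sketch of the base-installation functions, the following uses the built-in ``mtcars`` data set. The column selection ``vars`` and the helper name ``basic_stats`` are illustrative assumptions, not part of the original sheet.

```r
# Illustrative choice of numeric columns from the built-in mtcars data set
vars <- c("mpg", "hp", "wt")

# Per-column minimum, quartiles, median, mean and maximum
summary(mtcars[vars])

# Hypothetical helper: a few statistics for one numeric vector
basic_stats <- function(x) {
  c(n = length(x), mean = mean(x), sd = sd(x), median = median(x))
}

# Apply the helper to every selected column; the result is a matrix
# with one row per statistic and one column per variable
stats <- sapply(mtcars[vars], basic_stats)
print(stats)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max
five <- fivenum(mtcars$mpg)
print(five)
```

Using ``sapply()`` with a small helper keeps the example free of add-on packages; ``pastecs::stat.desc()`` or ``psych::describe()`` produce richer output when those packages are installed.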

### Descriptive statistics by group

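| Function | Usage |
| --- | --- |
| ``aggregate()`` | Single-value function: ``aggregate(mtcars[vars], by=list(am=mtcars$am), mean)`` |
| ``by()`` | Several functions: ``by(data, INDICES, FUN)`` |
| | ``dstats <- function(x) sapply(x, function(v) c(mean=mean(v), sd=sd(v)))`` |
| | ``by(mtcars[vars], mtcars$am, dstats)`` |
| ``doBy`` package | ``summaryBy(formula, data=dataframe, FUN=function)`` |
| | Formula: ``var1 + var2 + ... ~ groupvar1 + groupvar2 + ...`` |
| | ``summaryBy(mpg+hp+wt~am, data=mtcars, FUN=mystats)``, where ``mystats`` is a user-defined function |
| ``psych`` package | ``describeBy(mtcars[vars], mtcars$am)`` (formerly ``describe.by()``, now deprecated) |

``by()`` passes each group to ``FUN`` as a data frame, so ``dstats`` applies the statistics column by column with ``sapply()``.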
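The grouped summaries above can be sketched with base R alone. The definition of ``vars`` is an assumption (the sheet uses it without defining it), and ``dstats`` is written here to work on the data frame that ``by()`` hands to it.

```r
# Assumed definition of vars; the cheat sheet does not define it
vars <- c("mpg", "hp", "wt")

# One statistic per group: mean of each variable by transmission type (am)
group_means <- aggregate(mtcars[vars], by = list(am = mtcars$am), FUN = mean)
print(group_means)

# Several statistics per group: by() passes each group's rows as a data
# frame, so the helper computes mean and sd column by column
dstats <- function(df) sapply(df, function(v) c(mean = mean(v), sd = sd(v)))
group_stats <- by(mtcars[vars], mtcars$am, dstats)
print(group_stats)
```

``aggregate()`` returns a regular data frame, which is convenient for further processing, while ``by()`` returns a list-like object with one element per group.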

### Frequencies and contingency tables

| Function | Description |
| --- | --- |
| ``table(var1, var2, ..., varN)`` | Creates an N-way contingency table from N categorical variables (factors). Ignores missing values (NAs) by default; pass ``useNA="ifany"`` to include NA as a valid category. |
| ``xtabs(formula, data)`` | Creates an N-way contingency table based on a formula and a matrix or data frame |
| ``prop.table(table, margins)`` | Expresses table entries as fractions of the marginal table defined by ``margins`` |
| ``margin.table(table, margins)`` | Computes the sum of table entries for the marginal table defined by ``margins`` |
| ``addmargins(table, margins)`` | Puts summary margins (sums by default) on a table |
| ``ftable(table)`` | Creates a compact "flat" contingency table |
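To illustrate how ``table()`` treats missing values, here is a small sketch on a made-up factor; the vector ``improved`` and its values are invented for the example.

```r
# Made-up factor with one missing value (illustrative data only)
improved <- factor(c("None", "Some", NA, "Marked", "None"))

# Default behaviour: the NA is silently dropped from the counts
counts_default <- table(improved)
print(counts_default)

# useNA = "ifany" adds an NA category when missing values are present
counts_with_na <- table(improved, useNA = "ifany")
print(counts_with_na)
```

Comparing the two totals is a quick way to check whether missing values are being excluded from a frequency table.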

### Example code

The examples use the ``Arthritis`` data set from the ``vcd`` package.

| Table type | Code |
| --- | --- |
| One-way table | ``mytable <- with(Arthritis, table(Improved))`` |
| | ``prop.table(mytable) # turn frequencies into proportions`` |
| | ``prop.table(mytable)*100 # turn frequencies into percentages`` |
| Two-way table | ``mytable <- with(Arthritis, table(Treatment, Improved))`` |
| | ``mytable <- xtabs(~ Treatment + Improved, data = Arthritis) # same table via the formula interface`` |
| | ``margin.table(mytable, 1) # row sums; use 2 for column sums`` |
| | ``prop.table(mytable, 1) # row proportions; use 2 for column proportions`` |
| | ``prop.table(mytable) # cell proportions`` |
| | ``addmargins(mytable) # adds a sum row and column`` |
| | ``addmargins(prop.table(mytable))`` |
| | ``addmargins(prop.table(mytable, 1), 2) # adds a sum column`` |
| | ``addmargins(prop.table(mytable, 2), 1) # adds a sum row`` |
| | Two-way tables can also be created with the ``CrossTable()`` function in the ``gmodels`` package |
| Three-way table | The ``ftable()`` function prints multidimensional tables in a compact form |
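For readers without the ``vcd`` package, the same workflow can be sketched on the built-in ``mtcars`` data set. Substituting ``cyl``, ``am`` and ``gear`` for the ``Arthritis`` variables is an assumption made only so the example runs on a base installation.

```r
# Two-way table: number of cylinders (rows) by transmission type (columns)
mytable <- xtabs(~ cyl + am, data = mtcars)

# Row sums (margin 1); margin 2 would give column sums
row_totals <- margin.table(mytable, 1)
print(row_totals)

# Row proportions: each row sums to 1
row_props <- prop.table(mytable, 1)
print(row_props)

# Add a "Sum" row and a "Sum" column
with_sums <- addmargins(mytable)
print(with_sums)

# Three-way table, printed compactly with ftable()
three_way <- xtabs(~ cyl + am + gear, data = mtcars)
flat <- ftable(three_way)
print(flat)
```

The same calls apply unchanged to a table built with ``table()`` instead of ``xtabs()``.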

### Covariances / correlations

Options for ``cov(x, use=, method=)`` and ``cor(x, use=, method=)``:

| Option | Description |
| --- | --- |
| ``x`` | Matrix or data frame |
| ``use`` | Specifies the handling of missing data. Options are ``all.obs`` (assumes no missing data; missing data will produce an error), ``everything`` (any correlation whose computation involves a missing value is set to ``NA``), ``complete.obs`` (listwise deletion), and ``pairwise.complete.obs`` (pairwise deletion). |
| ``method`` | Specifies the type of correlation. The options are ``pearson``, ``spearman``, or ``kendall``. |
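The effect of the ``use`` options can be seen in a small sketch. The data frame ``df`` and the single injected ``NA`` are assumptions made for illustration.

```r
# Illustrative data: three mtcars columns with one value made missing
df <- mtcars[c("mpg", "hp", "wt")]
df$hp[1] <- NA

# "everything": correlations that involve hp become NA
cor_all <- cor(df, use = "everything")
print(cor_all)

# "complete.obs": rows with any NA are dropped before computing (listwise)
cor_complete <- cor(df, use = "complete.obs")
print(cor_complete)

# "pairwise.complete.obs": each pair uses all rows where both values exist
cor_pair <- cor(df, use = "pairwise.complete.obs")
print(cor_pair)

# "all.obs": missing data raises an error, captured here with try()
all_obs_result <- try(cor(df, use = "all.obs"), silent = TRUE)

# method selects the coefficient; here Spearman rank correlation
cor_spearman <- cor(df, use = "complete.obs", method = "spearman")
print(cor_spearman)
```

Under pairwise deletion the ``mpg``/``wt`` correlation still uses all 32 rows, whereas listwise deletion drops the row with the missing ``hp`` value for every pair.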