Descriptive Statistics
Measures of Central Tendency
Mean ($\mu$, population)
$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$
where:
$N$ = total number of data points, $x_i$ = each data point
Mean ($\bar{x}$, sample)
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
where:
$n$ = number of data points in the sample
Median
For ordered data $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$:
- If $n$ is odd: $\tilde{x} = x_{\left(\frac{n+1}{2}\right)}$
- If $n$ is even: $\tilde{x} = \frac{x_{\left(\frac{n}{2}\right)} + x_{\left(\frac{n}{2}+1\right)}}{2}$
Mode
- Value that appears most frequently in the dataset.
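As a minimal sketch (assuming Python with NumPy and the standard library's statistics module, and an illustrative dataset), the three measures can be computed directly:

```python
import numpy as np
from statistics import multimode

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = np.mean(data)        # arithmetic mean
median = np.median(data)    # middle value of the sorted data
modes = multimode(data)     # most frequent value(s)

print(mean, median, modes)  # 5.0 4.5 [4]
```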
Measures of Dispersion
Variance ($\sigma^2$, population)
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$
Standard Deviation ($\sigma$, population)
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$$
Variance ($s^2$, sample)
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
Standard Deviation ($s$, sample)
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
Range
$$R = x_{\max} - x_{\min}$$
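A short sketch of the dispersion measures, assuming NumPy; the `ddof=1` argument switches from the population formulas (divide by $N$) to the sample formulas (divide by $n-1$):

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

pop_var = np.var(data)             # population variance (divides by N)
pop_std = np.std(data)             # population standard deviation
sample_var = np.var(data, ddof=1)  # sample variance (divides by n - 1)
sample_std = np.std(data, ddof=1)  # sample standard deviation
data_range = data.max() - data.min()

print(pop_var, sample_var, data_range)  # 4.0 4.571... 7
```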
Probabilities
Basic Concepts
Probability of an event
$$P(A) = \frac{\text{number of favorable outcomes}}{\text{total number of possible outcomes}}$$
Complementary Probability
$$P(A^{c}) = 1 - P(A)$$
Probability Rules
Addition Rule (for mutually exclusive events)
$$P(A \cup B) = P(A) + P(B)$$
Addition Rule (for non-mutually exclusive events)
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
Multiplication Rule (for independent events)
$$P(A \cap B) = P(A) \cdot P(B)$$
Multiplication Rule (for dependent events)
$$P(A \cap B) = P(A) \cdot P(B \mid A)$$
where $P(B \mid A)$ is the conditional probability of $B$ given that $A$ has occurred.
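As an illustrative check (plain Python enumerating the 36 outcomes of two dice; the events A, B, and C are example choices, not from the original text), the addition and multiplication rules can be verified exactly with fractions:

```python
from itertools import product
from fractions import Fraction

outcomes = list(product(range(1, 7), repeat=2))  # all 36 rolls of two dice

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

A = lambda o: o[0] == 6    # first die shows 6
B = lambda o: o[1] == 6    # second die shows 6 (independent of A)
C = lambda o: sum(o) == 7  # total is 7 (not mutually exclusive with A)

# Addition rule (non-mutually exclusive): P(A or C) = P(A) + P(C) - P(A and C)
print(prob(lambda o: A(o) or C(o)) == prob(A) + prob(C) - prob(lambda o: A(o) and C(o)))  # True

# Multiplication rule (independent events): P(A and B) = P(A) * P(B)
print(prob(lambda o: A(o) and B(o)) == prob(A) * prob(B))  # True
```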
Law of Large Numbers
As the size of a sample increases, the sample mean approaches the expected value (the mean) of the population.
Weak Form
For a sequence of independent and identically distributed (i.i.d.) random variables $X_1, X_2, \dots$ with mean $\mu$, the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$ converges in probability to $\mu$:
$$\lim_{n \to \infty} P\left(\left|\bar{X}_n - \mu\right| > \epsilon\right) = 0 \quad \text{for any } \epsilon > 0$$
This means that the sample mean tends to get closer to the population mean, but there is no guarantee that this will happen in every case.
Strong Form
States that the sample mean converges almost surely (with probability 1) to the population mean as the sample size approaches infinity:
$$P\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1$$
This means that, with probability 1, the sequence of sample means converges to the population mean as the number of trials grows without bound.
Example
We roll a die 1000 times and calculate the mean of the results. As the number of rolls $n$ increases, the mean of the results will approach the expected value $E[X] = 3.5$.
If we roll the die once, we can get any number from 1 to 6, but as we increase the number of rolls, the mean of those rolls will approach 3.5.
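A minimal simulation sketch of this example, assuming NumPy (the seed is arbitrary); the running mean of the simulated rolls drifts toward 3.5 as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
rolls = rng.integers(1, 7, size=1000)           # 1000 fair-die rolls (values 1..6)
running_mean = np.cumsum(rolls) / np.arange(1, 1001)

for n in (1, 10, 100, 1000):
    print(n, running_mean[n - 1])               # running mean approaches E[X] = 3.5
```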
Combinatorics
Fundamental Counting Principle
If one task can be performed in $m$ ways and a second task can be performed in $n$ ways, then the two tasks can be performed together in $m \times n$ ways.
Permutations
Permutations of $n$ distinct elements
$$P_n = n!$$
Permutations of $n$ elements taken $r$ at a time
$$P(n, r) = \frac{n!}{(n - r)!}$$
Combinations
Combinations of $n$ elements taken $r$ at a time
$$C(n, r) = \binom{n}{r} = \frac{n!}{r!\,(n - r)!}$$
Combinations with repetition
$$C^{R}(n, r) = \binom{n + r - 1}{r}$$
Binomial Theorem
Binomial expansion
$$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k}\, a^{n-k}\, b^{k}$$
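A short sketch using Python's standard library (`math.factorial`, `math.perm`, `math.comb`, available since Python 3.8) to evaluate these counts; the values n = 5, r = 3 are illustrative:

```python
import math

n, r = 5, 3

print(math.factorial(n))        # permutations of n distinct elements: 5! = 120
print(math.perm(n, r))          # P(5, 3) = 5!/2! = 60
print(math.comb(n, r))          # C(5, 3) = 10
print(math.comb(n + r - 1, r))  # combinations with repetition: C(7, 3) = 35

# Binomial theorem: expand (a + b)^n term by term and compare with direct evaluation
a, b = 2, 3
expansion = sum(math.comb(n, k) * a**(n - k) * b**k for k in range(n + 1))
print(expansion == (a + b)**n)  # True
```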
Bernoulli Trials
A Bernoulli trial is a random experiment that has the following characteristics:
- Discrete outcomes: it has only two possible outcomes, typically called success (usually represented by 1) and failure (represented by 0).
- Constant probability: the probability of success $p$ is constant in each trial; consequently, the probability of failure is $q = 1 - p$.
- Independence: the trials are independent, meaning the outcome of one trial does not affect the outcome of another.
Examples
- Tossing a coin, where “heads” can be considered a success and “tails” a failure
- Taking an exam, where “passing” is considered a success and “failing” a failure
- Measuring the effectiveness of a medical treatment in which the outcome can be “effective” (success) or “ineffective” (failure)
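A tiny simulation sketch of repeated Bernoulli trials, assuming NumPy; the success probability p = 0.3 and the seed are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
p = 0.3                             # probability of success (illustrative value)
trials = rng.random(10_000) < p     # True = success (1), False = failure (0)

print(trials[:10].astype(int))      # outcomes of the first ten independent trials
print(trials.mean())                # empirical success rate, close to p
```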
Probability Distributions
Discrete Distribution
Discrete distributions describe the probabilities of variables that can take only specific, countable values, such as integers.
Binomial Distribution
Models the number of successes in a sequence of independent trials, each with the same probability of success.
$$P(X = k) = \binom{n}{k}\, p^{k} (1 - p)^{n-k}$$
where:
$n$ = number of trials, $k$ = number of successes, $p$ = probability of success in a single trial, $\binom{n}{k}$ = binomial coefficient
Negative Binomial Distribution
Models the number of failures before achieving a fixed number of successes in Bernoulli trials.
$$P(X = k) = \binom{k + r - 1}{k}\, p^{r} (1 - p)^{k}$$
where:
$r$ is the number of successes required, $p$ is the probability of success, $k$ is the number of failures
Geometric Distribution
Models the number of trials until the first success in a sequence of Bernoulli trials.
$$P(X = k) = (1 - p)^{k - 1}\, p$$
where:
$p$ is the probability of success, $k$ is the number of trials until the first success
Hypergeometric Distribution
Models the number of successes in a fixed-size sample drawn without replacement from a finite population.
$$P(X = k) = \frac{\binom{K}{k} \binom{N - K}{n - k}}{\binom{N}{n}}$$
where:
$N$ is the population size, $K$ is the number of successes in the population, $n$ is the sample size, $k$ is the number of successes in the sample
Poisson Distribution
Models the number of events that occur in a fixed interval of time or space when events occur with a constant average rate.
$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$
where:
$\lambda$ is the average rate of occurrence of the events, $k$ is the number of events
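The PMFs above can be evaluated numerically with `scipy.stats`; a sketch, assuming SciPy is installed and using illustrative parameter values:

```python
from scipy import stats

# Binomial: P(X = 3) for n = 10 trials, p = 0.5
print(stats.binom.pmf(3, n=10, p=0.5))

# Negative binomial: P(3 failures before the 5th success), p = 0.5
print(stats.nbinom.pmf(3, n=5, p=0.5))

# Geometric: P(first success on trial 4), p = 0.2
print(stats.geom.pmf(4, p=0.2))

# Hypergeometric: P(k = 2 successes) in a sample of N = 5
# from a population of M = 20 containing n = 7 successes
print(stats.hypergeom.pmf(2, M=20, n=7, N=5))

# Poisson: P(X = 4) with average rate lambda = 3
print(stats.poisson.pmf(4, mu=3))
```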
Continuous Distribution
Normal (Gaussian) Distribution
Models many variables in nature and society. It is symmetric and bell-shaped.
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}$$
where:
$\mu$ is the mean, $\sigma^{2}$ is the variance
Cumulative Distribution Function of the Normal (CDF)
$$F(x) = P(X \le x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(t - \mu)^{2}}{2\sigma^{2}}}\, dt$$
It has no closed-form expression and is evaluated numerically or from tables.
Exponential Distribution
Models the time between events in a Poisson process. It is used in reliability theory and waiting times.
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$
where:
$\lambda$ is the rate of occurrence of the events.
Uniform Distribution
Models a variable that has the same probability of taking any value within a defined interval.
$$f(x) = \frac{1}{b - a}, \quad a \le x \le b$$
where:
$a$ and $b$ are the limits of the interval.
Gamma Distribution
Generalizes the exponential distribution. Models the time until $\alpha$ events occur in a Poisson process.
$$f(x) = \frac{\beta^{\alpha} x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)}, \quad x > 0$$
where:
$\alpha$ is a shape parameter, $\beta$ is a rate parameter
Beta Distribution
Primarily used in Bayesian statistics to model probability distributions in proportions or probabilities.
$$f(x) = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}, \quad 0 \le x \le 1$$
where:
$\alpha$ and $\beta$ are shape parameters, $B(\alpha, \beta)$ is the beta function
Cauchy Distribution
Models phenomena where the mean and variance are undefined or infinite.
$$f(x) = \frac{1}{\pi \gamma \left[1 + \left(\frac{x - x_0}{\gamma}\right)^{2}\right]}$$
where:
$x_0$ is the location parameter, $\gamma$ is the scale parameter
Student’s t-distribution
Used to estimate the mean of a normally distributed population when the sample size is small and the variance is unknown.
$$f(t) = \frac{\Gamma\!\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu \pi}\, \Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^{2}}{\nu}\right)^{-\frac{\nu + 1}{2}}$$
where:
$\nu$ is the degrees of freedom, which determine the shape of the distribution. As $\nu$ increases, the t-distribution converges to a standard normal distribution.
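Similarly, the continuous densities and CDFs can be evaluated with `scipy.stats`; a sketch with illustrative parameters (note that SciPy parameterizes the exponential and gamma distributions by scale = 1/rate):

```python
from scipy import stats

# Normal: density and CDF at x = 1.0 for mu = 0, sigma = 1
print(stats.norm.pdf(1.0, loc=0, scale=1))
print(stats.norm.cdf(1.0, loc=0, scale=1))     # ~0.8413

# Exponential with rate lambda = 2 (scale = 1/lambda)
print(stats.expon.pdf(0.5, scale=1/2))

# Uniform on [a, b] = [2, 5] (loc = a, scale = b - a)
print(stats.uniform.pdf(3.0, loc=2, scale=3))  # 1/(b - a) = 1/3

# Gamma with shape alpha = 2 and rate beta = 3 (scale = 1/beta)
print(stats.gamma.pdf(1.0, a=2, scale=1/3))

# Student's t with nu = 10 degrees of freedom
print(stats.t.pdf(0.0, df=10))
```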
Correlation and Regression
Correlation
Pearson Correlation Coefficient ($r$)
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}\, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}}$$
Linear Regression
Regression Line Equation
$$\hat{y} = a + b x$$
where:
$b$ = slope, $a$ = intercept
Slope
$$b = r\, \frac{s_y}{s_x}$$
where:
$s_x$ and $s_y$ are the standard deviations of $x$ and $y$, respectively.
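A sketch of both computations using `scipy.stats.linregress`, which returns the Pearson correlation, slope, and intercept together; the data below is illustrative:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])   # roughly y = 2x

result = stats.linregress(x, y)
print(result.rvalue)       # Pearson correlation coefficient r
print(result.slope)        # slope b
print(result.intercept)    # intercept a

# The slope formula from above, computed by hand for comparison
r = np.corrcoef(x, y)[0, 1]
print(r * np.std(y, ddof=1) / np.std(x, ddof=1))   # matches result.slope
```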
Statistical Inference
Parameter Estimation
Point Estimation (sample mean)
$$\hat{\mu} = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Confidence Intervals
Confidence Interval for the Mean
$$\bar{x} \pm z_{\alpha/2}\, \frac{\sigma}{\sqrt{n}}$$
where $z_{\alpha/2}$ is the critical value of the standard normal distribution for a confidence level of $1 - \alpha$, $\sigma$ is the population standard deviation, and $n$ is the sample size.
Hypothesis Testing
Hypothesis Test (p-value)
- If $H_0$ is the null hypothesis and $H_1$ is the alternative hypothesis, the p-value is the probability of obtaining a result at least as extreme as the one observed, assuming $H_0$ is true. Reject $H_0$ if the p-value is less than the significance level $\alpha$.
Test Statistic for the Mean
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
where $\mu_0$ is the hypothesized population mean (use $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ with $n - 1$ degrees of freedom when $\sigma$ is unknown).
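A sketch of the confidence interval and a one-sample test, assuming SciPy; since the population standard deviation is unknown here, the t-based versions of the formulas are used, and the data and hypothesized mean mu_0 = 50 are illustrative:

```python
import numpy as np
from scipy import stats

data = np.array([48.2, 51.3, 49.8, 50.9, 47.5, 52.1, 50.4, 49.0])
mu_0 = 50.0                       # hypothesized population mean

n = len(data)
x_bar = data.mean()
s = data.std(ddof=1)              # sample standard deviation

# 95% confidence interval for the mean (t critical value, sigma unknown)
t_crit = stats.t.ppf(0.975, df=n - 1)
ci = (x_bar - t_crit * s / np.sqrt(n), x_bar + t_crit * s / np.sqrt(n))
print(ci)

# Test statistic and p-value for H0: mu = mu_0 vs H1: mu != mu_0
t_stat, p_value = stats.ttest_1samp(data, popmean=mu_0)
print(t_stat, p_value)            # reject H0 if p_value < alpha
```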