Mean, Variance, and Standard Deviation
To compute the population mean, all the observed values in the population are summed and divided by the number of observations in the population.
Variance and standard deviation provide a measure of the extent of the dispersion in the values of the random variable around the mean.
Discrete Random Variable
The mean of a population is expressed as:
μ = ΣXᵢ / N
where the sum is taken over all N observed values in the population.
Variance of a random variable is defined as:
Var(X) = E[(X − μ)²] = E(X²) − [E(X)]²
where μ = E(X). The square root of the variance is called the standard deviation.
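As a quick numerical sketch (Python with NumPy; the population values are made up for illustration), the definitions above can be applied directly:

```python
import numpy as np

# Hypothetical population of observed values (illustrative only).
population = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mu = population.sum() / population.size   # population mean: sum of values / N
var = np.mean((population - mu) ** 2)     # population variance: E[(X - mu)^2]
sigma = np.sqrt(var)                      # standard deviation

print(mu, var, sigma)  # 5.0 4.0 2.0
```

NumPy's np.mean, np.var, and np.std reproduce these results directly (np.var and np.std default to the population versions, i.e., dividing by N).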
Expected Value
Expected value is the weighted average of the possible outcomes of a random variable, where the weights are the probabilities that the outcomes will occur. The expectation of a random variable X having possible values x₁, …, xₙ is defined as:
E(X) = x₁P(X = x₁) + … + xₙP(X = xₙ)
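Applied to a fair six-sided die (a standard textbook example), the sketch below computes E(X) as a probability-weighted average and confirms that the two variance formulas from the previous section agree:

```python
import numpy as np

# Outcomes of a fair six-sided die and their probabilities.
x = np.array([1, 2, 3, 4, 5, 6], dtype=float)
p = np.full(6, 1 / 6)

e_x = np.sum(x * p)        # E(X)   = x1*P(X=x1) + ... + xn*P(X=xn)
e_x2 = np.sum(x**2 * p)    # E(X^2)

var_def = np.sum((x - e_x) ** 2 * p)  # Var(X) = E[(X - mu)^2]
var_alt = e_x2 - e_x**2               # Var(X) = E(X^2) - [E(X)]^2

print(e_x)               # 3.5
print(var_def, var_alt)  # both ~2.9167, confirming the identity
```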
Covariance and Correlation
Covariance measures the extent to which two random variables tend to be above and below their respective means for each joint realization. It can be calculated as:
Cov(X,Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y)
Correlation is a standardized measure of association between two random variables; it ranges in value from −1 to +1 and is equal to:
Corr(X,Y) = Cov(X,Y) / [σ(X) × σ(Y)]
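The sketch below (made-up joint realizations) computes both statistics from the definitions and cross-checks them against NumPy's built-ins:

```python
import numpy as np

# Hypothetical joint realizations of X and Y (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

cov = np.mean((x - x.mean()) * (y - y.mean()))  # E[(X - E(X))(Y - E(Y))]
corr = cov / (x.std() * y.std())                # standardized to [-1, +1]

print(cov, corr)  # 1.6 0.8

# np.cov defaults to the sample (n - 1) version, so pass ddof=0
# to match the population formula; np.corrcoef is scale-free.
print(np.cov(x, y, ddof=0)[0, 1], np.corrcoef(x, y)[0, 1])
```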
Relationship of Two Variables
If X and Y are any random variables, then:
E(X + Y) = E(X) + E(Y)
If X and Y are independent random variables, then:
Var(X + Y) = Var(X) + Var(Y)
Var(X − Y) = Var(X) + Var(Y)
If X and Y are NOT independent, then:
Var(X + Y) = Var(X) + Var(Y) + 2 × Cov(X,Y)
Var(X − Y) = Var(X) + Var(Y) − 2 × Cov(X,Y)
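These identities are algebraic, so they hold exactly even for sample moments. Below is a small check with simulated, deliberately dependent variables (the parameters and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

x = rng.normal(0.0, 1.0, n)
y = 0.5 * x + rng.normal(0.0, 1.0, n)  # y depends on x, so Cov(X,Y) != 0

cov = np.mean((x - x.mean()) * (y - y.mean()))

# Var(X + Y) = Var(X) + Var(Y) + 2Cov(X,Y)
print(np.var(x + y), np.var(x) + np.var(y) + 2 * cov)
# Var(X - Y) = Var(X) + Var(Y) - 2Cov(X,Y)
print(np.var(x - y), np.var(x) + np.var(y) - 2 * cov)
```

Both pairs of numbers agree to floating-point precision; dropping the covariance term would only be valid if X and Y were independent.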
The Four Central Moments of a Statistical Variable or Distribution
The shape of a probability distribution is characterized by its raw moments and central moments. The first raw moment is the mean of the distribution. The second central moment is the variance. The third central moment divided by the cube of the standard deviation measures the skewness of the distribution, and the fourth central moment divided by the fourth power of the standard deviation measures the kurtosis of the distribution.
Raw moments are measured relative to zero, with the variable raised to the appropriate power. The kth raw moment is the expected value of Rᵏ:
kth raw moment = E(Rᵏ)
The first raw moment is the mean of the distribution, which is the expected value of returns. Raw moments for k > 1 are not very useful for our purposes; however, central moments for k > 1 are important.
Central moments are measured relative to the mean (i.e., centered on the mean). The kth central moment is defined as:
kth central moment = E[(R − μ)ᵏ]
The second central moment is the variance of the distribution, which measures the dispersion of data.
The third central moment measures the departure from symmetry in the distribution. This moment will equal zero for a symmetric distribution (such as the normal distribution). The skewness statistic is the standardized third central moment. Skewness (sometimes called relative skewness) refers to the extent to which the distribution of data is not symmetric around its mean.
The fourth central moment measures the “tailedness” of the distribution. The kurtosis statistic is the standardized fourth central moment of the distribution. Kurtosis refers to how fat or thin the tails are in the data distribution.
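The sketch below computes the central moments from simulated returns (assumed standard normal, arbitrary seed) and standardizes the third and fourth to obtain skewness and kurtosis; scipy.stats reports the same statistics:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
r = rng.normal(0.0, 1.0, 500_000)  # simulated returns, assumed normal

mu, sigma = r.mean(), r.std()

m2 = np.mean((r - mu) ** 2)  # 2nd central moment = variance
m3 = np.mean((r - mu) ** 3)  # 3rd central moment
m4 = np.mean((r - mu) ** 4)  # 4th central moment

skew = m3 / sigma**3  # standardized 3rd central moment
kurt = m4 / sigma**4  # standardized 4th central moment

print(skew, kurt)  # close to 0 and 3 for a normal distribution
# scipy matches (its kurtosis is excess kurtosis unless fisher=False):
print(stats.skew(r), stats.kurtosis(r, fisher=False))
```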
Skewness and Kurtosis
Skewness:
Skewness describes the degree to which a distribution is nonsymmetric about its mean.
- A right-skewed distribution has positive skewness; its mean is greater than its median, which is greater than its mode.
- A left-skewed distribution has negative skewness; its mean is less than its median, which is less than its mode.
Kurtosis:
Kurtosis measures the probability of extreme outcomes.
- Excess kurtosis is measured relative to a normal distribution, which has a kurtosis of three (excess kurtosis = kurtosis − 3).
- Positive values of excess kurtosis indicate a leptokurtic distribution (fat tails).
- Negative values of excess kurtosis indicate a platykurtic distribution (thin tails).
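To make the fat-tail case concrete, here is a small simulation sketch (arbitrary seed and degrees of freedom) contrasting a normal distribution with a leptokurtic Student's t:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1_000_000

normal_draws = rng.normal(size=n)
t_draws = rng.standard_t(df=5, size=n)  # Student's t, 5 degrees of freedom

# stats.kurtosis returns EXCESS kurtosis (kurtosis - 3) by default.
print(stats.kurtosis(normal_draws))  # ~0: normal tails
print(stats.kurtosis(t_draws))       # positive (~6 for df=5): fat tails
```

The estimate for the t-distribution is noisy precisely because its tails are fat, but it is reliably positive.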
Just as covariance generalizes variance to two random variables, the higher central moments can be generalized to cross central moments: the third cross central moment is coskewness, and the fourth cross central moment is cokurtosis.
The Best Linear Unbiased Estimator
Desirable statistical properties of an estimator include unbiasedness (the expected value of the estimator equals the true parameter, so the sign of the estimation error is random), efficiency (lower sampling error than any other unbiased estimator), consistency (the variance of the sampling error decreases as the sample size increases), and linearity (the estimator is a linear function of the sample data). An estimator with all of these properties is the best linear unbiased estimator (BLUE).
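As an illustration of unbiasedness and consistency, the simulation sketch below (assumed normal population with made-up parameters) uses the sample mean, the classic example of a BLUE for the population mean:

```python
import numpy as np

rng = np.random.default_rng(1)
true_mu, true_sigma = 5.0, 2.0  # assumed population parameters

for n in (10, 100, 1000):
    # 10,000 samples of size n; one sample mean per sample.
    means = rng.normal(true_mu, true_sigma, size=(10_000, n)).mean(axis=1)
    # Unbiasedness: the sample means average out to the true mean.
    # Consistency: their spread shrinks with n (theory: sigma / sqrt(n)).
    print(n, means.mean(), means.std(), true_sigma / np.sqrt(n))
```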