A probability space is a triplet $(\Omega, \mathcal{F}, P)$ that models our random experiment by means of a probability measure $P$ defined on subsets of the sample space $\Omega$ belonging to the $\sigma$-algebra $\mathcal{F}$.
Random variables provide a formalism to map the outcomes of a random experiment to numerical values.
A random variable is a mapping from the sample space to the real numbers: $X : \Omega \to \mathbb{R}$. Each element $\omega$ of the sample space is assigned a numerical value $X(\omega)$. The generic or unknown outcome of an experiment is denoted by $X$.
Random variables with a finite set of possible outcomes are called simple. Those whose set of outcomes is countable are discrete. Otherwise, they are continuous.
1. Induced Probability
Using the probability measure $P$ defined on $\Omega$, we may obtain a new probability function $P_X$ for the random variable $X$ in $\mathbb{R}$ using the following procedure: let $B$ be a subset of $\mathbb{R}$, then $P_X(B) = P(\{\omega \in \Omega : X(\omega) \in B\})$.
The image of $\Omega$ under $X$ is called the support of $X$: $X(\Omega) = \{X(\omega) : \omega \in \Omega\}$. $P_X$ is defined on subsets of $X(\Omega)$.
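To make the induced-probability construction concrete, here is a minimal Python sketch; the fair-die setup and all names in it are illustrative assumptions, not part of the formal development:

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair die.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}      # probability measure on Omega

# Random variable X: indicator that the roll is even.
X = lambda w: 1 if w % 2 == 0 else 0

# Induced probability: P_X(B) = P({w in Omega : X(w) in B}).
def P_X(B):
    return sum(P[w] for w in omega if X(w) in B)

print(P_X({1}))                 # 1/2
print({X(w) for w in omega})    # support of X: {0, 1}
```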
2. Cumulative Distribution Function
The cumulative distribution function of a random variable $X$, written $F_X(x)$, is the probability that $X$ takes a value less than or equal to $x$: $F_X(x) = P(X \le x)$.
It offers an alternative way to describe the probability measure of $X$, enabling a unified treatment of discrete and continuous random variables. For any random variable $X$, $F_X$ is right continuous, meaning that if $x_n$ is a decreasing sequence of real numbers with $x_n \to x$, then $F_X(x_n) \to F_X(x)$. To check whether a CDF is valid, we need to verify:
Monotonicity: $x_1 \le x_2 \implies F_X(x_1) \le F_X(x_2)$.
$\lim_{x \to -\infty} F_X(x) = 0$, $\lim_{x \to +\infty} F_X(x) = 1$.
$F_X$ is right continuous.
Note that the first two conditions imply $0 \le F_X(x) \le 1$.
For finite intervals it is possible to check that $P(a < X \le b) = F_X(b) - F_X(a)$. This can be done after noting that the event $\{X \le b\}$ may be written as the disjoint union $\{X \le a\} \cup \{a < X \le b\}$.
3. Discrete Random Variables
A random variable is discrete if it can only take a countable number of possible values: $X$ is discrete $\iff X(\Omega)$ is countable.
Suppose that $X$ is discrete, and $X(\Omega) = \{x_1, x_2, \ldots\}$ is ordered such that $x_1 < x_2 < \cdots$. The subset $\{\omega : X(\omega) \le x\}$ is thus constant as we increase $x$ in an interval $[x_i, x_{i+1})$. Once we reach $x_{i+1}$, the subset grows to include the outcomes that map to $x_{i+1}$. Thus, $F_X$ will be a monotonically increasing step function with vertical jumps at the points in $X(\Omega)$: $F_X(x) = \sum_{x_i \le x} P(X = x_i)$.
For a DRV $X$ and $x \in \mathbb{R}$, we define the probability mass function as $p_X(x) = P(X = x)$. If $X$ can take values $\{x_1, x_2, \ldots\}$, then $p_X(x_i) \ge 0$ for all $i$, and $\sum_i p_X(x_i) = 1$.
The PMF and CDF of a DRV fully characterise its probability distribution. Indeed, we can derive each from the other as:
$F_X(x) = \sum_{x_i \le x} p_X(x_i)$.
$p_X(x_i) = F_X(x_i) - F_X(x_{i-1})$ (taking $F_X(x_0) = 0$).
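As a quick sanity check of this equivalence, the sketch below (a hypothetical loaded-die PMF, chosen purely for illustration) builds the CDF from a PMF and then recovers the PMF from the CDF's jumps:

```python
# Hypothetical PMF of a loaded die (probabilities sum to 1).
pmf = {1: 0.1, 2: 0.1, 3: 0.2, 4: 0.2, 5: 0.2, 6: 0.2}

# CDF from the PMF: F(x) = sum of p(x_i) over x_i <= x.
def cdf(x):
    return sum(p for xi, p in pmf.items() if xi <= x)

# PMF from the CDF's jumps: p(x_i) = F(x_i) - F(x_{i-1}).
xs = sorted(pmf)
recovered = {xi: cdf(xi) - (cdf(prev) if prev is not None else 0.0)
             for prev, xi in zip([None] + xs[:-1], xs)}
print(all(abs(recovered[x] - pmf[x]) < 1e-12 for x in xs))  # True
```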
3.1 Expectation
For a DRV $X$ we define the expectation of $X$ as $E[X] = \sum_i x_i\, p_X(x_i)$. This is the weighted average of the possible values of $X$, or the mean of the distribution of $X$.
More generally, for a measurable function $g : \mathbb{R} \to \mathbb{R}$, we notice that $Y = g(X)$ (where $Y(\omega) = g(X(\omega))$) is also a random variable, where $E[g(X)] = \sum_i g(x_i)\, p_X(x_i)$.
For some linear function $g(X) = aX + b$: $E[aX + b] = a\,E[X] + b$.
3.2 Moments
The expectation of the function $g(X) = X^k$ gives us the $k$th moment of $X$, $E[X^k]$. The $k$th central moment $E[(X - E[X])^k]$ is similarly defined, but re-centered to characterise the deviation from the mean. The variance of $X$ (denoted $\operatorname{Var}(X)$ or $\sigma_X^2$) is the second central moment of $X$, a measure of the variability of $X$ around its mean:
$\operatorname{Var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2$.
Corresponding to the linearity of expectation, we have $\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$.
The standard deviation of a random variable $X$, written as $\sigma_X$ or $\operatorname{SD}(X)$, is the square root of the variance: $\sigma_X = \sqrt{\operatorname{Var}(X)}$.
The skewness of a DRV is a measure of its asymmetry. It is expressed as a standardised moment. Commonly, $\mu$ denotes $E[X]$ and $\sigma$ denotes $\sigma_X$, so $\operatorname{Skew}(X) = E\!\left[\left(\frac{X - \mu}{\sigma}\right)^3\right] = \frac{E[(X - \mu)^3]}{\sigma^3}$.
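The following sketch (a fair die, again an illustrative choice) computes the mean, variance, and skewness directly from the PMF, and also checks the linearity rule $E[aX + b] = a\,E[X] + b$ from Section 3.1:

```python
# Moments of a fair die computed directly from its PMF.
pmf = {x: 1/6 for x in range(1, 7)}

def expect(g):
    # E[g(X)] = sum_i g(x_i) p(x_i)
    return sum(g(x) * p for x, p in pmf.items())

mu = expect(lambda x: x)                            # 3.5
var = expect(lambda x: (x - mu) ** 2)               # 35/12 ~ 2.917
sigma = var ** 0.5
skew = expect(lambda x: ((x - mu) / sigma) ** 3)    # 0.0: symmetric PMF
print(mu, var, skew)
print(expect(lambda x: 2 * x + 1), 2 * mu + 1)      # linearity: both 8.0
```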
3.3 Sum of Random Variables
Let $X_1, X_2, \ldots, X_n$ be random variables, with possibly different distributions and not necessarily independent. Let $S_n = \sum_{i=1}^n X_i$ be their sum, and $\bar{X} = S_n / n$ be their average. Then:
$E[S_n] = \sum_{i=1}^n E[X_i]$.
$E[\bar{X}] = \frac{1}{n} \sum_{i=1}^n E[X_i]$.
If the variables are independent, then:
$\operatorname{Var}(S_n) = \sum_{i=1}^n \operatorname{Var}(X_i)$.
$\operatorname{Var}(\bar{X}) = \frac{1}{n^2} \sum_{i=1}^n \operatorname{Var}(X_i)$.
If the variables are independent and identically distributed with mean $\mu$ and variance $\sigma^2$, then:
$E[S_n] = n\mu$ and $\operatorname{Var}(S_n) = n\sigma^2$.
$E[\bar{X}] = \mu$ and $\operatorname{Var}(\bar{X}) = \sigma^2 / n$.
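A quick Monte Carlo check of the i.i.d. case, using U(0,1) variables (for which $\sigma^2 = 1/12$; an illustrative choice):

```python
import random, statistics

# For i.i.d. X_i ~ U(0,1): Var(S_n) should be ~ n/12, Var(Xbar) ~ 1/(12n).
n, reps = 10, 100_000
sums, means = [], []
for _ in range(reps):
    xs = [random.random() for _ in range(n)]
    sums.append(sum(xs))
    means.append(sum(xs) / n)

print(statistics.variance(sums))   # ~ 0.833 = 10/12
print(statistics.variance(means))  # ~ 0.00833 = 1/120
```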
3.4 Bernoulli Distribution
Consider an experiment with only two possible outcomes, encoded as a random variable $X$ taking value 1 with probability $p$, and 0 with probability $1 - p$. Then, we say $X \sim \text{Bernoulli}(p)$, with the PMF:
$p_X(x) = p^x (1 - p)^{1 - x}$ for $x \in \{0, 1\}$.
Using the formulae for mean and variance, it follows that:
$E[X] = p$, $\operatorname{Var}(X) = E[X^2] - E[X]^2 = p - p^2 = p(1 - p)$.
3.5 Binomial Distribution
Consider $n$ identical and independent trials $X_1, \ldots, X_n \sim \text{Bernoulli}(p)$. Let $Y$ be the total number of 1s observed in the $n$ trials. Then, we say $Y$ takes values in $\{0, 1, \ldots, n\}$ and $Y \sim \text{Bin}(n, p)$, with the PMF:
$p_Y(k) = \binom{n}{k} p^k (1 - p)^{n - k}$.
Remembering that $Y$ is a sum of $n$ i.i.d. Bernoulli$(p)$ variables, we can derive the mean and variance of $Y$:
$E[Y] = np$, $\operatorname{Var}(Y) = np(1 - p)$.
Similarly, the skewness is:
$\operatorname{Skew}(Y) = \frac{1 - 2p}{\sqrt{np(1 - p)}}$.
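A short numerical check of these formulae, with illustrative parameters $n = 10$ and $p = 0.3$:

```python
from math import comb

n, p = 10, 0.3

# Binomial PMF: P(Y = k) = C(n, k) p^k (1-p)^(n-k).
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
print(sum(pmf))                                   # 1.0

# Moments computed from the PMF agree with np and np(1-p).
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
print(mean, n * p)                                # 3.0  3.0
print(var, n * p * (1 - p))                       # 2.1  2.1
```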
3.6 Geometric Distribution
Consider a potentially infinite sequence of independent random variables $X_1, X_2, \ldots \sim \text{Bernoulli}(p)$. Let $Y$ be the index of the first successful trial: $Y = \min\{i : X_i = 1\}$. Then, we say $Y$ takes values in $\{1, 2, \ldots\}$ and $Y \sim \text{Geom}(p)$, with the PMF:
$p_Y(k) = (1 - p)^{k - 1}\, p$.
The mean and variance of $Y$ are:
$E[Y] = \frac{1}{p}$, $\operatorname{Var}(Y) = \frac{1 - p}{p^2}$.
The skewness is:
$\operatorname{Skew}(Y) = \frac{2 - p}{\sqrt{1 - p}}$.
3.7 Poisson Distribution
The Poisson distribution is concerned with the number of random events occurring per unit of time or space. Let $X$ be a random variable on $\{0, 1, 2, \ldots\}$. Then $X \sim \text{Poisson}(\lambda)$ for some $\lambda > 0$, with the PMF:
$p_X(k) = \frac{\lambda^k e^{-\lambda}}{k!}$.
The Poisson distribution has equal mean and variance:
$E[X] = \operatorname{Var}(X) = \lambda$.
Poisson Non-Unit Intervals
In this case, $\lambda t$ can be used instead of $\lambda$. Thus, $\lambda$ is the rate at which events occur and $\lambda t$ is the mean number of events that occur in an interval of length $t$.
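A small numerical illustration (rate $\lambda = 2$ per unit time over an interval of length $t = 3$; the parameters are arbitrary): the Poisson($\lambda t$) PMF should sum to 1 and have mean $\lambda t = 6$.

```python
from math import exp, factorial

lam, t = 2.0, 3.0          # illustrative rate and interval length
mean = lam * t             # expected number of events in [0, t]

# Poisson PMF with parameter lam*t, truncated at k = 49 (tail is negligible).
def pois_pmf(k):
    return mean**k * exp(-mean) / factorial(k)

probs = [pois_pmf(k) for k in range(50)]
print(sum(probs))                                  # ~ 1.0
print(sum(k * pk for k, pk in enumerate(probs)))   # ~ 6.0 = lam * t
```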
3.8 Discrete Uniform Distribution
Let $X$ be a random variable on $\{1, 2, \ldots, n\}$, with all outcomes equally likely. Then $X$ is said to follow a discrete uniform distribution, or $X \sim \text{U}\{1, n\}$, with the PMF:
$p_X(x) = \frac{1}{n}$.
The mean and variance are:
$E[X] = \frac{n + 1}{2}$, $\operatorname{Var}(X) = \frac{n^2 - 1}{12}$.
4. Continuous Random Variables
Suppose we have a random experiment with sample space $\Omega$ and probability measure $P$. For any random variable $X$ we have defined the induced probability $P_X$. We define the random variable $X$ to be continuous if there exists a non-negative function $f_X$ such that:
$P_X(B) = \int_B f_X(x)\,dx$ for all $B \subseteq \mathbb{R}$.
In which case, $f_X$ is the probability density function of $X$.
We can calculate the probability that a CRV lies in $[x_1, x_2]$ as: $P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f_X(x)\,dx$. Hence:
In the limit $x_2 \to x_1$, we get $P(X = x_1) = \int_{x_1}^{x_1} f_X(x)\,dx = 0$.
Hence, for any elementary event $\{X = x\}$, the probability is zero. Or, $P(X = x) = 0$ for all $x \in \mathbb{R}$.
Hence, any countable subset of $\mathbb{R}$ will have a zero probability measure.
Hence, the support of a CRV must be uncountable, otherwise the probabilities could not sum to one.
As the CDF is an integral of the PDF, $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$, we can use the fundamental theorem of calculus to state $f_X(x) = \frac{d}{dx} F_X(x)$.
The PDF will always be:
Non-negative, as it is the derivative of a non-decreasing CDF.
Normalised: $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.
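To see both properties numerically, this sketch uses an Exp(1.5) density (an arbitrary test case) and checks the integral and the fundamental-theorem relation $f_X = \frac{d}{dx}F_X$ by finite differences:

```python
from math import exp

# Test case: X ~ Exp(1.5) with f(x) = lam*e^(-lam*x), F(x) = 1 - e^(-lam*x).
lam = 1.5
f = lambda x: lam * exp(-lam * x)
F = lambda x: 1 - exp(-lam * x)

# The PDF integrates to ~1 over its support (Riemann sum up to x = 20).
dx = 1e-4
print(sum(f(i * dx) * dx for i in range(200_000)))   # ~ 1.0

# Fundamental theorem of calculus: (F(x+h) - F(x))/h ~ f(x).
x, h = 0.7, 1e-6
print((F(x + h) - F(x)) / h, f(x))                   # both ~ 0.525
```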
4.1 Mean, Variance & Quantiles
For a CRV $X$ we can define the mean or expectation as $E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\,dx$. More generally, we can say $E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx$ for some measurable function $g$. For CRVs, we again have linearity and additivity of expectation:
$E[aX + b] = a\,E[X] + b$.
$E[X + Y] = E[X] + E[Y]$.
The variance of a CRV is given by $\operatorname{Var}(X) = \int_{-\infty}^{\infty} (x - E[X])^2 f_X(x)\,dx$.
Again, it is easy to show that $\operatorname{Var}(X) = E[X^2] - E[X]^2$.
We also have $\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$.
The lower quartile, median, and upper quartile of a sample of data are defined as the points $\frac{1}{4}$-, $\frac{1}{2}$-, and $\frac{3}{4}$-way through the ordered dataset, respectively. For a CRV $X$, we define the $\alpha$-quantile $q_\alpha$ for $\alpha \in (0, 1)$ as the least number satisfying $F_X(q_\alpha) \ge \alpha$, or:
$q_\alpha = F_X^{-1}(\alpha)$ when $F_X$ is invertible.
For example, the median of a CRV $X$ is the solution to $F_X(m) = \frac{1}{2}$.
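For a distribution whose CDF inverts in closed form, quantiles can be computed directly. A sketch for Exp($\lambda$), where $F_X(x) = 1 - e^{-\lambda x}$ gives $q_\alpha = -\ln(1 - \alpha)/\lambda$ (the rate is chosen arbitrarily):

```python
from math import log

# Quantiles of X ~ Exp(lam) via the inverse CDF.
lam = 2.0
q = lambda alpha: -log(1 - alpha) / lam

print(q(0.5))            # median = ln(2)/lam ~ 0.3466
print(q(0.25), q(0.75))  # lower and upper quartiles
```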
4.2 Continuous Uniform Distribution
A CRV $X$ with range $[a, b]$ has a uniform distribution on $[a, b]$, written $X \sim \text{U}(a, b)$, if its probability is spread evenly over $[a, b]$, with PDF and CDF:
$f_X(x) = \frac{1}{b - a}$ for $a \le x \le b$ (and 0 otherwise), $F_X(x) = \frac{x - a}{b - a}$ for $a \le x \le b$.
Suppose $x < a$. Then $F_X(x) = 0$, as no probability lies below $a$. Now suppose $a \le x \le b$. Hence, $F_X(x) = \int_a^x \frac{1}{b - a}\,dt = \frac{x - a}{b - a}$, and $F_X(x) = 1$ for $x > b$.
The mean and variance are given by:
$E[X] = \frac{a + b}{2}$, $\operatorname{Var}(X) = \frac{(b - a)^2}{12}$.
4.3 Exponential Distribution
The CRV $X$ is exponentially distributed with rate $\lambda > 0$ if $X \sim \text{Exp}(\lambda)$, with PDF and CDF (for $x \ge 0$):
$f_X(x) = \lambda e^{-\lambda x}$, $F_X(x) = 1 - e^{-\lambda x}$.
The mean and variance are:
$E[X] = \frac{1}{\lambda}$, $\operatorname{Var}(X) = \frac{1}{\lambda^2}$.
4.3.1 Memoryless Property
The complementary cumulative distribution function of an exponential distribution is $P(X > x) = e^{-\lambda x}$. The memoryless property of the exponential distribution states that for any $s, t \ge 0$: $P(X > s + t \mid X > s) = P(X > t)$.
When $X$ models time, this is the distribution of the residual time before the event occurs. We acknowledge that we have waited time $s$ for the event, but this tells us nothing about how much longer we will have to wait; the process has no memory.
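An empirical check of memorylessness (with illustrative values $\lambda = 1$, $s = 2$, $t = 1$): the conditional and unconditional survival probabilities should both approach $e^{-1} \approx 0.368$.

```python
import random

# Estimate P(X > s + t | X > s) and P(X > t) for X ~ Exp(lam).
lam, s, t, n = 1.0, 2.0, 1.0, 1_000_000
samples = [random.expovariate(lam) for _ in range(n)]

survived_s = [x for x in samples if x > s]
cond = sum(x > s + t for x in survived_s) / len(survived_s)
uncond = sum(x > t for x in samples) / n
print(cond, uncond)   # both ~ exp(-1) ~ 0.3679
```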
4.3.2 Link with Poisson
If events occur in a random process according to a Poisson distribution with rate $\lambda$, then the time between events is exponentially distributed with rate $\lambda$.
Suppose we have a random event process for which the number of events occurring in $[0, t]$, denoted $N_t$, is modelled by $N_t \sim \text{Poisson}(\lambda t)$. This is a homogeneous Poisson process. Then:
Let $T$ be the time until the first event occurs.
$P(T > t) = P(N_t = 0) = e^{-\lambda t}$.
Hence, $F_T(t) = 1 - e^{-\lambda t}$, and $T \sim \text{Exp}(\lambda)$.
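The link can be checked by simulation: build a process from Exp($\lambda$) inter-arrival times and verify that the count of events in a unit interval behaves like Poisson($\lambda$), with equal mean and variance (the rate $\lambda = 4$ is arbitrary):

```python
import random

# Simulate event times as cumulative Exp(lam) gaps; count events in [0, 1].
lam, reps = 4.0, 100_000
counts = []
for _ in range(reps):
    t, k = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > 1.0:
            break
        k += 1
    counts.append(k)

mean = sum(counts) / reps
var = sum((c - mean) ** 2 for c in counts) / (reps - 1)
print(mean, var)   # both ~ lam = 4.0
```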
4.4 Normal Distribution
A normal or Gaussian random variable with range $\mathbb{R}$, with mean $\mu$ and variance $\sigma^2$, is denoted $X \sim N(\mu, \sigma^2)$, with PDF and CDF:
$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$, $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$ (which has no closed form).
4.4.1 Standard Normal
By setting $\mu = 0$ and $\sigma^2 = 1$, we get the standard normal random variable $Z \sim N(0, 1)$, with simplified PDF and CDF:
$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$, $\Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt$.
Now, suppose $X \sim N(\mu, \sigma^2)$. Then, $\frac{X - \mu}{\sigma} \sim N(0, 1)$. This allows us to standardise any normal random variable:
$P(X \le x) = P\!\left(\frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma}\right) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)$.
We can also relate the CDF of a normal distribution to the CDF of a standard normal distribution with $F_X(x) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)$. Hence:
$P(a < X \le b) = \Phi\!\left(\frac{b - \mu}{\sigma}\right) - \Phi\!\left(\frac{a - \mu}{\sigma}\right)$.
To look up values of $\Phi(z)$, we use a table. To help with negative values, we use the symmetry of the normal distribution: $\phi(-z) = \phi(z)$. Hence $\Phi(-z) = 1 - \Phi(z)$.
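In code, $\Phi$ can be evaluated via the error function rather than a table: $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. A sketch with illustrative parameters $\mu = 10$, $\sigma = 2$:

```python
from math import erf, sqrt

# Standard normal CDF via the error function.
Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))

# Standardising X ~ N(10, 4): P(X <= 13) = Phi((13 - 10)/2) = Phi(1.5).
mu, sigma = 10.0, 2.0
print(Phi((13 - mu) / sigma))    # ~ 0.9332
print(Phi(-1.5), 1 - Phi(1.5))   # symmetry: both ~ 0.0668
```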
4.5 Lognormal Distribution
Suppose $X \sim N(\mu, \sigma^2)$, and consider the transformation $Y = e^X$. Then $Y$ has a lognormal distribution with PDF (for $y > 0$):
$f_Y(y) = \frac{1}{y\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(\ln y - \mu)^2}{2\sigma^2}\right)$.
4.6 Moment Generating Function
The moment generating function of a CRV $X$ is defined as:
$M_X(t) = E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f_X(x)\,dx$.
The MGF can also be defined for a DRV as:
$M_X(t) = E[e^{tX}] = \sum_i e^{t x_i}\, p_X(x_i)$.
Assuming differentiation inside the expectation operator is valid, the MGF provides an alternative way to obtain moments:
$E[X^n] = \frac{d^n}{dt^n} M_X(t) \Big|_{t=0}$.
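As a numerical illustration, we can differentiate the MGF of an Exp($\lambda$) variable, $M_X(t) = \frac{\lambda}{\lambda - t}$ for $t < \lambda$, at $t = 0$ using finite differences ($\lambda = 2$ is an arbitrary choice):

```python
# Recover E[X] and E[X^2] from the MGF of X ~ Exp(2) by central differences.
lam = 2.0
M = lambda t: lam / (lam - t)

h = 1e-5
m1 = (M(h) - M(-h)) / (2 * h)            # E[X]   = 1/lam   = 0.5
m2 = (M(h) - 2 * M(0) + M(-h)) / h**2    # E[X^2] = 2/lam^2 = 0.5
print(m1, m2)
print(m2 - m1**2)                        # Var(X) = 1/lam^2 = 0.25
```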
4.6.1 Characteristic Function
The characteristic function is a modification of the MGF that exists for all random variables. It is defined as:
$\varphi_X(t) = E[e^{itX}]$.
4.6.2 Combining Random Variables
Consider $Z = X + Y$ where $X$ and $Y$ are two independent random variables. Soon, we will be able to show that $E[XY] = E[X]\,E[Y]$. This result generalises as $E[g(X)\,h(Y)] = E[g(X)]\,E[h(Y)]$.
A consequence of this is that the MGF of the sum of independent random variables is the product of their MGFs:
$M_{X+Y}(t) = E[e^{t(X+Y)}] = E[e^{tX}\, e^{tY}] = M_X(t)\, M_Y(t)$.
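A Monte Carlo check of the product rule, pairing an Exp(1) with an independent N(0,1) variable at $t = 0.3$ (both choices are arbitrary illustrations):

```python
import random
from math import exp

# Estimate M_{X+Y}(t) and M_X(t) * M_Y(t) for independent X, Y.
n, t = 200_000, 0.3
xs = [random.expovariate(1.0) for _ in range(n)]   # X ~ Exp(1)
ys = [random.gauss(0.0, 1.0) for _ in range(n)]    # Y ~ N(0, 1)

mgf = lambda zs: sum(exp(t * z) for z in zs) / len(zs)
sums = [x + y for x, y in zip(xs, ys)]
print(mgf(sums))           # ~ 1.494
print(mgf(xs) * mgf(ys))   # ~ (1/0.7) * exp(0.045) ~ 1.494
```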
5. Inequalities
The Markov inequality states that for any random variable $X$ that only takes non-negative values, and any $a > 0$:
$P(X \ge a) \le \frac{E[X]}{a}$.
The Chebyshev inequality states that if $X$ is a random variable with mean $\mu$ and variance $\sigma^2$, then for any $c > 0$:
$P(|X - \mu| \ge c) \le \frac{\sigma^2}{c^2}$.
The Chebyshev inequality can be proven using the Markov inequality by defining a new random variable $Y = (X - \mu)^2$ and applying the Markov inequality to $Y$ with $a = c^2$.
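An empirical check of the Chebyshev bound for $X \sim \text{U}(0, 1)$, where $\mu = \frac{1}{2}$ and $\sigma^2 = \frac{1}{12}$ (the choice of $c = 0.4$ is arbitrary):

```python
import random

# Compare P(|X - mu| >= c) with the Chebyshev bound sigma^2 / c^2.
mu, var, c, n = 0.5, 1 / 12, 0.4, 1_000_000
samples = [random.random() for _ in range(n)]

empirical = sum(abs(x - mu) >= c for x in samples) / n
print(empirical)    # exact value is 0.2
print(var / c**2)   # bound ~ 0.521, which indeed dominates
```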