Uncertainty
Types of Uncertainty
| Our knowledge \ Others' knowledge | Known | Unknown |
|---|---|---|
| Known | Things we are certain of | Things we know we can't predict, e.g., a random process |
| Unknown | Things others know but we don't, e.g., due to insufficient data | Completely unexpected/unforeseeable events, e.g., an unknown distribution |
| | Aleatoric | Epistemic |
|---|---|---|
| Uncertainty in | Data | Model |
| Cause | Noisy input data, measurement errors | Missing training data |
| Describes confidence in | Input data | Prediction |
| Reducible through more training data | ❌ | ✅ |
| Can be learnt by model | ✅ | ❌ |
| Solution | Better instruments/measurements | Get more data |
Uncertainty Intervals
You can obtain uncertainty intervals using one of the following approaches (a bootstrap sketch follows the table).
| Approach | Concept | Limitations |
|---|---|---|
| Asymptotic approach | Central limit theorem | - Requires a large sample size to satisfy the asymptotic condition - Assumes normally-distributed errors - Assumes homoscedasticity - Requires an appropriate formula for calculating the standard error (not available for complex models) |
| Bootstrapping (preferred) | Random sampling with replacement | Higher computation cost |
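A minimal percentile-bootstrap sketch, assuming NumPy (the function and the example statistic are illustrative, not a fixed API):

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ci(x, y, stat_fn, n_boot=2000, alpha=0.05):
    """Percentile-bootstrap CI: resample (x, y) pairs with replacement
    and take quantiles of the statistic's bootstrap distribution."""
    n = len(x)
    boot_stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # random sampling with replacement
        boot_stats[b] = stat_fn(x[idx], y[idx])
    return np.quantile(boot_stats, [alpha / 2, 1 - alpha / 2])

# e.g. 95% CI for the slope of a univariate regression
x = np.linspace(0, 1, 100)
y = 2 * x + rng.normal(scale=0.1, size=100)
lo, hi = bootstrap_ci(x, y, lambda xb, yb: np.polyfit(xb, yb, 1)[0])
```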
| | Coefficient Confidence Interval | Response Confidence Interval | Response Prediction Interval |
|---|---|---|---|
| Denotation | \(\sigma_{\hat \beta}\) | \(\sigma\Big(\hat \mu(y \vert x) \Big)\) | \(\sigma\Big( \hat y \vert x \Big)\) |
| Upper and lower bound for estimated __ at a given level of significance | \(\hat \beta\) | \(\hat \mu(y \vert x)\) | \(\hat y \vert x\) |
| SE (Standard Error) for Univariate Regression (Asymptotic Approach) | \(\dfrac{\text{RMSE}}{\sqrt{n \sigma_x^2}}\) | \(\text{RMSE} \times \sqrt{ \dfrac{1}{n} + \dfrac{(x_{\text{pred}_\text{cent}} )^2}{n \sigma_x^2}}\) | \(\text{RMSE} \times \sqrt{ \textcolor{hotpink}{1 +} \dfrac{1}{n} + \dfrac{(x_{\text{pred}_\text{cent}})^2}{n \sigma_x^2}}\) |
| SE (Standard Error) for Multivariate Regression (Asymptotic Approach) | \(\sqrt{({\text{CovMatrix}_\beta})_{ii}}\) | \(\text{RMSE} \times \sqrt{\dfrac{1}{n} + J'_{{x_\text{pred}}_\text{cent}} \ \text{CovMatrix}_{X} \ J_{{x_\text{pred}}_\text{cent}}}\) | \(\text{RMSE} \times \sqrt{\textcolor{hotpink}{1+} \dfrac{1}{n} + x'_{\text{pred}_\text{cent}} \ \text{CovMatrix}_{X} \ x_{\text{pred}_\text{cent}}}\) |
Where
- \(J\): Jacobian matrix
- \(H\): Hessian matrix
High values for the off-diagonal elements of \(\text{Cov}_\beta\) mean that the errors of the \(\beta\) estimates are correlated with each other.
Degrees of freedom \(= n - k - 1\), where
- \(n =\) sample size
- \(k =\) number of input variables
Confidence and prediction intervals are narrowest at \(X = \bar X\), and get wider further from this point.
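A sketch of the univariate asymptotic formulas from the table above, assuming NumPy/SciPy (the function name is illustrative):

```python
import numpy as np
from scipy import stats

def univariate_intervals(x, y, x_pred, alpha=0.05):
    """Coefficient CI, response CI, and response PI for y = b0 + b1*x."""
    n = len(x)
    x_bar = x.mean()
    ssx = np.sum((x - x_bar) ** 2)          # = n * sigma_x^2
    beta1 = np.sum((x - x_bar) * (y - y.mean())) / ssx
    beta0 = y.mean() - beta1 * x_bar
    resid = y - (beta0 + beta1 * x)
    dof = n - 2                             # n - k - 1 with k = 1
    rmse = np.sqrt(np.sum(resid ** 2) / dof)
    t = stats.t.ppf(1 - alpha / 2, dof)

    x_cent = x_pred - x_bar                 # prediction point, centred
    y_hat = beta0 + beta1 * x_pred
    se_beta = rmse / np.sqrt(ssx)                            # coefficient SE
    se_mean = rmse * np.sqrt(1 / n + x_cent ** 2 / ssx)      # response CI SE
    se_pred = rmse * np.sqrt(1 + 1 / n + x_cent ** 2 / ssx)  # response PI SE
    return {
        "coefficient": (beta1 - t * se_beta, beta1 + t * se_beta),
        "response_ci": (y_hat - t * se_mean, y_hat + t * se_mean),
        "response_pi": (y_hat - t * se_pred, y_hat + t * se_pred),
    }
```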
Under homoskedasticity,

$$
\begin{aligned}
\hat V(\hat \beta) &= (X' X)^{-1} \hat \sigma^2 \\
\hat V(\hat \beta_j) &= \dfrac{\hat \sigma^2}{\hat u_j' \hat u_j}
\end{aligned}
$$

where \(\hat u_j\) are the residuals from regressing \(x_j\) on the other regressors.
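A minimal sketch of the matrix form, assuming NumPy (the helper name is illustrative):

```python
import numpy as np

def coef_covariance(X, y):
    """Homoskedastic OLS coefficient covariance: (X'X)^{-1} sigma^2."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]
    resid = y - X1 @ beta
    dof = X1.shape[0] - X1.shape[1]             # n - k - 1
    sigma2 = resid @ resid / dof
    return np.linalg.inv(X1.T @ X1) * sigma2    # SE(beta_i) = sqrt of diagonal
```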
Note
- RMSE refers to the RMSE on the validation data
- If your validation error distribution is not normal, or you have a lot of data, you can use the quantiles of the validation error distribution for the intervals instead (see the sketch below)
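A minimal sketch of the quantile-based alternative, assuming NumPy and that `val_errors` holds actual-minus-predicted errors on the validation set:

```python
import numpy as np

def empirical_interval(val_errors, y_pred, lower=0.05, upper=0.95):
    """Interval from validation-error quantiles; no normality assumption."""
    lo, hi = np.quantile(val_errors, [lower, upper])
    return y_pred + lo, y_pred + hi  # shift predictions by the error quantiles
```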
Intervals Using Models' Predictions
For each data point, take the following statistics across multiple models' predictions (see the sketch after this list):
- average
- 5th quantile
- 95th quantile
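A minimal sketch, assuming NumPy and a `preds` array of shape `(n_models, n_points)` holding each model's predictions:

```python
import numpy as np

def ensemble_interval(preds):
    """Point estimate and 90% interval from an ensemble of models."""
    return (
        preds.mean(axis=0),                # average prediction
        np.quantile(preds, 0.05, axis=0),  # 5th quantile (lower bound)
        np.quantile(preds, 0.95, axis=0),  # 95th quantile (upper bound)
    )
```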
Predictive Density
Describes the full probability distribution of the prediction for every input \(x\).
Trajectories/Scenarios
Equally-likely samples of multivariate predictive densities
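A minimal sketch, assuming a hypothetical Gaussian predictive density over a 24-step horizon; each draw is one equally-likely trajectory:

```python
import numpy as np

rng = np.random.default_rng(0)
horizon = 24
steps = np.arange(horizon)
mean = np.zeros(horizon)                                     # assumed predictive mean
cov = np.exp(-np.abs(np.subtract.outer(steps, steps)) / 5)   # correlated steps
trajectories = rng.multivariate_normal(mean, cov, size=100)  # (100, 24) scenarios
```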
Uncertainty Propagation
This table shows the variances of simple functions of the real variables \(A, B\), with standard deviations \(\sigma_A, \sigma_B\), covariance \(\sigma_{AB} = \rho_{AB} \sigma_A \sigma_B\), and correlation \(\rho_{AB}\). The real-valued coefficients \(a\) and \(b\) are assumed exactly known (deterministic), i.e., \(\sigma_a = \sigma_b = 0\).
In the "Variance" column, \(A\) and \(B\) should be understood as expectation values (i.e., the values around which we're estimating the uncertainty), and \(f\) as the value of the function evaluated at the expectation values of \(A\) and \(B\).
| Function | Variance |
|---|---|
| \(aA\) | \(= a^2\sigma_A^2\) |
| \(aA + bB\) | \(= a^2\sigma_A^2 + b^2\sigma_B^2 + 2ab\,\sigma_{AB}\) |
| \(aA - bB\) | \(= a^2\sigma_A^2 + b^2\sigma_B^2 - 2ab\,\sigma_{AB}\) |
| \(AB\) | \(\approx f^2 \left[\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2\frac{\sigma_{AB}}{AB} \right]\) |
| \(\frac{A}{B}\) | \(\approx f^2 \left[\left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 - 2\frac{\sigma_{AB}}{AB} \right]\) |
| \(\frac{A}{A+B}\) | \(\approx \frac{f^2}{\left(A+B\right)^2} \left(\frac{B^2}{A^2}\sigma_A^2 + \sigma_B^2 - 2\frac{B}{A} \sigma_{AB} \right)\) |
| \(a A^{b}\) | \(\approx \left( {a}{b}{A}^{b-1}{\sigma_A} \right)^2 = \left( \frac{{f}{b}{\sigma_A}}{A} \right)^2\) |
| \(a \ln(bA)\) | \(\approx \left(a \frac{\sigma_A}{A} \right)^2\) |
| \(a \log_{10}(bA)\) | \(\approx \left(a \frac{\sigma_A}{A \ln(10)} \right)^2\) |
| \(a e^{bA}\) | \(\approx f^2 \left( b\sigma_A \right)^2\) |
| \(a^{bA}\) | \(\approx f^2 (b\ln(a)\sigma_A)^2\) |
| \(a \sin(bA)\) | \(\approx \left[ a b \cos(b A) \sigma_A \right]^2\) |
| \(a \cos(bA)\) | \(\approx \left[ a b \sin(b A) \sigma_A \right]^2\) |
| \(a \tan(bA)\) | \(\approx \left[ a b \sec^2(b A) \sigma_A \right]^2\) |
| \(A^B\) | \(\approx f^2 \left[ \left( \frac{B}{A}\sigma_A \right)^2 +\left( \ln(A)\sigma_B \right)^2 + 2 \frac{B \ln(A)}{A} \sigma_{AB} \right]\) |
| \(\sqrt{aA^2 \pm bB^2}\) | \(\approx \left(\frac{A}{f}\right)^2 a^2\sigma_A^2 + \left(\frac{B}{f}\right)^2 b^2\sigma_B^2 \pm 2ab\frac{AB}{f^2}\,\sigma_{AB}\) |
For uncorrelated variables (\(\rho_{AB}=0\), \(\sigma_{AB}=0\)) expressions for more complicated functions can be derived by combining simpler functions. For example, repeated multiplication, assuming no correlation, gives \(f = ABC; \qquad \left(\frac{\sigma_f}{f}\right)^2 \approx \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2+ \left(\frac{\sigma_C}{C}\right)^2.\)
For the case \(f = AB\), we also have Goodman's expression for the exact variance: in the uncorrelated case it is \(V(XY) = E(X)^2 V(Y) + E(Y)^2 V(X) + E\big((X - E(X))^2 (Y - E(Y))^2\big)\), and therefore \(\sigma_f^2 = A^2\sigma_B^2 + B^2\sigma_A^2 + \sigma_A^2\sigma_B^2.\)
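A quick Monte Carlo sanity check of the \(AB\) row, assuming NumPy (the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample correlated (A, B) and compare the empirical variance of A*B
# with the first-order approximation from the table.
A0, B0, sA, sB, rho = 10.0, 5.0, 0.5, 0.3, 0.2
cov = [[sA**2, rho * sA * sB], [rho * sA * sB, sB**2]]
A, B = rng.multivariate_normal([A0, B0], cov, size=1_000_000).T

f = A0 * B0
approx = f**2 * ((sA / A0) ** 2 + (sB / B0) ** 2 + 2 * rho * sA * sB / (A0 * B0))
print(np.var(A * B), approx)  # the two should agree closely
```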
Effect of Correlation on Differences
If \(A\) and \(B\) are uncorrelated, their difference \(A - B\) will have more variance than either of them. An increasing positive correlation (\(\rho_{AB}\to 1\)) decreases the variance of the difference, converging to zero variance for perfectly correlated variables with the same variance. A negative correlation (\(\rho_{AB}\to -1\)), on the other hand, further increases the variance of the difference, compared to the uncorrelated case.
For example, the self-subtraction \(f = A - A\) has zero variance \(\sigma_f^2 = 0\) only if the variate is perfectly autocorrelated (\(\rho_A = 1\)). If \(A\) is uncorrelated, \(\rho_A = 0\), then the output variance is twice the input variance, \(\sigma_f^2 = 2\sigma^2_A\). And if \(A\) is perfectly anticorrelated, \(\rho_A = -1\), then the input variance is quadrupled in the output, \(\sigma_f^2 = 4\sigma^2_A\) (notice \(1 - \rho_A = 2\) in the \(aA - bB\) row of the table above, with \(a = b = 1\)).
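A quick numerical check of these three cases, assuming NumPy (the printed variances should be roughly 4, 2, and 0):

```python
import numpy as np

rng = np.random.default_rng(0)
sA = 1.0
for rho in (-1.0, 0.0, 1.0):
    cov = [[sA**2, rho * sA**2], [rho * sA**2, sA**2]]
    a, b = rng.multivariate_normal([0.0, 0.0], cov, size=500_000).T
    print(rho, np.var(a - b))  # approx 2 * sA**2 * (1 - rho)
```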
Value at Risk Models
Purpose:

- Derive the risk profile of the firm
- Protect the firm against unacceptably large concentrations
- Quantify potential losses

Steps (a sketch follows the list):

- Collect data
- Graph the data to inspect data quality
- Transform price data into returns (percentage difference of consecutive prices)
- Look at the frequency distribution
- Obtain the standard deviation (volatility)
- Multiply the volatility by the one-sided critical value (\(z \approx 2.33\) at 99%) to estimate the 99% worst-case loss
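A minimal sketch of these steps, assuming NumPy/SciPy and a 1-D array of prices (the function name is illustrative):

```python
import numpy as np
from scipy import stats

def parametric_var(prices, confidence=0.99):
    """Parametric VaR: prices -> returns -> volatility -> scaled z."""
    prices = np.asarray(prices, dtype=float)
    returns = np.diff(prices) / prices[:-1]  # percentage diff of prices
    vol = np.std(returns, ddof=1)            # volatility
    z = stats.norm.ppf(confidence)           # one-sided z, approx 2.33 at 99%
    return z * vol                           # estimated worst-case loss, as a return
```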