Rare-event simulation

Patrick Laub

Lecture 1 (Part 1)

Agenda

Now:

  • Admin stuff
  • Introduction to rare events
  • Crude Monte Carlo recap

In the code:

  • Random number generation
  • Crude Monte Carlo estimation

Admin

Structure for each fortnight:

  • 2 hours of lectures
    • $\approx$ 30 mins lecture
    • $\approx$ 30 mins code demonstration
    • $\approx$ 30 mins lecture
    • $\approx$ 30 mins code demonstration
  • 1 hour of coding on homework

Assessment:

  • 3 $\times$ homework, done in pairs
  • Have 2 weeks for each (email results to me before class)
  • Final grade is the sum of these (+ exam for Nabil)

Who am I?

  • Patrick (Pat) Laub
  • Post-doc here
  • Background in Software Engineering and Mathematics
  • PhD in Applied Probability between Australia & Denmark
  • patrick.laub@gmail.com, office in 1405

Rare-event estimation

Calculating tiny probabilities using computers...

Classical problem is

$$ \ell = \mathbb{P}(X > \gamma) $$

for $\gamma \gg 0$. E.g. $\ell \in [10^{-4}, 10^{-15}]$.

Not large deviations

In rare-events the limit for $S_n = X_1 + \dots + X_n$ is like

$$ \lim_{\gamma \to \infty} \mathbb{P}(S_n > \gamma) $$

whereas in large deviations it is

$$ \lim_{n \to \infty} \mathbb{P}(S_n > n \gamma) \,. $$

Insurance applications

Aggregate losses $S_N = \sum_{i=1}^N X_i $. Want to know

$$ \mathbb{P}(S_N > \gamma) $$$$ \mathbb{E}[ S_N - \gamma \mid S_N > \gamma ] $$$$ \text{Value-at-Risk}_\alpha = \inf\{ x : \mathbb{P}(S_N < x) \ge 1 - \alpha \} $$

or ruin probabilities (finite or infinite time).

This isn't statistics

Data $\rightarrow$ Model $\rightarrow$ Results

$$ \mathbb{P}(\text{Event}) \approx \text{ Historical frequency of the event} $$

If a 'black swan', then you'd say $\mathbb{P}(\text{Event}) = 0$.

First option: do the math

If the distribution is simple,

$$ \ell = \mathbb{P}(X > \gamma) \qquad X \sim \mathrm{Pareto}(\alpha) $$

we can do some algebra to get the answer:

$$ \ell = \int_\gamma^\infty f_X(x) \, \mathrm{d}x = \int_\gamma^\infty \frac{\alpha}{x^{\alpha+1}} \mathrm{d}x = \Bigl[ - \frac{1}{x^{\alpha}} \Bigr]_\gamma^\infty = \frac{1}{\gamma^\alpha} \,. $$

Here, doesn't matter if $\gamma \gg 0$.

Backup option: Monte Carlo

Say that we want to estimate $\ell = \mathbb{P}(X > \gamma)$ for $X \sim F_X$ with a crude Monte Carlo estimate

$$ \hat{\ell} = \frac{1}{R} \sum_{r=1}^R 1\{X_r > \gamma\} \,, \qquad X_r \overset{\mathrm{i.i.d.}}{\sim} F_X \,.$$

Then we have $\mathbb{E}[ \hat{\ell} ] = \ell$, i.e., it is unbiased.

Say $S_N = \sum_{i=1}^N X_i$ is aggregate claims, and reinsurer covers losses over level $\gamma$.

Simulate $S_N^{(1)}, \dots, S_N^{(R)}$ which are i.i.d. copies of $S_N$, then:

$$ \begin{aligned} \mathbb{P}(\text{Reinsurer pays anything}) &= \mathbb{P}(S_N > \gamma) \\ &\approx \frac1R \sum_{r=1}^R 1\{ S_N^{(r)} > \gamma \} \end{aligned} $$$$ \begin{aligned} \mathbb{E}(\text{Reinsurer payout}) &= \mathbb{E}( (S_N - \gamma)_+ ) \\ &\approx \frac1R \sum_{r=1}^R (S_N^{(r)} - \gamma )_+ \end{aligned} $$$$ \text{Value-at-Risk}_\alpha \approx \text{Quantile}( \{ (S_N^{(r)} - \gamma )_+ \}_{r=1,\dots,R} , 1-\alpha) $$
$$ \mathbb{E}[ (S_T - K)_+ \mid S_0 ] $$
2 out of 1000 in the money
$$ \mathbb{E}[ (S_T - K)_+ 1 \{ \min_{t \in [0,T]} S_t < \gamma \} \mid S_0 ] $$
2 out of 100 in the money

Example: estimating $\pi$

Take $X, Y \overset{\mathrm{i.i.d.}}{\sim} \mathsf{Uniform}(0,1)$, and consider $\ell = \mathbb{P}( || (X,Y) || \le 1 ) = \mathbb{P}( X^2 + Y^2 \le 1 )$.

We can estimate it by $$ \hat{\ell} = \frac1R \sum_{r=1}^R 1 \{ X_r^2 + Y_r^2 \le 1\} \,. $$

Can also use a bit of logic to figure out that

$$ \ell = \frac{\text{Area of a quadrant of the unit circle}}{\text{Area of the } [0,1]^2 \text{ square}} = \frac{\pi}{4} .$$

So, $$ \hat{\ell} \approx \ell = \frac{\pi}{4} \quad \Leftrightarrow \quad \pi \approx 4 \cdot \hat{\ell} .$$

$R$ $10^1$ $10^2$ $10^3$ $10^4$ $10^5$ $10^6$ $10^7$ $10^8$ $10^9$
# Digits Confident 0 0 0 2 2 2 3 4 4

Confidence intervals

Have CLT for $\hat{\ell}(R) = \frac1R \sum_{r=1}^R 1\{X^{(r)} > \gamma \}$ as $R\to\infty$:

$$ \sqrt{R} \bigl( \hat{\ell}(R) - \ell \bigr) \overset{\mathcal{D}}{\longrightarrow} \mathsf{Normal}(0, \sigma^2) $$

where $\sigma^2 = \mathbb{V}\mathrm{ar}( 1\{ X > \gamma \} ) = \mathbb{V}\mathrm{ar}( \hat{\ell}(1) )$.

$$ \Rightarrow \hat{\ell}(R) \overset{\mathrm{approx.}}{\sim} \mathsf{Normal}\bigl(\ell, \frac{\sigma^2}{R}\bigr) .$$

So

$$ \hat{\ell}- q_{\alpha/2} \frac{\sigma}{\sqrt{R}} \le \ell \le \hat{\ell}+ q_{\alpha/2}\frac{\sigma}{\sqrt{R}} \quad \text{with probability } 1-\alpha , $$

where $q_{\alpha/2}$ is the quantile $\mathbb{P}(Z \le q_{\alpha/2}) = \frac{\alpha}{2}$ of $Z \sim \mathsf{Normal}(0,1)$. For $\alpha = 0.05$, we have $q_{\alpha/2} \approx 1.96$

Conclusions from confidence intervals

Great news, it performs the same regardless of dimension!

Terrible news, it takes a bazillion samples to make it accurate.