Monday, April 17, 2017

Financial Risk Manager FRM Exam Part I Quantitative Analysis GARP

http://www.freeaccountingbooks.com/financial-risk-manager-frm-exam-part-i-quantitative-analysis-garp/

Contents

CHAPTER 1 PROBABILITIES  3
  Discrete Random Variables  4
  Continuous Random Variables  4
    Probability Density Functions  4
    Cumulative Distribution Functions  5
    Inverse Cumulative Distribution Functions  6
  Mutually Exclusive Events  7
  Independent Events  7
  Probability Matrices  8
  Conditional Probability  8

CHAPTER 2 BASIC STATISTICS  11
  Averages  12
    Population and Sample Data  12
    Discrete Random Variables  13
    Continuous Random Variables  13
  Expectations  14
  Variance and Standard Deviation  17
  Standardized Variables  18
  Covariance  19
  Correlation  19
  Application: Portfolio Variance and Hedging  20
  Moments  21
  Skewness  21
  Kurtosis  23
  Coskewness and Cokurtosis  24
  Best Linear Unbiased Estimator (BLUE)  26

CHAPTER 3 DISTRIBUTIONS  29
  Parametric Distributions  30
    Uniform Distribution  30
    Bernoulli Distribution  31
    Binomial Distribution  31
    Poisson Distribution  33
    Normal Distribution  34
    Lognormal Distribution  36
  Central Limit Theorem  36
  Application: Monte Carlo Simulations Part 1: Creating Normal Random Variables  38
  Chi-Squared Distribution  39
  Student's t Distribution  39
  F-Distribution  40
  Triangular Distribution  41
  Beta Distribution  42
  Mixture Distributions  42

CHAPTER 4 BAYESIAN ANALYSIS  47
  Overview  48
  Bayes' Theorem  48
  Bayes versus Frequentists  51
  Many-State Problems  52

CHAPTER 5 HYPOTHESIS TESTING AND CONFIDENCE INTERVALS  57
  Sample Mean Revisited  58
  Sample Variance Revisited  59
  Confidence Intervals  59
  Hypothesis Testing  60
    Which Way to Test?  60
    One Tail or Two?  61
    The Confidence Level Returns  61
  Chebyshev's Inequality  62
  Application: VaR  62
    Backtesting  64
    Subadditivity  65
    Expected Shortfall  66

CHAPTER 6 CORRELATIONS AND COPULAS  69
  Definition of Correlation  70
    Correlation vs. Dependence  70
  Monitoring Correlation  71
    EWMA  71
    GARCH  72
    Consistency Condition for Covariances  72
  Multivariate Normal Distributions  73
    Generating Random Samples from Normal Distributions  73
    Factor Models  73
  Copulas  74
    Expressing the Approach Algebraically  76
    Other Copulas  76
    Tail Dependence  76
    Multivariate Copulas  77
    A Factor Copula Model  77
  Application to Loan Portfolios: Vasicek's Model  78
    Estimating PD and ρ  79
    Alternatives to the Gaussian Copula  80
  Summary  80

CHAPTER 7 LINEAR REGRESSION WITH ONE REGRESSOR  83
  The Linear Regression Model  84
  Estimating the Coefficients of the Linear Regression Model  86
    The Ordinary Least Squares Estimator  87
    OLS Estimates of the Relationship Between Test Scores and the Student-Teacher Ratio  88
    Why Use the OLS Estimator?  89
  Measures of Fit  90
    The R²  90
    The Standard Error of the Regression  91
    Application to the Test Score Data  91
  The Least Squares Assumptions  92
    Assumption #1: The Conditional Distribution of u_i Given X_i Has a Mean of Zero  92
    Assumption #2: (X_i, Y_i), i = 1, ..., n Are Independently and Identically Distributed  93
    Assumption #3: Large Outliers Are Unlikely  94
    Use of the Least Squares Assumptions  95
  Sampling Distribution of the OLS Estimators  95
    The Sampling Distribution of the OLS Estimators  95
  Conclusion  97
  Summary  97
  Appendix A  98
    The California Test Score Data Set  98
  Appendix B  98
    Derivation of the OLS Estimators  98

CHAPTER 8 REGRESSION WITH A SINGLE REGRESSOR  101
  Testing Hypotheses about One of the Regression Coefficients  102
    Two-Sided Hypotheses Concerning β₁  102
    One-Sided Hypotheses Concerning β₁  104
    Testing Hypotheses about the Intercept β₀  105
  Confidence Intervals for a Regression Coefficient  105
  Regression When X Is a Binary Variable  107
    Interpretation of the Regression Coefficients  107
  Heteroskedasticity and Homoskedasticity  108
    What Are Heteroskedasticity and Homoskedasticity?  108
    Mathematical Implications of Homoskedasticity  109
    What Does This Mean in Practice?  110
  The Theoretical Foundations of Ordinary Least Squares  111
    Linear Conditionally Unbiased Estimators and the Gauss-Markov Theorem  112
    Regression Estimators Other than OLS  112
  Using the t-Statistic in Regression When the Sample Size Is Small  113
    The t-Statistic and the Student t Distribution  113
    Use of the Student t Distribution in Practice  114
  Conclusion  114
  Summary  115
  Appendix  115
    The Gauss-Markov Conditions and a Proof of the Gauss-Markov Theorem  115
    The Gauss-Markov Conditions  115
    The Sample Average Is the Efficient Linear Estimator of E(Y)  116

CHAPTER 9 LINEAR REGRESSION WITH MULTIPLE REGRESSORS  119
  Omitted Variable Bias  120
    Definition of Omitted Variable Bias  120
    A Formula for Omitted Variable Bias  121
    Addressing Omitted Variable Bias by Dividing the Data into Groups  122
  The Multiple Regression Model  124
    The Population Regression Line  124
    The Population Multiple Regression Model  124
  The OLS Estimator in Multiple Regression  126
    The OLS Estimator  126
    Application to Test Scores and the Student-Teacher Ratio  126
  Measures of Fit in Multiple Regression  127
    The Standard Error of the Regression (SER)  127
    The R²  128
    The "Adjusted R²"  128
    Application to Test Scores  128
  The Least Squares Assumptions in Multiple Regression  129
    Assumption #1: The Conditional Distribution of u_i Given X_1i, X_2i, ..., X_ki Has a Mean of Zero  129
    Assumption #2: (X_1i, X_2i, ..., X_ki, Y_i), i = 1, ..., n Are i.i.d.  129
    Assumption #3: Large Outliers Are Unlikely  129
    Assumption #4: No Perfect Multicollinearity  129
  The Distribution of the OLS Estimators in Multiple Regression  130
  Multicollinearity  131
    Examples of Perfect Multicollinearity  131
    Imperfect Multicollinearity  132
  Conclusion  133
  Summary  133

CHAPTER 10 HYPOTHESIS TESTS AND CONFIDENCE INTERVALS IN MULTIPLE REGRESSION  137
  Hypothesis Tests and Confidence Intervals for a Single Coefficient  138
    Standard Errors for the OLS Estimators  138
    Hypothesis Tests for a Single Coefficient  138
    Confidence Intervals for a Single Coefficient  139
    Application to Test Scores and the Student-Teacher Ratio  139
  Tests of Joint Hypotheses  140
    Testing Hypotheses on Two or More Coefficients  140
    The F-Statistic  142
    Application to Test Scores and the Student-Teacher Ratio  143
    The Homoskedasticity-Only F-Statistic  143
  Testing Single Restrictions Involving Multiple Coefficients  144
  Confidence Sets for Multiple Coefficients  145
  Model Specification for Multiple Regression  146
    Omitted Variable Bias in Multiple Regression  146
    Model Specification in Theory and in Practice  147
    Interpreting the R² and the Adjusted R² in Practice  147
  Analysis of the Test Score Data Set  148
  Conclusion  151
  Summary  151
  Appendix  152
    The Bonferroni Test of a Joint Hypothesis  152

CHAPTER 11 MODELING AND FORECASTING TREND  155
  Selecting Forecasting Models Using the Akaike and Schwarz Criteria  156

CHAPTER 12 CHARACTERIZING CYCLES  161
  Covariance Stationary Time Series  162
  White Noise  165
  The Lag Operator  168
  Wold's Theorem, the General Linear Process, and Rational Distributed Lags  168
    Wold's Theorem  168
    The General Linear Process  169
    Rational Distributed Lags  170
  Estimation and Inference for the Mean, Autocorrelation, and Partial Autocorrelation Functions  170
    Sample Mean  170
    Sample Autocorrelations  170
    Sample Partial Autocorrelations  172
  Application: Characterizing Canadian Employment Dynamics  172

CHAPTER 13 MODELING CYCLES: MA, AR, AND ARMA MODELS  177
  Moving Average (MA) Models  178
    The MA(1) Process  178
    The MA(q) Process  181
  Autoregressive (AR) Models  182
    The AR(1) Process  182
    The AR(p) Process  184
  Autoregressive Moving Average (ARMA) Models  186
  Application: Specifying and Estimating Models for Employment Forecasting  187

CHAPTER 14 ESTIMATING VOLATILITIES AND CORRELATIONS  197
  Estimating Volatility  198
  Weighting Schemes  198
    The Exponentially Weighted Moving Average Model  199
    The GARCH(1,1) Model  200
    The Weights  201
    Mean Reversion  201
  Choosing Between the Models  201
  Maximum Likelihood Methods  201
    Estimating a Constant Variance  201
    Estimating EWMA or GARCH(1,1) Parameters  202
    How Good Is the Model?  204
  Using GARCH(1,1) to Forecast Future Volatility  205
    Volatility Term Structures  205
    Impact of Volatility Changes  206
  Correlations  206
    Consistency Condition for Covariances  207
  Application of EWMA to Four-Index Example  208
  Summary  209

CHAPTER 15 SIMULATION MODELING  213
  Monte Carlo Simulation: A Simple Example  214
    Selecting Probability Distributions for the Inputs  215
    Interpreting Monte Carlo Simulation Output  215
  Why Use Simulation?  217
    Multiple Input Variables and Compounding Distributions  218
    Incorporating Correlations  218
    Evaluating Decisions  219
  Important Questions in Simulation Modeling  221
    How Many Scenarios?  221
    Estimator Bias  221
    Estimator Efficiency  222
  Random Number Generation  222
    Inverse Transform Method  223
    What Defines a "Good" Random Number Generator?  224
    Pseudorandom Number Generators  225
    Quasirandom (Low-Discrepancy) Sequences  226
    Stratified Sampling  226
  Summary  228

Sample Exam Questions: Quantitative Analysis  231
Sample Exam Answers and Explanations: Quantitative Analysis  235
Appendix Table 1  239
Index  241
2015 FRM COMMITTEE MEMBERS
Dr. Rene Stulz (Chairman)
Ohio State University
Richard Apostolik
Global Association of Risk Professionals
Richard Brandt
Citibank
Dr. Christopher Donohue
Global Association of Risk Professionals
Herve Geny
London Stock Exchange
Keith Isaac, FRM®
TD Bank
Steve Lerit, CFA
UBS Wealth Management
William May
Global Association of Risk Professionals
Michelle McCarthy
Nuveen Investments
Dr. Victor Ng
Goldman Sachs & Co
Dr. Elliot Noma
Garrett Asset Management
Dr. Matthew Pritsker
Federal Reserve Bank of Boston
Liu Ruixia
Industrial and Commercial Bank of China
Dr. Til Schuermann
Oliver Wyman
Nick Strange
Bank of England, Prudential Regulation Authority
Serge Sverdlov
Redmond Analytics
Alan Weindorf
Visa

• Learning Objectives
Candidates, after completing this reading, should be
able to:
• Describe and distinguish between continuous and
discrete random variables.
• Define and distinguish between the probability
density function, the cumulative distribution
function, and the inverse cumulative distribution
function.
• Calculate the probability of an event given a discrete
probability function.
• Distinguish between independent and mutually
exclusive events.
• Define joint probability, describe a probability matrix,
and calculate joint probabilities using probability
matrices.
• Define and calculate a conditional probability, and
distinguish between conditional and unconditional
probabilities.
Excerpt is Chapter 2 of Mathematics and Statistics for Financial Risk Management, Second Edition, by Michael B. Miller.
In this chapter we explore the application of probabilities
to risk management. We also introduce basic terminology
and notations that will be used throughout the rest of
this book.
DISCRETE RANDOM VARIABLES
The concept of probability is central to risk management.
Many concepts associated with probability are deceptively
simple. The basics are easy, but there are many
potential pitfalls.
In this chapter, we will be working with both discrete
and continuous random variables. Discrete random variables
can take on only a countable number of values-for
example, a coin, which can be only heads or tails, or a
bond, which can have only one of several letter ratings
(AAA, AA, A, BBB, etc.). Assume we have a discrete random variable X, which can take various values, x_i. Further assume that the probability of any given x_i occurring is p_i. We write:

P[X = x_i] = p_i   s.t. x_i ∈ {x_1, x_2, ..., x_n}   (1.1)

where P[·] is our probability operator.¹
An important property of a random variable is that the sum of all the probabilities must equal one. In other words, the probability that some outcome occurs must equal one. Something has to happen. Using our current notation, we have:

\sum_{i=1}^{n} p_i = 1   (1.2)
CONTINUOUS RANDOM VARIABLES
In contrast to a discrete random variable, a continuous
random variable can take on any value within a given
range. A good example of a continuous random variable is
the return of a stock index. If the level of the index can be
any real number between zero and infinity, then the return
of the index can be any real number greater than -1.
Even if the range that the continuous variable occupies is finite, the number of values that it can take is infinite. For this reason, for a continuous variable, the probability of any specific value occurring is zero.

¹ "s.t." is shorthand for "such that". The final term indicates that x_i is a member of a set that includes n possible values, x_1, x_2, ..., x_n. You could read the full equation as: "The probability that X equals x_i is equal to p_i, such that x_i is a member of the set x_1, x_2, ..., x_n."
Even though we cannot talk about the probability of a
specific value occurring, we can talk about the probability
of a variable being within a certain range. Take, for
example, the return on a stock market index over the next
year. We can talk about the probability of the index return
being between 6% and 7%, but talking about the probability
of the return being exactly 6.001% is meaningless.
Between 6% and 7% there are an infinite number of possible
values. The probability of anyone of those infinite
values occurring is zero.
For a continuous random variable X, then, we can write:
P[r_1 ≤ X ≤ r_2] = p   (1.3)

which states that the probability of our random variable, X, being between r_1 and r_2 is equal to p.
Probability Density Functions
For a continuous random variable, the probability of a
specific event occurring is not well defined, but some
events are still more likely to occur than others. Using
annual stock market returns as an example, if we look at
50 years of data, we might notice that there are more
data points between 0% and 10% than there are between
10% and 20%. That is, the density of points between 0%
and 10% is higher than the density of points between 10%
and 20%.
For a continuous random variable we can define a probability
density function (PDF), which tells us the likelihood
of outcomes occurring between any two points. Given
our random variable, X, with a probability p of being between r_1 and r_2, we can define our density function, f(x), such that:

\int_{r_1}^{r_2} f(x) dx = p   (1.4)
The probability density function is often referred to as the
probability distribution function. Both terms are correct,
and, conveniently, both can be abbreviated PDF.
As with discrete random variables, the total probability across all possible values must be one:

\int_{r_{min}}^{r_{max}} f(x) dx = 1   (1.5)

where r_min and r_max define the lower and upper bounds of f(x).
Example 1.1
Question:
Define the probability density function for the price of a
zero coupon bond with a notional value of $10 as:
f(x) = x/50   s.t. 0 ≤ x ≤ 10
where x is the price of the bond. What is the probability
that the price of the bond is between $8 and $9?
Answer:
First, note that this is a legitimate probability function. By
integrating the PDF from its minimum to its maximum,
we can show that the probability of any value occurring is
indeed one:
\int_0^{10} \frac{x}{50} dx = \frac{1}{50} \int_0^{10} x dx = \frac{1}{50} \left[ \frac{1}{2} x^2 \right]_0^{10} = \frac{1}{100} (10^2 - 0^2) = 1
If we graph the function, as in Figure 1-1, we can also
see that the area under the curve is one. Using simple
geometry:
Area of triangle = (1/2) · Base · Height = (1/2) · 10 · 0.2 = 1

FIGURE 1-1  Probability density function.
To answer the question, we simply integrate the probability
density function between 8 and 9:
\int_8^9 \frac{x}{50} dx = \left[ \frac{1}{100} x^2 \right]_8^9 = \frac{1}{100} (9^2 - 8^2) = \frac{17}{100} = 17\%
The probability of the price ending up between $8 and $9
is 17%.
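These integrals are easy to verify numerically. Below is a minimal Python sketch (assuming SciPy is available; the function name bond_pdf is ours, not the book's):

```python
from scipy import integrate

def bond_pdf(x):
    """PDF of the zero coupon bond price: f(x) = x/50 on [0, 10]."""
    return x / 50.0 if 0.0 <= x <= 10.0 else 0.0

# Total probability should integrate to 1.
total, _ = integrate.quad(bond_pdf, 0, 10)

# Probability that the price ends up between $8 and $9.
p_8_9, _ = integrate.quad(bond_pdf, 8, 9)

print(f"total probability = {total:.4f}")  # 1.0000
print(f"P[8 < X <= 9]     = {p_8_9:.4f}")  # 0.1700
```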
Cumulative Distribution Functions
Closely related to the concept of a probability density
function is the concept of a cumulative distribution function
or cumulative density function (both abbreviated
CDF). A cumulative distribution function tells us the
probability of a random variable being less than a certain
value. The CDF can be found by integrating the probability
density function from its lower bound. Traditionally, the
cumulative distribution function is denoted by the capital
letter of the corresponding density function. For a random
variable X with a probability density function f(x), then,
the cumulative distribution function, F(x), could be calculated
as follows:
F(a) = \int_{r_{min}}^{a} f(x) dx = P[X ≤ a]   (1.6)
As illustrated in Figure 1-2, the cumulative distribution
function corresponds to the area under
the probability density function, to the left of a.
By definition, the cumulative distribution function
varies from 0 to 1 and is nondecreasing. At
the minimum value of the probability density
function, the CDF must be zero. There is no
probability of the variable being less than the
minimum. At the other end, all values are less
than the maximum of the PDF. The probability
is 100% (CDF = 1) that the random variable
will be less than or equal to the maximum. In
between, the function is nondecreasing. The
reason that the CDF is nondecreasing is that, at
a minimum, the probability of a random variable
being between two points is zero. If the CDF
of a random variable at 5 is 50%, then the lowest
it could be at 6 is 50%, which would imply
0% probability of finding the variable between
5 and 6. There is no way the CDF at 6 could be
less than the CDF at 5.
FIGURE 1-2  Relationship between the cumulative distribution function and the probability density.

Just as we can get the cumulative distribution function from the probability density function by integrating, we can get the PDF from the CDF by taking the first derivative of the CDF:

f(x) = \frac{dF(x)}{dx}   (1.7)

That the CDF is nondecreasing is another way of saying that the PDF cannot be negative.

What if, instead of the probability that a random variable is less than a certain value, we want to know the probability that it is greater than a certain value, or between two values? We can handle both cases by adding and subtracting cumulative distribution functions. To find the probability that a variable is between two values, a and b, assuming b is greater than a, we subtract:

P[a < X ≤ b] = \int_a^b f(x) dx = F(b) - F(a)   (1.8)

To get the probability that a variable is greater than a certain value, we simply subtract from 1:

P[X > a] = 1 - F(a)   (1.9)

This result can be obtained by substituting infinity for b in the previous equation, remembering that the CDF at infinity must be 1.

Example 1.2
Question:
Calculate the cumulative distribution function for the probability density function from the previous problem:

f(x) = x/50   s.t. 0 ≤ x ≤ 10   (1.10)

Then answer the previous problem: What is the probability that the price of the bond is between $8 and $9?

Answer:
The CDF can be found by integrating the PDF:

F(a) = \int_0^a f(x) dx = \int_0^a \frac{x}{50} dx = \frac{1}{50} \left[ \frac{1}{2} x^2 \right]_0^a = \frac{a^2}{100}

To get the answer to the question, we simply evaluate the CDF at $8 and $9 and subtract:

P[$8 < x ≤ $9] = F(9) - F(8) = \frac{81}{100} - \frac{64}{100} = \frac{17}{100} = 17\%

As before, the probability of the price ending up between $8 and $9 is 17%.
Inverse Cumulative Distribution
Functions
The inverse of the cumulative distribution can also be
useful. For example, we might want to know that there is a 5% probability that a given equity index will return less than -10.6%, or that there is a 1% probability of interest rates increasing by more than 2% over a month.

More formally, if F(a) is a cumulative distribution function, then we define F⁻¹(p), the inverse cumulative distribution function, as follows:

F(a) = p ⇔ F⁻¹(p) = a   s.t. 0 ≤ p ≤ 1   (1.11)
As we will see in Chapter 3, while some popular distributions
have very simple inverse cumulative distribution
functions, for other distributions no explicit
inverse exists.
Example 1.3
Question:
Given the cumulative distribution from the previous sample
problem:
F(a) = a²/100   s.t. 0 ≤ a ≤ 10
Calculate the inverse cumulative distribution function.
Find the value of a such that 25% of the distribution is less
than or equal to a.
Answer:
We have:

F(a) = p = a²/100

Solving for a:

a = 10√p

Therefore, the inverse CDF is:

F⁻¹(p) = 10√p

We can quickly check that p = 0 and p = 1 return 0 and 10, the minimum and maximum of the distribution. For p = 25% we have:

F⁻¹(0.25) = 10√0.25 = 10 · 0.5 = 5
So 25% of the distribution is less than or equal to 5.
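This inverse CDF is exactly what inverse transform sampling needs: feed uniform draws through F⁻¹ to simulate bond prices with the triangular density above. A minimal sketch (our own illustration, not from the text):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def inverse_cdf(p):
    """Inverse CDF derived above: F^-1(p) = 10 * sqrt(p)."""
    return 10.0 * np.sqrt(p)

# Draw uniforms on [0, 1] and map them through the inverse CDF.
u = rng.uniform(0.0, 1.0, size=100_000)
prices = inverse_cdf(u)

# Sanity checks against the analytic results.
print(f"25th percentile = {np.percentile(prices, 25):.2f}")  # ~5.00 (Example 1.3)
print(f"P[8 < X <= 9]   = {np.mean((prices > 8) & (prices <= 9)):.4f}")  # ~0.17
```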
MUTUALLY EXCLUSIVE EVENTS
For a given random variable, the probability of any of two
mutually exclusive events occurring is just the sum of their
individual probabilities. In statistics notation, we can write:
P[A ∪ B] = P[A] + P[B]   (1.12)

where [A ∪ B] is the union of A and B. This is the probability of either A or B occurring. This is true only of mutually exclusive events.

This is a very simple rule, but, as mentioned at the beginning of the chapter, probability can be deceptively simple, and this property is easy to confuse. The confusion stems from the fact that and is synonymous with addition. If you say it this way, then the probability that A or B occurs is equal to the probability of A and the probability of B. It is not terribly difficult, but you can see where this could lead to a mistake.
This property of mutually exclusive events can be
extended to any number of events. The probability that
any of n mutually exclusive events occurs is simply the
sum of the probabilities of those n events.
Example 1.4
Question:
Calculate the probability that a stock return is either below -10% or above 10%, given:

P[R < -10%] = 14%
P[R > +10%] = 17%

Answer:
Note that the two events are mutually exclusive; the return cannot be below -10% and above 10% at the same time.
The answer is: 14% + 17% = 31%.
INDEPENDENT EVENTS
In the preceding example, we were talking about one
random variable and two mutually exclusive events, but
what happens when we have more than one random variable?
What is the probability that it rains tomorrow and
the return on stock XYZ is greater than 5%? The answer
depends crucially on whether the two random variables
influence each other. If the outcome of one random variable
is not influenced by the outcome of the other random
variable, then we say those variables are independent. If
stock market returns are independent of the weather, then
the stock market should be just as likely to be up on rainy
days as it is on sunny days.
Assuming that the stock market and the weather are
independent random variables, then the probability of the
market being up and rain is just the product of the probabilities
of the two events occurring individually. We can
write this as follows:
P[rain and market up] = P[rain ∩ market up] = P[rain] · P[market up]   (1.13)
We often refer to the probability of two events occurring
together as their joint probability.
Example 1.5
Question:
According to the most recent weather forecast, there is a
20% chance of rain tomorrow. The probability that stock
XYZ returns more than 5% on any given day is 40%. The
two events are independent. What is the probability that
it rains and stock XYZ returns more than 5% tomorrow?
Answer:
Since the two events are independent, the probability
that it rains and stock XYZ returns more than 5% is just
the product of the two probabilities. The answer is: 20% x
40% = 8%.
PROBABILITY MATRICES
When dealing with the joint probabilities of two variables, it
is often convenient to summarize the various probabilities
in a probability matrix or probability table. For example,
pretend we are investigating a company that has issued
both bonds and stock. The bonds can be downgraded,
upgraded, or have no change in rating. The stock can either
outperform the market or underperform the market.
In Figure 1-3, the probability of both the company's stock
outperforming the market and the bonds being upgraded
is 15%. Similarly, the probability of the stock underperforming
the market and the bonds having no change in
rating is 25%. We can also see the unconditional probabilities,
by adding across a row or down a column. The
probability of the bonds being upgraded, irrespective of
the stock's performance, is: 15% + 5% = 20%. Similarly,
the probability of the equity outperforming the market is:
15% + 30% + 5% = 50%. Importantly, all of the joint probabilities
add to 100%. Given all the possible events, one of
them must happen.
FIGURE 1-3  Bonds versus stock matrix.

             Outperform   Underperform   Total
Upgrade          15%           5%         20%
No Change        30%          25%         55%
Downgrade         5%          20%         25%
Total            50%          50%        100%

Example 1.6
Question:
You are investigating a second company. As with our previous example, the company has issued both bonds and stock. The bonds can be downgraded, upgraded, or have no change in rating. The stock can either outperform the market or underperform the market. You are given the probability matrix shown in Figure 1-4, which is missing three probabilities, X, Y, and Z. Calculate values for the missing probabilities.

FIGURE 1-4  Bonds versus stock matrix.

             Outperform   Underperform   Total
Upgrade           5%           0%          5%
No Change        40%           Y           Z
Downgrade         X           30%         35%
Total            50%          50%        100%
Answer:
All of the values in the first column must add to 50%, the
probability of the stock outperforming the market; therefore,
we have:
5% + 40% + X = 50%
X=5%
We can check our answer for X by summing across the
third row: 5% + 30% = 35%.
Looking down the second column, we see that Y is equal
to 20%:
0% + Y + 30% = 50%
Y=20%
Finally, knowing that Y = 20%, we can sum across the second
row to get Z:
40% + Y = 40% + 20% = Z
Z=60%
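These consistency checks are mechanical enough to script. A small Python sketch (our own construction, using the Figure 1-4 values) that fills in the missing entries from the row and column constraints:

```python
import numpy as np

# Rows: bond outcome (upgrade, no change, downgrade).
# Cols: stock outcome (outperform, underperform).
# np.nan marks the unknown entries X and Y from Figure 1-4.
joint = np.array([
    [0.05, 0.00],
    [0.40, np.nan],   # Y
    [np.nan, 0.30],   # X
])
col_totals = np.array([0.50, 0.50])

# Each unknown is whatever makes its column sum to the marginal.
X = col_totals[0] - np.nansum(joint[:, 0])
Y = col_totals[1] - np.nansum(joint[:, 1])
joint[2, 0], joint[1, 1] = X, Y

Z = joint[1].sum()  # row total for "no change"
print(f"X = {X:.0%}, Y = {Y:.0%}, Z = {Z:.0%}")  # X = 5%, Y = 20%, Z = 60%
assert np.isclose(joint.sum(), 1.0)  # all joint probabilities add to 100%
```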
CONDITIONAL PROBABILITY
The concept of independence is closely related to the
concept of conditional probability. Rather than trying to
determine the probability of the market being up and
having rain, we can ask, "What is the probability that the
stock market is up given that it is raining?" We can write
this as a conditional probability:
P[market up | rain] = p   (1.14)
The vertical bar signals that the probability of the first
argument is conditional on the second. You would read
Equation (1.14) as "The probability of 'market up' given
'rain' is equal to p."
Using the conditional probability, we can calculate the
probability that it will rain and that the market will be up.
P[market up and rain] = P[market up | rain] · P[rain]   (1.15)
For example, if there is a 10% probability that it will rain
tomorrow and the probability that the market will be up
given that it is raining is 40%, then the probability of rain
and the market being up is 4%: 40% x 10% = 4%.
From a statistics standpoint, it is just as valid to calculate
the probability that it will rain and that the market will be
up as follows:
P[market up and rain] = P[rain | market up] · P[market up]   (1.16)
As we will see in Chapter 4 when we discuss Bayesian
analysis, even though the right-hand sides of Equations
(1.15) and (1.16) are mathematically equivalent, how
we interpret them can often be different.
We can also use conditional probabilities to calculate
unconditional probabilities. On any given day, either it
rains or it does not rain. The probability that the market
will be up, then, is simply the probability of the market
being up when it is raining plus the probability of the market
being up when it is not raining. We have:
P[market up] = P[market up and rain] + P[market up and \overline{rain}]
P[market up] = P[market up | rain] · P[rain] + P[market up | \overline{rain}] · P[\overline{rain}]   (1.17)

Here we have used a line over rain to signify logical negation; \overline{rain} can be read as "not rain."

In general, if a random variable X has n possible values, x_1, x_2, ..., x_n, then the unconditional probability of Y can be calculated as:

P[Y] = \sum_{i=1}^{n} P[Y | x_i] P[x_i]   (1.18)
If the probability of the market being up on a rainy day
is the same as the probability of the market being up on
a day with no rain, then we say that the market is conditionally
independent of rain. If the market is conditionally
independent of rain, then the probability that the market
is up given that it is raining must be equal to the unconditional
probability of the market being up. To see why
this is true, we replace the conditional probability of the
market being up given no rain with the conditional probability
of the market being up given rain in Equation (1.17)
(we can do this because we are assuming that these two
conditional probabilities are equal).
P[market up] = P[market up | rain] · P[rain] + P[market up | rain] · P[\overline{rain}]
P[market up] = P[market up | rain] · (P[rain] + P[\overline{rain}])
P[market up] = P[market up | rain]   (1.19)
In the last line of Equation (1.19), we rely on the fact that
the probability of rain plus the probability of no rain is
equal to one. Either it rains or it does not rain.
In Equation (1.19) we could just as easily have replaced the conditional probability of the market being up given rain with the conditional probability of the market being up given no rain. If the market is conditionally independent of rain, then it is also true that the probability that the market is up given that it is not raining must be equal to the unconditional probability of the market being up:

P[market up] = P[market up | \overline{rain}]   (1.20)
In the previous section, we noted that if the market is independent of rain, then the probability that the market will be up and that it will rain must be equal to the probability of the market being up multiplied by the probability of rain. To see why this must be true, we simply substitute the last line of Equation (1.19) into Equation (1.15):

P[market up and rain] = P[market up | rain] · P[rain]
P[market up and rain] = P[market up] · P[rain]   (1.21)
Remember that Equation (1.21) is true only if the market
being up and rain are independent. If the weather somehow
affects the stock market, however, then the conditional probabilities might not be equal. We could have a situation where:

P[market up | rain] ≠ P[market up | \overline{rain}]   (1.22)

In this case, the weather and the stock market are no longer independent. We can no longer multiply their probabilities together to get their joint probability.
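To close the loop on Equations (1.17) through (1.21), here is a short Python sketch (with made-up illustrative probabilities, not figures from the text) that recovers an unconditional probability from conditional ones and then tests for independence:

```python
# Hypothetical inputs: P[rain] and the conditional probabilities of the
# market being up in each weather state (illustrative values only).
p_rain = 0.10
p_up_given_rain = 0.40
p_up_given_no_rain = 0.55

# Law of total probability, Equation (1.17).
p_up = p_up_given_rain * p_rain + p_up_given_no_rain * (1 - p_rain)
print(f"P[market up] = {p_up:.3f}")  # 0.040 + 0.495 = 0.535

# Joint probability via Equation (1.15).
p_up_and_rain = p_up_given_rain * p_rain
print(f"P[market up and rain] = {p_up_and_rain:.3f}")  # 0.040

# Independence would require P[up and rain] == P[up] * P[rain] (Eq. 1.21).
print(f"independent? {abs(p_up_and_rain - p_up * p_rain) < 1e-12}")  # False
```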

• Learning Objectives
Candidates, after completing this reading, should be
able to:
• Interpret and apply the mean, standard deviation,
and variance of a random variable.
• Calculate the mean, standard deviation, and variance
of a discrete random variable.
• Calculate and interpret the covariance and
correlation between two random variables.
• Calculate the mean and variance of sums of
variables.
• Describe the four central moments of a statistical
variable or distribution: mean, variance, skewness,
and kurtosis.
• Interpret the skewness and kurtosis of a statistical
distribution, and interpret the concepts of
coskewness and cokurtosis.
• Describe and interpret the best linear unbiased
estimator.
Excerpt is Chapter 3 of Mathematics and Statistics for Financial Risk Management, Second Edition, by Michael B. Miller.
In this chapter we will learn how to describe a collection
of data in precise statistical terms. Many of the concepts
will be familiar, but the notation and terminology might
be new.
AVERAGES
Everybody knows what an average is. We come across
averages every day, whether they are earned run averages
in baseball or grade point averages in school. In statistics
there are actually three different types of averages:
means, modes, and medians. By far the most commonly
used average in risk management is the mean.
Population and Sample Data
If you wanted to know the mean age of people working
in your firm, you would simply ask every person in the
firm his or her age, add the ages together, and divide by
the number of people in the firm. Assuming there are n
employees and a_i is the age of the ith employee, then the mean, μ, is simply:

\mu = \frac{1}{n} \sum_{i=1}^{n} a_i = \frac{1}{n}(a_1 + a_2 + \cdots + a_{n-1} + a_n)   (2.1)
It is important at this stage to differentiate between
population statistics and sample statistics. In this example,
μ is the population mean. Assuming nobody lied
about his or her age, and forgetting about rounding
errors and other trivial details, we know the mean age
of the people in your firm exactly. We have a complete
data set of everybody in your firm; we've surveyed the
entire population.
This state of absolute certainty is, unfortunately, quite rare
in finance. More often, we are faced with a situation such
as this: estimate the mean return of stock ABC, given the
most recent year of daily returns. In a situation like this,
we assume there is some underlying data-generating process,
whose statistical properties are constant over time.
The underlying process has a true mean, but we cannot
observe it directly. We can only estimate the true mean
based on our limited data sample. In our example, assuming
n returns, we estimate the mean using the same formula
as before:
\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} r_i = \frac{1}{n}(r_1 + r_2 + \cdots + r_{n-1} + r_n)   (2.2)

where μ̂ (pronounced "mu hat") is our estimate of the true
mean, based on our sample of n returns. We call this the
sample mean.
The median and mode are also types of averages. They
are used less frequently in finance, but both can be useful.
The median represents the center of a group of data;
within the group, half the data points will be less than the
median, and half will be greater. The mode is the value
that occurs most frequently.
Example 2.1
Question:
Calculate the mean, median, and mode of the following
data set:
-20%, -10%, -5%, -5%, 0%, 10%, 10%, 10%, 19%
Answer:
Mean = (1/9)(-20% - 10% - 5% - 5% + 0% + 10% + 10% + 10% + 19%) = 1%
Mode = 10%
Median = 0%
If there is an even number of data points, the median is
found by averaging the two centermost points. In the following
series:
5%, 10%, 20%, 25%
the median is 15%. The median can be useful for summarizing
data that is asymmetrical or contains significant outliers.
A data set can also have more than one mode. If the
maximum frequency is shared by two or more values, all
of those values are considered modes. In the following
example, the modes are 10% and 20%:
5%, 10%, 10%, 10%, 14%, 16%, 20%, 20%, 20%, 24%
In calculating the mean in Equation (2.1) and Equation
(2.2), each data point was counted exactly once. In
certain situations, we might want to give more or less
weight to certain data points. In calculating the average
return of stocks in an equity index, we might want
to give more weight to larger firms, perhaps weighting
their returns in proportion to their market capitalizations.
Given n data points, x_i = x_1, x_2, ..., x_n, with corresponding weights, w_i, we can define the weighted mean, μ_w, as:

\mu_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}   (2.3)
The standard mean from Equation (2.1) can be viewed as
a special case of the weighted mean, where all the values
have equal weight.
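As a concrete illustration (our own, with made-up numbers), the sketch below computes a capitalization-weighted mean return per Equation (2.3) and confirms that equal weights reduce it to the ordinary mean:

```python
import numpy as np

returns = np.array([0.02, -0.01, 0.05])       # hypothetical stock returns
market_caps = np.array([300.0, 100.0, 50.0])  # hypothetical weights ($bn)

# Weighted mean, Equation (2.3): sum(w * x) / sum(w).
weighted = np.average(returns, weights=market_caps)

# With equal weights this is just the standard mean, Equation (2.1).
plain = np.average(returns)

print(f"cap-weighted mean   = {weighted:.4f}")
print(f"equal-weighted mean = {plain:.4f}")
```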
Discrete Random Variables
For a discrete random variable, we can also calculate the
mean, median, and mode. For a random variable, X, with possible values, x_i, and corresponding probabilities, p_i, we define the mean, μ, as:

\mu = \sum_{i=1}^{n} p_i x_i   (2.4)
The equation for the mean of a discrete random variable is
a special case of the weighted mean, where the outcomes
are weighted by their probabilities, and the sum of the
weights is equal to one.
The median of a discrete random variable is the value
such that the probability that a value is less than or equal
to the median is equal to 50%. Working from the other
end of the distribution, we can also define the median
such that 50% of the values are greater than or equal to
the median. For a random variable, X, if we denote the
median as m, we have:
P[X ≥ m] = P[X ≤ m] = 0.50   (2.5)
For a discrete random variable, the mode is the value
associated with the highest probability. As with population
and sample data sets, the mode of a discrete random
variable need not be unique.
Example 2.2
Question:
At the start of the year, a bond portfolio consists of two
bonds, each worth $100. At the end of the year, if a bond
defaults, it will be worth $20. If it does not default, the
bond will be worth $100. The probability that both bonds
default is 20%. The probability that neither bond defaults
is 45%. What are the mean, median, and mode of the
year-end portfolio value?
Answer:
We are given the probability for two outcomes:
P[V = $40] = 20%
P[V = $200] = 45%
At year-end, the value of the portfolio, V, can have only one of three values, and all of the probabilities must sum to 100%. This allows us to calculate the final probability:

P[V = $120] = 100% - 20% - 45% = 35%
The mean of V is then $140:

μ = 0.20 · $40 + 0.35 · $120 + 0.45 · $200 = $140

The mode of the distribution is $200; this is the most likely single outcome. The median of the distribution is $120; half of the outcomes are less than or equal to $120.
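The same computation in a few lines of Python (our own illustration of Equations (2.4) and (2.5)):

```python
import numpy as np

values = np.array([40.0, 120.0, 200.0])  # year-end portfolio values
probs = np.array([0.20, 0.35, 0.45])     # their probabilities

mean = np.dot(probs, values)             # Equation (2.4): sum(p * x)
mode = values[np.argmax(probs)]          # most likely outcome

# Median: smallest value where the cumulative probability reaches 50%.
median = values[np.searchsorted(np.cumsum(probs), 0.50)]

print(f"mean = ${mean:.0f}, median = ${median:.0f}, mode = ${mode:.0f}")
# mean = $140, median = $120, mode = $200
```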
Continuous Random Variables
We can also define the mean, median, and mode for a
continuous random variable. To find the mean of a continuous
random variable, we simply integrate the product of
the variable and its probability density function (PDF). In
the limit, this is equivalent to our approach to calculating
the mean of a discrete random variable. For a continuous
random variable, X, with a PDF, f(x), the mean, μ, is then:

\mu = \int x f(x) dx   (2.6)
The median of a continuous random variable is defined
exactly as it is for a discrete random variable, such that
there is a 50% probability that values are less than or
equal to, or greater than or equal to, the median. If we
define the median as m, then:
\int_{x_{min}}^{m} f(x) dx = \int_{m}^{x_{max}} f(x) dx = 0.50   (2.7)
Alternatively, we can define the median in terms of the
cumulative distribution function. Given the cumulative distribution
function, F(x), and the median, m, we have:
F(m) = 0.50 (2.8)
The mode of a continuous random variable corresponds
to the maximum of the density function. As before, the
mode need not be unique.
Example 2.3
Question:
Using the now-familiar probability density function from
Chapter 1,
f(x) = x/50   s.t. 0 ≤ x ≤ 10
what are the mean, median, and mode of x?
Answer:
As we saw in a previous example, this probability density
function is a triangle, between x = 0 and x = 10, and zero
everywhere else. See Figure 2-1.
For a continuous distribution, the mode corresponds to the maximum of the PDF. By inspection of the graph, we can see that the mode of f(x) is equal to 10.

To calculate the median, we need to find m, such that the integral of f(x) from the lower bound of f(x), zero, to m is equal to 0.50. That is, we need to find:

\int_0^m \frac{x}{50} dx = 0.50
First we solve the left-hand side of the equation:

\int_0^m \frac{x}{50} dx = \frac{1}{50} \left[ \frac{1}{2} x^2 \right]_0^m = \frac{m^2}{100}

FIGURE 2-1  Probability density function.
Setting this result equal to 0.50 and solving for m, we
obtain our final answer:
m²/100 = 0.50
m² = 50
m = √50 ≈ 7.07
In the last step we can ignore the negative root. If we
hadn't calculated the median, looking at the graph it
might be tempting to guess that the median is 5, the
midpoint of the range of the distribution. This is a common
mistake. Because lower values have less weight, the
median ends up being greater than 5.
The mean is approximately 6.67:

\mu = \int_0^{10} x \frac{x}{50} dx = \frac{1}{50} \left[ \frac{1}{3} x^3 \right]_0^{10} = \frac{1000}{150} ≈ 6.67
As with the median, it is a common mistake, based on
inspection of the PDF, to guess that the mean is 5. However,
what the PDF is telling us is that outcomes between
5 and 10 are much more likely than values between 0 and
5 (the PDF is higher between 5 and 10 than between 0
and 5). This is why the mean is greater than 5.
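The median and mean above are easy to verify numerically. A minimal sketch (ours, assuming SciPy is available):

```python
from scipy import integrate, optimize

def pdf(x):
    """f(x) = x/50 on [0, 10]; support handled by the integration bounds."""
    return x / 50.0

# Mean: integrate x * f(x) over the support, per Equation (2.6).
mean, _ = integrate.quad(lambda x: x * pdf(x), 0, 10)

# Median: solve F(m) = 0.5, where F(m) is the integral of f from 0 to m.
def cdf(m):
    return integrate.quad(pdf, 0, m)[0]

median = optimize.brentq(lambda m: cdf(m) - 0.5, 0, 10)

print(f"mean   = {mean:.4f}")    # 6.6667
print(f"median = {median:.4f}")  # 7.0711
```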
EXPECTATIONS
On January 15, 2005, the Huygens space probe
landed on the surface of Titan, the largest moon
of Saturn. This was the culmination of a seven-year-long mission. During its descent and for
over an hour after touching down on the surface,
Huygens sent back detailed images, scientific
readings, and even sounds from a strange
world. There are liquid oceans on Titan, the
landing site was littered with "rocks" composed
of water ice, and weather on the moon includes
methane rain. The Huygens probe was named
after Christiaan Huygens, a Dutch polymath
who first discovered Titan in 1655. In addition
to astronomy and physics, Huygens had more
prosaic interests, including probability theory.
Originally published in Latin in 1657, De Ratiociniis
in Ludo Aleae, or On the Logic of Games
of Chance, was one of the first texts to formally
explore one of the most important concepts in
probability theory, namely expectations.
Like many of his contemporaries, Huygens was interested
in games of chance. As he described it, if a game has a
50% probability of paying $3 and a 50% probability of
paying $7, then this is, in a way, equivalent to having $5
with certainty. This is because we expect, on average, to
win $5 in this game:
50% · $3 + 50% · $7 = $5   (2.9)
As one can already see, the concepts of expectations and
averages are very closely linked. In the current example, if
we play the game only once, there is no chance of winning
exactly $5; we can win only $3 or $7. Still, even if we play
the game only once, we say that the expected value of the
game is $5. That we are talking about the mean of all the
potential payouts is understood.
We can express the concept of expectations more formally
using the expectation operator. We could state that
the random variable, X, has an expected value of $5 as
follows:
E[X] = 0.50 · $3 + 0.50 · $7 = $5   (2.10)

where E[·] is the expectation operator.¹
In this example, the mean and the expected value have
the same numeric value, $5. The same is true for discrete
and continuous random variables. The expected value
of a random variable is equal to the mean of the random
variable.
While the value of the mean and the expected value may
be the same in many situations, the two concepts are not
exactly the same. In many situations in finance and risk
management, the terms can be used interchangeably. The
difference is often subtle.
As the name suggests, expectations are often thought
of as being forward looking. Pretend we have a financial
asset for which next year's mean annual return is known
and equal to 15%. This is not an estimate; in this hypothetical
scenario, we actually know that the mean is 15%. We
say that the expected value of the return next year is 15%.
We expect the return to be 15%, because the probabilityweighted
mean of all the possible outcomes is 15%.
¹ Those of you with a background in physics might be more familiar with the term expectation value and the notation ⟨X⟩ rather than E[X]. This is a matter of convention. Throughout this book we use the term expected value and E[·], which are currently more popular in finance and econometrics. Risk managers should be familiar with both conventions.
Now pretend that we don't actually know what the mean
return of the asset is, but we have 10 years' worth of historical
data for which the mean is 15%. In this case the
expected value may or may not be 15%. If we decide that
the expected value is equal to 15%, based on the data,
then we are making two assumptions: first, we are assuming
that the returns in our sample were generated by the
same random process over the entire sample period; second,
we are assuming that the returns will continue to be
generated by this same process in the future. These are
very strong assumptions. If we have other information that
leads us to believe that one or both of these assumptions
are false, then we may decide that the expected value is
something other than 15%. In finance and risk management,
we often assume that the data we are interested in
are being generated by a consistent, unchanging process.
Testing the validity of this assumption can be an important
part of risk management in practice.
The concept of expectations is also a much more general
concept than the concept of the mean. Using the expectation
operator, we can derive the expected value of functions
of random variables. As we will see in subsequent
sections, the concept of expectations underpins the definitions
of other population statistics (variance, skewness,
kurtosis), and is important in understanding regression
analysis and time series analysis. In these cases, even when
we could use the mean to describe a calculation, in practice
we tend to talk exclusively in terms of expectations.
Example 2.4
Question:
At the start of the year, you are asked to price a newly
issued zero coupon bond. The bond has a notional of
$100. You believe there is a 20% chance that the bond will
default, in which case it will be worth $40 at the end of
the year. There is also a 30% chance that the bond will be
downgraded, in which case it will be worth $90 in a year's
time. If the bond does not default and is not downgraded,
it will be worth $100. Use a continuous interest rate of 5%
to determine the current price of the bond.
Answer:
We first need to determine the expected future value of
the bond-that is, the expected value of the bond in one
year's time. We are given the following:
P[V_{t+1} = $40] = 0.20
P[V_{t+1} = $90] = 0.30
Because there are only three possible outcomes, the probability
of no downgrade and no default must be 50%:
P[V_{t+1} = $100] = 1 - 0.20 - 0.30 = 0.50

The expected value of the bond in one year is then:

E[V_{t+1}] = 0.20 · $40 + 0.30 · $90 + 0.50 · $100 = $85

To get the current price of the bond we then discount this expected future value:

E[V_t] = e^{-0.05} E[V_{t+1}] = e^{-0.05} · $85 = $80.85
The current price of the bond, in this case $80.85, is
often referred to as the present value or fair value of the
bond. The price is considered fair because the discounted
expected value of the bond is the price that a risk-neutral
investor would pay for the bond.
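The same pricing logic in a few lines of Python (a sketch of this example only; real bond pricing involves considerably more than this):

```python
import math

# Year-end value scenarios and probabilities from Example 2.4.
scenarios = {40.0: 0.20, 90.0: 0.30, 100.0: 0.50}
rate = 0.05  # continuously compounded discount rate

# Expected future value: probability-weighted average of the outcomes.
expected_value = sum(v * p for v, p in scenarios.items())

# Discount back one year at the continuous rate.
price = math.exp(-rate) * expected_value

print(f"E[V_t+1] = ${expected_value:.2f}")  # $85.00
print(f"price    = ${price:.2f}")           # $80.85
```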
The expectation operator is linear. That is, for two random
variables, X and Y, and a constant, c, the following two
equations are true:
E[X + Y] = E[X] + E[Y]
E[cX] = cE[X] (2.11)
If the expected value of one option, A, is $10, and the
expected value of option B is $20, then the expected
value of a portfolio containing A and B is $30, and the
expected value of a portfolio containing five contracts of
option A is $50.
Be very careful, though; the expectation operator is not
multiplicative. The expected value of the product of two
random variables is not necessarily the same as the product
of their expected values:
E[XY] ≠ E[X]E[Y]   (2.12)
Imagine we have two binary options. Each pays either
$100 or nothing, depending on the value of some underlying
asset at expiration. The probability of receiving $100 is
50% for both options. Further, assume that it is always the
case that if the first option pays $100, the second pays $0,
and vice versa. The expected value of each option separately
is clearly $50. If we denote the payout of the first
option as X and the payout of the second as Y, we have:
E[X] = E[Y] = 0.50 · $100 + 0.50 · $0 = $50   (2.13)
It follows that E[X]E[Y] = $50 x $50 = $2,500. In each
scenario, though, one option is valued at zero, so the
product of the payouts is always zero: $100 · $0 = $0 · $100 = $0. The expected value of the product of the two
option payouts is:
E[XY] = 0.50 · $100 · $0 + 0.50 · $0 · $100 = $0   (2.14)

In this case, the product of the expected values and the expected value of the product are clearly not equal. When E[XY] = E[X]E[Y], we say that X and Y are uncorrelated; independent variables always satisfy this condition, but the converse need not hold.
If the expected value of the product of two variables does
not necessarily equal the product of the expectations of
those variables, it follows that the expected value of the
product of a variable with itself does not necessarily equal
the product of the expectation of that variable with itself;
that is:
E[X²] ≠ E[X]²   (2.15)
Imagine we have a fair coin. Assign heads a value of +1 and tails a value of -1. We can write the probabilities of the outcomes as follows:

P[X = +1] = P[X = -1] = 0.50   (2.16)

The expected value of any coin flip is zero, but the expected value of X² is +1, not zero:

E[X] = 0.50 · (+1) + 0.50 · (-1) = 0
E[X]² = 0² = 0
E[X²] = 0.50 · (+1)² + 0.50 · (-1)² = 1   (2.17)
As simple as this example is, this distinction is very important.
As we will see, the difference between E[X²] and E[X]² is central to our definition of variance and standard deviation.
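A short simulation (our own sketch, not from the text) makes the distinction concrete:

```python
import numpy as np

rng = np.random.default_rng(seed=7)
x = rng.choice([+1, -1], size=1_000_000)  # fair coin: +1 heads, -1 tails

print(f"E[X]   ~ {x.mean():+.4f}")    # close to 0
print(f"E[X]^2 ~ {x.mean() ** 2:.4f}")  # close to 0
print(f"E[X^2] ~ {(x ** 2).mean():.4f}")  # exactly 1 here, since X^2 = 1
```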
Example 2.5
Question:
Given the following equation,

y = (x + 5)³ + x² + 10x

what is the expected value of y? Assume the following:

E[x] = 4
E[x²] = 9
E[x³] = 12

Answer:
Note that E[x²] and E[x³] cannot be derived from knowledge of E[x]. In this problem, E[x²] ≠ E[x]² and E[x³] ≠ E[x]³.
To find the expected value of y, then, we first expand the
term (x + 5)³ within the expectation operator:

E[y] = E[(x + 5)³ + x² + 10x] = E[x³ + 16x² + 85x + 125]
Because the expectation operator is linear, we can separate
the terms in the summation and move the constants
outside the expectation operator:
E[y] = E[x³] + E[16x²] + E[85x] + E[125]
     = E[x³] + 16E[x²] + 85E[x] + 125
At this point, we can substitute in the values for E[x], E[x²], and E[x³], which were given at the start of the exercise:

E[y] = 12 + 16 · 9 + 85 · 4 + 125 = 621
This gives us the final answer, 621.
VARIANCE AND STANDARD
DEVIATION
The variance of a random variable measures how noisy or
unpredictable that random variable is. Variance is defined as the expected value of the squared difference between the variable and its mean:

σ² = E[(X - μ)²]   (2.18)

where σ² is the variance of the random variable X with mean μ.
The square root of variance, typically denoted by σ, is
called standard deviation. In finance we often refer to
standard deviation as volatility. This is analogous to referring
to the mean as the average. Standard deviation is a
mathematically precise term, whereas volatility is a more
general concept.
Example 2.6
Question:
A derivative has a 50/50 chance of being worth either
+ 10 or -10 at expiry. What is the standard deviation of the
derivative's value?
Answer:
μ = 0.50 · 10 + 0.50 · (-10) = 0
σ² = 0.50 · (10 - 0)² + 0.50 · (-10 - 0)² = 0.5 · 100 + 0.5 · 100 = 100
σ = 10
In the previous example, we were calculating the population
variance and standard deviation. All of the possible
outcomes for the derivative were known.
To calculate the sample variance of a random variable X
based on n observations, x_1, x_2, ..., x_n, we can use the following formula:

\hat{\sigma}_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \hat{\mu}_x)^2, \qquad E[\hat{\sigma}_x^2] = \sigma_x^2   (2.19)
where μ̂_x is the sample mean as in Equation (2.2). Given that we have n data points, it might seem odd that we are dividing the sum by (n - 1) and not n. The reason has to do with the fact that μ̂_x itself is an estimate of the true mean, which also contains a fraction of each x_i.
It turns out that dividing by (n - 1), not n, produces an
unbiased estimate of (J'2. If the mean is known or we are
calculating the population variance, then we divide by
n. If instead the mean is also being estimated, then we
divide by n - 1.
Equation (2.18) can easily be rearranged as follows (the proof of this equation is also left as an exercise):

σ² = E[X²] - μ² = E[X²] - E[X]²   (2.20)

Note that variance can be nonzero only if E[X²] ≠ E[X]².
When writing computer programs, this last version of the
variance formula is often useful, since it allows us to calculate the mean and the variance in the same loop.
In finance it is often convenient to assume that the mean
of a random variable is equal to zero. For example, based
on theory, we might expect the spread between two
equity indexes to have a mean of zero in the long run. In
this case, the variance is simply the mean of the squared
returns.
Example 2.7
Question:
Assume that the mean of daily Standard & Poor's (S&P)
500 Index returns is zero. You observe the following
returns over the course of 10 days:
Estimate the standard deviation of daily S&P 500 Index
returns.
