34. Elementary Asset Pricing Theory

34.1. Overview

This lecture summarizes the heart of applied asset-pricing theory.

From a single equation, we’ll derive

  • a mean-variance frontier

  • a single-factor model of excess returns on each member of a collection of assets

To do this, we use two ideas:

  • an asset pricing equation

  • a Cauchy-Schwarz inequality

For background and basic concepts, see our lecture orthogonal projections and their applications.

As a sequel to the material here, please see our lecture two modifications of mean-variance portfolio theory.

34.2. Key Equation

We begin with a key asset pricing equation:

(34.1)\[ E m R^i = 1 \]

for \(i=1, \ldots, I\) and where

\[\begin{split} \begin{aligned} m &=\text { stochastic discount factor } \\ R^{i} &= \text {random gross return on asset } i \\ E &\sim \text { mathematical expectation } \end{aligned} \end{split}\]

The random gross returns \(R^i\) and the scalar stochastic discount factor \(m\) live in a common probability space.

[HR87] and [HJ91] explain how the existence of a scalar stochastic discount factor that satisfies equation (34.1) is implied by a law of one price, which requires that all portfolios of assets with the same payouts have the same price.

They also explain how the absence of an arbitrage implies that the stochastic discount factor \(m \geq 0\).
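To make the existence claim concrete, here is a minimal sketch on a finite state space. The probabilities, payoffs, and prices are made-up numbers, and the construction \(m^* = p' E[x x']^{-1} x\) is the standard projection of a discount factor onto the space of traded payoffs. It illustrates equation (34.1) in price-payoff form, \(E[m x^i] = p^i\), and is not meant to reproduce the arguments of [HR87].

```python
import numpy as np

# Hypothetical example: 4 states of the world, 2 traded assets
π = np.array([0.1, 0.4, 0.3, 0.2])        # state probabilities (assumed)
X = np.array([[1.0, 1.0, 1.0, 1.0],       # payoffs of asset 1 (risk-free)
              [0.5, 0.9, 1.2, 1.6]])      # payoffs of asset 2 (risky)
p = np.array([0.96, 1.00])                # assumed prices of the two assets

# Second-moment matrix E[x x'] under the probabilities π
Exx = (X * π) @ X.T

# Discount factor that is a linear combination of payoffs: m* = b' x
b = np.linalg.solve(Exx, p)
m_star = b @ X                            # value of m* in each state

# m* prices both assets: E[m* x^i] = p^i
implied_prices = (X * π) @ m_star

# Gross returns R^i = x^i / p^i then satisfy E[m* R^i] = 1
R = X / p[:, None]
E_mR = (R * π) @ m_star
print(implied_prices, E_mR)
```

Nothing in this construction forces \(m^*\) to be nonnegative in every state; that property comes from the separate no-arbitrage argument mentioned above.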

34.3. Implications of Key Equation

We combine key equation (34.1) with a remark of Lars Peter Hansen that “asset pricing theory is all about covariances”.


Lars Hansen’s remark is a concise summary of ideas in [HR87] and [HJ91]. For other important foundations of these ideas, see [Ros76], [Ros78], [HK79], [Kre81], and [CR83].

By that remark, Lars Hansen meant that interesting restrictions can be deduced by recognizing that \(E m R^i\) is a component of the covariance between \(m \) and \(R^i\) and then using that fact to rearrange key equation (34.1).

Let’s do this step by step.

First note that the definition \(\operatorname{cov}\left(m, R^{i}\right) = E (m - E m)(R^i - E R^i) \) of a covariance implies that

\[ E m R^i = E m E R^{i}+\operatorname{cov}\left(m, R^{i}\right) \]

Substituting this result into key equation (34.1) gives

(34.2)\[ 1 = E m E R^{i}+\operatorname{cov}\left(m, R^{i}\right) \]

Next note that for a risk-free asset with non-random gross return \(R^f\), equation (34.1) becomes

\[ 1 = E R^f m = R^f E m. \]

This is true because we can pull the constant \(R^f\) outside the mathematical expectation.

It follows that the gross return on a risk-free asset is

\[ R^{f} = 1 / E(m) \]

Using this formula for \(R^f\) in equation (34.2) and rearranging, it follows that

\[ R^{f} = E R^{i}+\operatorname{cov}\left(m, R^{i} \right) R^{f} \]

which can be rearranged to become

\[ E R^i = R^{f}-\operatorname{cov}\left(m, R^{i}\right) R^{f} . \]

It follows that we can express an excess return \(E R^{i}-R^{f}\) on asset \(i\) relative to the risk-free rate as

(34.3)\[ E R^{i}-R^{f} = -\operatorname{cov}\left(m, R^{i}\right) R^{f} \]

Equation (34.3) can be rearranged to display important parts of asset pricing theory.
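As a quick numerical check of equation (34.3), the sketch below uses a small discrete probability space. The state probabilities, the values of \(m\), and the payoff are assumptions chosen only for illustration; the return is constructed so that \(E m R = 1\) holds exactly.

```python
import numpy as np

# Hypothetical discrete probability space with 5 states
π = np.array([0.1, 0.2, 0.4, 0.2, 0.1])        # state probabilities (assumed)
m = np.array([1.30, 1.10, 0.95, 0.85, 0.70])   # assumed positive discount factor
x = np.array([0.6, 0.9, 1.0, 1.2, 1.5])        # assumed payoff of a risky asset


def E(z):
    """Mathematical expectation under the probabilities π."""
    return z @ π


price = E(m * x)     # price implied by the discount factor
R = x / price        # gross return, so that E[m R] = 1
Rf = 1 / E(m)        # risk-free gross return

cov_mR = E(m * R) - E(m) * E(R)

lhs = E(R) - Rf      # expected excess return
rhs = -cov_mR * Rf   # right side of (34.3)
print(lhs, rhs)
```

Because this payoff is low in states where \(m\) is high, the covariance is negative and the asset earns a positive expected excess return.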

34.4. Expected Return - Beta Representation

We can obtain the celebrated expected-return-beta representation for gross return \(R^i\) simply by rearranging excess return equation (34.3) to become

\[ E R^{i}=R^{f}+\left(\underbrace{\frac{\operatorname{cov}\left(R^{i}, m\right)}{\operatorname{var}(m)}}_{\quad\quad\beta_{i,m} = \text{regression coefficient}}\right)\left(\underbrace{-\frac{\operatorname{var}(m)}{E(m)}}_{\quad\lambda_{m} = \text{price of risk}}\right) \]


(34.4)\[ E R^{i}=R^{f}+\beta_{i, m} \lambda_{m} \]


  • \(\beta_{i,m}\) is a (population) least squares regression coefficient of gross return \(R^i\) on stochastic discount factor \(m\), an object that is often called asset \(i\)’s beta

  • \(\lambda_m\) is minus the variance of \(m\) divided by the mean of \(m\), an object that is often called the price of risk.
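The decomposition into \(\beta_{i,m}\) and \(\lambda_m\) can be verified numerically. The following sketch assumes a made-up five-state probability space and two hypothetical payoffs; it computes the population regression coefficient and the price of risk and confirms that representation (34.4) reproduces each expected return.

```python
import numpy as np

# Hypothetical discrete probability space with 5 states
π = np.array([0.1, 0.2, 0.4, 0.2, 0.1])        # state probabilities (assumed)
m = np.array([1.30, 1.10, 0.95, 0.85, 0.70])   # assumed discount factor
X = np.array([[0.6, 0.9, 1.0, 1.2, 1.5],       # payoff that is low when m is high
              [1.4, 1.1, 1.0, 0.9, 0.8]])      # payoff that is high when m is high


def E(z):
    """Expectation under π; works for one series or a stack of series."""
    return z @ π


prices = E(m * X)
R = X / prices[:, None]          # gross returns with E[m R^i] = 1
Rf = 1 / E(m)

var_m = E(m**2) - E(m)**2
β = (E(m * R) - E(m) * E(R)) / var_m    # regression coefficients of R^i on m
λ_m = -var_m / E(m)                     # price of risk

print(E(R), Rf + β * λ_m)
```

The first asset has a negative beta and an expected return above \(R^f\); the second has a positive beta and an expected return below \(R^f\).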

To interpret this representation it helps to provide the following widely used example.


A popular model of \(m\) is

\[ m_{t+1} = \exp(-\rho) \exp(- \gamma(c_{t+1} - c_t)) \]

where \( \rho > 0\), \(\gamma > 0\), and the log of consumption growth is governed by

\[ c_{t+1} - c_t = \mu + \sigma_c \epsilon_{t+1} \]

where \(\epsilon_{t+1} \sim {\mathcal N}(0,1)\).


Here

  • \(\gamma >0\) is a coefficient of relative risk aversion

  • \(\rho >0 \) is a fixed intertemporal discount rate

Substituting the consumption growth process into the formula for \(m_{t+1}\) gives

\[ m_{t+1} = \exp(-\rho) \exp( - \gamma \mu - \gamma \sigma_c \epsilon_{t+1}) \]

In this case

\[ E m_{t+1} = \exp(-\rho) \exp \left( - \gamma \mu + \frac{\sigma_c^2 \gamma^2}{2} \right) \]


\[ \operatorname{var}(m_{t+1}) = \left[E (m_{t+1})\right]^2 \left[ \exp(\sigma_c^2 \gamma^2) - 1 \right] \]
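A short Monte Carlo experiment can confirm these moments. The parameter values below are assumptions picked for illustration, and the closed forms are the usual lognormal formulas: the mean is \(\exp(-\rho - \gamma\mu + \sigma_c^2\gamma^2/2)\) and the variance is the squared mean times \(\exp(\sigma_c^2\gamma^2) - 1\).

```python
import numpy as np

# Assumed illustrative parameter values
ρ, γ, μ, σ_c = 0.01, 2.0, 0.02, 0.02

# Simulate the stochastic discount factor
rng = np.random.default_rng(0)
ϵ = rng.standard_normal(1_000_000)
m = np.exp(-ρ) * np.exp(-γ * μ - γ * σ_c * ϵ)

# Closed-form lognormal moments
Em = np.exp(-ρ) * np.exp(-γ * μ + σ_c**2 * γ**2 / 2)
var_m = Em**2 * (np.exp(σ_c**2 * γ**2) - 1)

print(m.mean(), Em)     # sample mean vs. closed form
print(m.var(), var_m)   # sample variance vs. closed form
```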

When \(\gamma >0\), it is true that

  • when consumption growth is high, \(m\) is low

  • when consumption growth is low, \(m\) is high

According to representation (34.4), an asset whose return \(R^i\) tends to be high when consumption growth is low has a positive \(\beta_{i,m}\) and a low expected return.

  • because it has a high gross return when consumption growth is low, it is a good hedge against consumption risk. That justifies its low average return

An asset whose return \(R^i\) tends to be low when consumption growth is low has a negative \(\beta_{i,m}\) and a high expected return.

  • because it has a low gross return when consumption growth is low, it is a poor hedge against consumption risk. That justifies its high average return
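The hedging logic can be illustrated by simulation. In this sketch the parameter values and the two payoffs (`"hedge"` and `"procyclical"`) are hypothetical; each payoff is priced with the simulated \(m\) so that the sample analogue of \(E m R = 1\) holds exactly, which makes representation (34.4) hold exactly in the sample as well.

```python
import numpy as np

ρ, γ, μ, σ_c = 0.01, 2.0, 0.02, 0.02       # assumed illustrative values

rng = np.random.default_rng(1)
ϵ = rng.standard_normal(200_000)
growth = μ + σ_c * ϵ                        # log consumption growth
m = np.exp(-ρ - γ * growth)                 # stochastic discount factor

# Two hypothetical payoffs: one high in bad times, one high in good times
payoffs = {"hedge": np.exp(-0.1 * ϵ), "procyclical": np.exp(0.1 * ϵ)}

Rf = 1 / m.mean()
λ_m = -m.var() / m.mean()                   # price of risk (negative)

results = {}
for name, x in payoffs.items():
    R = x / np.mean(m * x)                  # price with m so that sample E[mR] = 1
    β = np.cov(R, m, ddof=0)[0, 1] / m.var()
    results[name] = (β, R.mean())
    print(f"{name}: β = {β:.3f}, E R = {R.mean():.4f}, "
          f"R^f + βλ = {Rf + β * λ_m:.4f}")
```

The asset that pays off when consumption growth is low has a positive beta and an average return below \(R^f\), while the procyclical asset has a negative beta and an average return above \(R^f\).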

34.5. Mean-Variance Frontier

Now we’ll derive the celebrated mean-variance frontier.

We do this using a classic method of Lars Peter Hansen and Scott Richard [HR87].


Methods of Hansen and Richard are described and used extensively by [Coc05].

Their idea was to rearrange the key equation (34.1), namely, \(E m R^i = 1\), and then to apply the Cauchy-Schwarz inequality.

A convenient way to remember the Cauchy-Schwarz inequality in our context is that it says that an \(R^2\) in any regression has to be less than or equal to \(1\).

Let’s apply that idea to deduce

(34.5)\[ 1= E\left(m R^{i}\right)=E(m) E\left(R^{i}\right)+\rho_{m, R^{i}} \sigma(m) \sigma\left(R^{i}\right) \]

where \(\rho_{m, R^i}\) is the correlation coefficient defined as

\[ \rho_{m, R^i} \equiv \frac{\operatorname{cov}\left(m, R^{i}\right)}{\sigma(m) \sigma\left(R^{i}\right)} \]

and where \(\sigma\) denotes the standard deviation of the variable in parentheses.

Dividing equation (34.5) by \(E(m)\) and using \(R^f = 1/E(m)\) implies

\[ E R^{i}=R^{f}-\rho_{m, R^i} \frac{\sigma(m)}{E(m)} \sigma\left(R^{i}\right) \]

Because \(|\rho_{m, R^i}| \leq 1\), it follows that

(34.6)\[ \left|E R^i-R^{f}\right| \leqslant \frac{\sigma(m)}{E(m)} \sigma\left(R^{i}\right) \]

Inequality (34.6) delineates a mean-variance frontier.

(Actually, it looks more like a mean-standard-deviation frontier.)

Evidently, points on the frontier correspond to gross returns that are perfectly correlated (either positively or negatively) with the stochastic discount factor \(m\).

We summarize this observation as

\[\begin{split} \rho_{m, R^{i}}=\left\{\begin{array}{ll} +1 & \implies R^i \text { is on the lower frontier } \\ -1 & \implies R^i \text { is on the upper frontier } \end{array}\right. \end{split}\]
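The bound (34.6) and the role of perfect correlation can be checked on a small example. The sketch below assumes a made-up five-state probability space, draws random positive payoffs, prices them with \(m\), and confirms that every resulting return lies inside the frontier. It then builds a return from a payoff that is affine and decreasing in \(m\) (the constant 2.0 is an arbitrary choice that keeps the price positive) and confirms that it sits exactly on the upper frontier.

```python
import numpy as np

π = np.array([0.1, 0.2, 0.4, 0.2, 0.1])        # state probabilities (assumed)
m = np.array([1.30, 1.10, 0.95, 0.85, 0.70])   # assumed discount factor


def E(z):
    """Expectation under π; works for one series or a stack of series."""
    return z @ π


def sd(z):
    """Standard deviation under π."""
    return np.sqrt(E(z**2) - E(z)**2)


Rf = 1 / E(m)
slope = sd(m) / E(m)                # σ(m)/E(m), the slope of the frontier

# Random positive payoffs, turned into returns with E[m R] = 1
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 1.5, size=(1000, 5))
R = X / E(m * X)[:, None]

gap = slope * sd(R) - np.abs(E(R) - Rf)   # nonnegative by (34.6)

# A payoff that is affine and decreasing in m gives ρ = -1 (upper frontier)
x_mv = 2.0 - m
R_mv = x_mv / E(m * x_mv)
gap_mv = slope * sd(R_mv) - (E(R_mv) - Rf)
print(gap.min(), gap_mv)
```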

The image below illustrates a mean-variance frontier.


The mathematical structure of the mean-variance frontier described by inequality (34.6) implies that

  • all returns on the frontier are perfectly correlated.


    • Let \(R^m, R^{mv}\) be two returns on the frontier.

    • Then for some scalar \(a\)

    • \(R^{m v}=R^{f}+a\left(R^{m}-R^{f}\right)\)

    This is an exact equation with no residual.

  • each return \(R^i\) that is on the mean-variance frontier is perfectly correlated with \(m\)

    • \(\left(\rho_{m, R^{i}}=-1\right) \Rightarrow \begin{cases} m=a+b R^{m v} \\ R^{m v}=e+d m \end{cases}\) for some scalars \(a, b, e, d\),

    Therefore, any return on the mean-variance frontier other than \(R^f\) delivers a legitimate stochastic discount factor through an affine transformation \(m = a + b R^{mv}\).

  • for any mean-variance-efficient return \(R^{m v}\) that is on the frontier but that is not \(R^{f}\), there exists a single-beta representation for any return \(R^i\) that takes the form:

(34.7)\[ E R^{i}=R^{f}+\beta_{i, R^{m v}}\left[E\left(R^{m v}\right)-R^{f}\right] \]

  • The special case of a single-beta representation (34.7) with \( R^{i}=R^{m v}\) is

    \(E R^{m v}=R^{f}+1 \cdot\left[E\left(R^{m v}\right)-R^{f}\right] \)
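The single-beta representation (34.7) can also be verified numerically. This sketch assumes a made-up five-state probability space, constructs a frontier return from a payoff that is affine in \(m\), and checks both that \(m\) is an exact affine function of that return and that (34.7) holds for an arbitrary hypothetical asset.

```python
import numpy as np

π = np.array([0.1, 0.2, 0.4, 0.2, 0.1])        # state probabilities (assumed)
m = np.array([1.30, 1.10, 0.95, 0.85, 0.70])   # assumed discount factor


def E(z):
    return z @ π


def cov(y, z):
    return E(y * z) - E(y) * E(z)


Rf = 1 / E(m)

# A frontier return: payoff affine in m, so that ρ_{m, R^mv} = -1
x_mv = 2.0 - m
price_mv = E(m * x_mv)
R_mv = x_mv / price_mv

# m is then an exact affine function of R^mv: m = a + b R^mv
a, b = 2.0, -price_mv

# An arbitrary hypothetical asset priced by m
x_i = np.array([0.7, 1.3, 0.9, 1.1, 1.4])
R_i = x_i / E(m * x_i)

β_i = cov(R_i, R_mv) / cov(R_mv, R_mv)   # regression coefficient on R^mv
lhs = E(R_i)
rhs = Rf + β_i * (E(R_mv) - Rf)          # right side of (34.7)
print(lhs, rhs)
```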

34.6. Empirical Implementations

We briefly describe empirical implementations of multi-factor generalizations of the single-factor model described above.

The single-beta representation (34.7) is the special case with just a single factor.

Two representations are often used in empirical work.

One is a time-series regression of gross return \(R_t^i\) on multiple risk factors \(f_t^j, j = a, b, \ldots\), designed to uncover the exposure of return \(R^i\) to each factor:

\[\begin{split} R_{t}^{i}=a_{i}+\beta_{i, a} f_{t}^{a}+\beta_{i, b} f_{t}^{b}+\ldots+\epsilon_{t}^{i}, \quad t=1,2, \ldots, T\\ \epsilon_{t}^{i} \perp f_{t}^{j}, i=1,2, \ldots, I; j = a, b, \ldots \end{split}\]

For example:

  • a popular single-factor model specifies the single factor \(f_t\) to be the return on the market portfolio

  • another popular single-factor model, called the consumption-based model, specifies the factor to be \( m_{t+1} = \beta \frac{u^{\prime}\left(c_{t+1}\right)}{u^{\prime}\left(c_{t}\right)}\), where \(c_t\) is a representative consumer’s time \(t\) consumption.

Model objects are interpreted as follows:

  • \(\beta_{i,a}\) is the exposure of return \(R^i\) to factor \(f^a\) risk

  • \(\lambda_{a}\) is the price of exposure to factor \(f^a\) risk

The other representation entails a cross-section regression of average returns \(E R^i\) for assets \(i =1, 2, \ldots, I\) on the betas \(\beta_{i,j}\), with the prices of risk \(\lambda_j\) for \(j =a, b, c, \ldots\) as regression coefficients.

Here is the regression specification:

\[ E R^{i} =\gamma+\beta_{i, a} \lambda_{a}+\beta_{i, b} \lambda_{b}+\cdots \]

Testing strategies:

Time-series and cross-section regressions play roles in both estimating and testing beta representation models.

The basic idea is to implement the following two steps.

Step 1:

  • Estimate \(a_{i}, \beta_{i, a}, \beta_{i, b}, \cdots\) by running a time series regression: \(R_{t}^{i}\) on a constant and \(f_{t}^{a}, f_{t}^{b}, \ldots\)

Step 2:

  • take the \(\beta_{i, j}\)’s estimated in step one as regressors together with data on average returns \(E R^i\) over some period and then estimate the cross-section regression

\[ \underbrace{E\left(R^{i}\right)}_{\text{average return over time series}}=\gamma+\underbrace{\beta_{i, a}}_{\text{regressor}\quad} \underbrace{\lambda_{a}}_{\text{regression coefficient}}+\underbrace{\beta_{i, b}}_{\text{regressor}\quad} \underbrace{\lambda_{b}}_{\text{regression coefficient}}+\cdots+\underbrace{\alpha_{i}}_{\text{pricing errors}}, \quad i=1, \ldots, I; \quad \underbrace{\alpha_i \perp \beta_{i,j},j = a, b, \ldots}_{\text{least squares orthogonality condition}} \]

  • estimate \(\gamma, \lambda_{a}, \lambda_{b}, \ldots\) by an appropriate regression technique, taking into account that the regressors were themselves estimated in step 1.

Note that presumably the risk-free return \(E R^{f}=\gamma\).

For excess returns \(R^{ei} = R^i - R^f\) we have

\[ E R^{e i}=\beta_{i, a} \lambda_{a}+\beta_{i, b} \lambda_{b}+\cdots+\alpha_{i}, i=1, \ldots, I \]
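The two steps can be sketched with simulated data. Everything about the data-generating process below is an assumption made for illustration: a single factor that is itself an excess return with mean `λ_true`, ten assets with evenly spaced betas, and i.i.d. normal shocks. The sketch produces point estimates only; it does not adjust standard errors for the fact that the step 2 regressors are estimated.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 5000, 10

# Assumed data-generating process with one factor
λ_true, σ_factor, σ_idio = 0.06, 0.08, 0.04
β_true = np.linspace(0.2, 2.0, N)

f = λ_true + σ_factor * rng.standard_normal(T)           # factor realizations
Re = β_true[:, None] * f + σ_idio * rng.standard_normal((N, T))  # excess returns

# Step 1: time-series regression of each excess return on a constant and f
Z = np.column_stack([np.ones(T), f])                     # T x 2 regressors
coef, *_ = np.linalg.lstsq(Z, Re.T, rcond=None)          # 2 x N coefficients
a_hat, β_hat = coef

# Step 2: cross-section regression of average excess returns on the betas
Re_bar = Re.mean(axis=1)
λ_hat = (β_hat @ Re_bar) / (β_hat @ β_hat)               # no constant term
α_hat = Re_bar - β_hat * λ_hat                           # pricing errors
print(λ_hat, np.abs(α_hat).max())
```

With excess returns the cross-section regression omits the constant \(\gamma\), matching the last equation above.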

34.7. Exercises

Let’s start with some imports.

import numpy as np
import statsmodels.api as sm
from statsmodels.sandbox.regression.gmm import GMM
import matplotlib.pyplot as plt
%matplotlib inline

Lots of our calculations will involve computing population and sample OLS regressions.

So we define a function for simple univariate OLS regression that calls the OLS routine from statsmodels.

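def simple_ols(X, Y, constant=False):
    """Regress Y on X and return the last coefficient and the residual std. dev."""
    if constant:
        X = sm.add_constant(X)

    model = sm.OLS(Y, X)
    res = model.fit()

    β_hat = res.params[-1]  # slope, or the sample mean when X is a vector of ones
    σ_hat = np.sqrt(res.resid @ res.resid / res.df_resid)  # residual std. dev.

    return β_hat, σ_hat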

34.7.1. Exercise 1

Look at the equation,

\[ R^i_t - R^f = \beta_{i, R^m} (R^m_t - R^f) + \sigma_i \varepsilon_{i, t}. \]

Verify that this equation is a regression equation.

34.7.2. Exercise 2

Give a formula for the regression coefficient \(\beta_{i, R^m}\).

34.7.3. Exercise 3

Recall our earlier discussions of a direct problem and an inverse problem.

  • A direct problem is about simulating a particular model.

  • An inverse problem is about using data to estimate or choose a particular model from a manifold of models.

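Please assume the parameter values set below and then simulate 2000 observations from the theory specified above for 5 assets, \(i = 1, \ldots, 5\).

Here the risk-free return is \(R^f_t = E[R^f] + \sigma_f e_t\), the market excess return is \(R^m_t - R^f_t = \xi + \lambda u_t\), and \(e_t, u_t, \varepsilon_{i,t}\) are mutually independent standard normal shocks, as in the solution code.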

\[\begin{align*} E\left[R^f\right] &= 0.02 \\ \sigma_f &= 0.00 \\ \xi &= 0.06 \\ \lambda &= 0.08 \\ \beta_{1, R^m} &= 0.2 \\ \sigma_1 &= 0.04 \\ \beta_{2, R^m} &= 0.4 \\ \sigma_2 &= 0.04 \\ \beta_{3, R^m} &= 0.6 \\ \sigma_3 &= 0.04 \\ \beta_{4, R^m} &= 0.8 \\ \sigma_4 &= 0.04 \\ \beta_{5, R^m} &= 1.0 \\ \sigma_5 &= 0.04 \end{align*}\]

More Exercises

Now come some even more fun parts!

Our theory implies that there exist values of two scalars, \(a\) and \(b\), such that a legitimate stochastic discount factor is:

\[ m_t = a + b R^m_t \]

The parameters \(a, b\) must satisfy the following equations:

\[\begin{align*} E[(a + b R_t^m) R^m_t] &= 1 \\ E[(a + b R_t^m) R^f_t] &= 1 \end{align*}\]

34.7.4. Exercise 4

Using the equations above, find a system of two linear equations that you can solve for \(a\) and \(b\) as functions of the parameters \((\lambda, \xi, E[R^f])\).

Write a function that can solve these equations.

Please check the condition number of the key matrix that must be inverted to determine \(a\) and \(b\).

34.7.5. Exercise 5

Using the estimates of the parameters that you generated above, compute the implied stochastic discount factor.

34.8. Solutions

34.8.1. Solution to Exercise 1

To verify that it is a regression equation we must show that the residual is orthogonal to the regressor.

Our assumptions about mutual orthogonality imply that

\[ E\left[\epsilon_{i,t}\right]=0,\quad E\left[\epsilon_{i,t}u_{t}\right]=0 \]

It follows that

\[\begin{split} \begin{aligned} E\left[\sigma_{i}\epsilon_{i,t}\left(R_{t}^{m}-R^{f}\right)\right]&=E\left[\sigma_{i}\epsilon_{i,t}\left(\xi+\lambda u_{t}\right)\right] \\ &=\sigma_{i}\xi E\left[\epsilon_{i,t}\right]+\sigma_{i}\lambda E\left[\epsilon_{i,t}u_{t}\right] \\ &=0 \end{aligned} \end{split}\]

34.8.2. Solution to Exercise 2

The regression coefficient \(\beta_{i, R^m}\) is

\[ \beta_{i,R^{m}}=\frac{Cov\left(R_{t}^{i}-R^{f},R_{t}^{m}-R^{f}\right)}{Var\left(R_{t}^{m}-R^{f}\right)} \]
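The formula can be compared with a least squares fit on simulated data. The sketch below assumes parameter values in the spirit of Exercise 3 for a single asset and a constant risk-free rate; it computes the beta from sample covariances, computes the no-intercept least squares coefficient that matches the regression equation in Exercise 1, and evaluates the sample analogue of the orthogonality condition from the solution to Exercise 1.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100_000

# Assumed values in the spirit of Exercise 3 (one asset, constant R^f)
Rf, ξ, λ, β, σ = 0.02, 0.06, 0.08, 0.6, 0.04

excess_Rm = ξ + λ * rng.standard_normal(T)
excess_Ri = β * excess_Rm + σ * rng.standard_normal(T)

# Population formula applied to sample moments
β_cov = np.cov(excess_Ri, excess_Rm)[0, 1] / np.var(excess_Rm, ddof=1)

# Least squares through the origin, matching the no-intercept equation
β_origin = (excess_Rm @ excess_Ri) / (excess_Rm @ excess_Rm)

# Sample analogue of the orthogonality between residual and regressor
resid = excess_Ri - β_origin * excess_Rm
print(β_cov, β_origin, np.mean(resid * excess_Rm))
```

The two estimates differ slightly in finite samples because the covariance formula implicitly allows an intercept, while the population intercept in this model is zero.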

34.8.3. Solution to Exercise 3

Direct Problem:

# Code for the direct problem

# assign the parameter values
ERf = 0.02
σf = 0.00  # std. dev. of the risk-free return; change this value to experiment
ξ = 0.06
λ = 0.08
βi = np.array([0.2, .4, .6, .8, 1.0])
σi = np.array([0.04, 0.04, 0.04, 0.04, 0.04])
# in this cell we set the number of assets and number of observations
# we first set T to a large number to verify our computation results
T = 2000
N = 5
# simulate i.i.d. random shocks
e = np.random.normal(size=T)
u = np.random.normal(size=T)
ϵ = np.random.normal(size=(N, T))
# simulate the return on a risk-free asset
Rf = ERf + σf * e

# simulate the return on the market portfolio
excess_Rm = ξ + λ * u
Rm = Rf + excess_Rm

# simulate the return on asset i
Ri = np.empty((N, T))
for i in range(N):
    Ri[i, :] = Rf + βi[i] * excess_Rm + σi[i] * ϵ[i, :]

Now that we have a panel of data, we’d like to solve the inverse problem by assuming the theory specified above and estimating the coefficients given above.

Inverse Problem:

We will solve the inverse problem by simple OLS regressions.

  1. estimate \(E\left[R^f\right]\) and \(\sigma_f\)

ERf_hat, σf_hat = simple_ols(np.ones(T), Rf)
ERf_hat, σf_hat
(0.02000000000000003, 3.123283175179055e-17)

Let’s compare these with the true population parameter values.

ERf, σf
(0.02, 0.0)

  2. estimate \(\xi\) and \(\lambda\)

ξ_hat, λ_hat = simple_ols(np.ones(T), Rm - Rf)
ξ_hat, λ_hat
(0.059287385314110964, 0.07971312581823618)
ξ, λ
(0.06, 0.08)

  3. estimate \(\beta_{i, R^m}\) and \(\sigma_i\)

βi_hat = np.empty(N)
σi_hat = np.empty(N)

for i in range(N):
    βi_hat[i], σi_hat[i] = simple_ols(Rm - Rf, Ri[i, :] - Rf)
βi_hat, σi_hat
(array([0.19498268, 0.40093838, 0.59399131, 0.80000741, 1.01294776]),
 array([0.04096742, 0.03973124, 0.03938268, 0.03988039, 0.03930377]))
βi, σi
(array([0.2, 0.4, 0.6, 0.8, 1. ]), array([0.04, 0.04, 0.04, 0.04, 0.04]))

Q: How close did your estimates come to the parameters we specified?

34.8.4. Solution to Exercise 4

Evaluating the two expectations gives the linear system

(34.8)\[\begin{align} a (E(R^f) + \xi) + b ((E(R^f) + \xi)^2 + \lambda^2 + \sigma_f^2) & =1 \\ a E(R^f) + b (E(R^f)^2 + \xi E(R^f) + \sigma_f ^ 2) & = 1 \end{align}\]

# Solve the linear system (34.8) for a and b
def solve_ab(ERf, σf, λ, ξ):

    M = np.empty((2, 2))
    M[0, 0] = ERf + ξ
    M[0, 1] = (ERf + ξ) ** 2 + λ ** 2 + σf ** 2
    M[1, 0] = ERf
    M[1, 1] = ERf ** 2 + ξ * ERf + σf ** 2

    a, b = np.linalg.solve(M, np.ones(2))
    condM = np.linalg.cond(M)

    return a, b, condM

Let’s solve for \(a\) and \(b\) using the actual model parameters.

a, b, condM = solve_ab(ERf, σf, λ, ξ)
a, b, condM
(87.49999999999999, -468.7499999999999, 54.406619883717504)
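One way to sanity-check these values is to simulate the market return and confirm that the candidate discount factor prices both the market portfolio and the risk-free asset. The sketch below is self-contained and assumes the solution's parameter values with \(\sigma_f = 0\); the simulation size and seed are arbitrary choices.

```python
import numpy as np

ERf, ξ, λ = 0.02, 0.06, 0.08     # parameter values from the solution code (σf = 0)

# Coefficients of the linear system for (a, b) when σf = 0
M = np.array([[ERf + ξ, (ERf + ξ)**2 + λ**2],
              [ERf, ERf**2 + ξ * ERf]])
a, b = np.linalg.solve(M, np.ones(2))

# Simulate the market return and form the candidate discount factor
rng = np.random.default_rng(0)
u = rng.standard_normal(1_000_000)
Rm = ERf + ξ + λ * u
m = a + b * Rm

print(np.mean(m * Rm), np.mean(m * ERf))   # both should be close to 1
print(np.mean(m < 0))                      # share of draws with a negative m
```

With these parameters \(m_t = 50 - 37.5 u_t\), so \(m_t\) is negative whenever \(u_t > 4/3\). This linear \(m\) therefore satisfies the two pricing equations without being nonnegative in every state.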

34.8.5. Solution to Exercise 5

Now let’s pass \(\hat{E}(R^f), \hat{\sigma}_f, \hat{\lambda}, \hat{\xi}\) to the function solve_ab.

a_hat, b_hat, condM_hat = solve_ab(ERf_hat, σf_hat, λ_hat, ξ_hat)
a_hat, b_hat, condM_hat
(86.98935163683406, -466.5225305424599, 53.873427459523164)