VolGAN

Deep Learning

March 2025

Abstract

(Paper Notes)

Notes on 'VolGAN: a generative model for arbitrage-free implied volatility surfaces'

Quick TLDR.

VolGAN is a GAN whose generator is trained to optimize an objective function that includes regularization terms along the moneyness ($m$) dimension and the time-to-maturity ($\tau$) dimension, to avoid arbitrage when generating the implied volatility surface.

Its discriminator, as in a conditional GAN, takes in the output of the generator together with a conditioning set of past data (the previous implied volatility surface), and the two networks play the usual minimax game in which the generator is trained to output realistic synthetic scenarios relative to the past implied volatility surface.

The generator's outputs are then reweighted as a means to reduce arbitrage, pushing the sampled scenarios toward arbitrage-free surfaces.

Abstract

VolGAN is a generative adversarial network for generating synthetic arbitrage-free implied volatility surfaces (IVS).

Here, arbitrage refers to situations where traders can exploit inconsistencies in the IVS; arbitrage-free indicates a smooth IVS with no such illogical pricing errors.

VolGAN is trained on a time series of implied volatility surfaces and underlying prices to generate a realistic IVS together with the return of the underlying asset.

It accounts not only for scenarios based on Gaussian dynamics, but also for abrupt, non-Gaussian changes in the IVS, in a logical, arbitrage-free manner.

A generative model for implied volatility surfaces.

As input, VolGAN receives (1) the implied volatility surface at the previous date, (2) the two previous underlying returns, and (3) the realized volatility over the previous period; it outputs (1) the return of the underlying asset and (2) the implied volatility surface.

The architecture is a conditional GAN, composed of a generator and a discriminator (as is typical of GANs).

The inputs to the GAN are,

$$
g_t(m, \tau) = \log \sigma_t(m, \tau) \\[3mm]
r_t = \log\left(\frac{S_{t+1}}{S_t}\right) \\[3mm]
\gamma_t = \sqrt{\frac{21}{252} \sum_{i = 0}^{20} r^2_{t - i}}
$$

where $g_t(\cdot)$ is the implied volatility at $t$ in log-space, $r_t$ is the log return at $t$, and $\gamma_t$ is the historical volatility over a one-month trading window (21 out of the 252 trading days in a year).

These are aggregated into a condition vector $a_t$,

$$
a_t = (r_{t-1}, r_{t-2}, \gamma_{t-1}, g_t(m, \tau))
$$
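A minimal numpy sketch of how these inputs might be assembled from a daily price series and a stored IV-surface grid; the array names, shapes, and the helper `build_inputs` are illustrative assumptions, not from the paper.

```python
import numpy as np

def build_inputs(prices, iv_surface, t):
    """Assemble (a_t, g_t) for day t.

    prices:     1-D array of daily underlying closes S_0, ..., S_T
    iv_surface: array of shape (T, N_m, N_tau) of implied vols on a fixed grid
    """
    # log returns, log_ret[s] = r_s = log(S_{s+1} / S_s)
    log_ret = np.diff(np.log(prices))
    r_prev1, r_prev2 = log_ret[t - 1], log_ret[t - 2]

    # one-month (21-day) historical volatility gamma_{t-1}
    window = log_ret[t - 21:t]                    # r_{t-21}, ..., r_{t-1}
    gamma_prev = np.sqrt(21.0 / 252.0 * np.sum(window ** 2))

    # log implied volatility surface g_t(m, tau)
    g_t = np.log(iv_surface[t])                   # shape (N_m, N_tau)

    # condition vector a_t = (r_{t-1}, r_{t-2}, gamma_{t-1}, g_t(m, tau))
    a_t = np.concatenate([[r_prev1, r_prev2, gamma_prev], g_t.ravel()])
    return a_t, g_t
```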

The generator takes in $a_t$ together with i.i.d. noise $z_t \sim \mathcal{N}(0, I_d)$, where $I_d$ is the identity matrix (serving as the covariance), and outputs a synthetic return and a synthetic log-space volatility increment,

$$
G(a_t, z_t) = \big(\hat{r}_t(z_t),\; \Delta \hat{g}_t(m, \tau)(z_t)\big)
$$

where $\hat{r}_t(z_t)$ is the synthetic predicted return and $\Delta \hat{g}_t(m, \tau)(z_t)$ is the synthetic increment of the IVS, both in log-space.
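Since both outputs live in log-space, the simulated surface for the next date can be recovered by adding the increment to the current log-surface and exponentiating (this is the same combination $g_t + \Delta \hat{g}_t$ that appears in the training objective below):

$$
\hat{\sigma}(m, \tau) = \exp\big(g_t(m, \tau) + \Delta \hat{g}_t(m, \tau)(z_t)\big)
$$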

The discriminator $D(\cdot)$ is a classifier which takes an input pair $(r, \Delta g)$, either the output of the generator or the ground-truth realization from the data, together with the condition vector $a_t$ defined previously.

$D(a_t, (r, \Delta g))$ outputs the probability that the input $(r, \Delta g)$ is drawn from the distribution of $(r_t, \Delta g_t)$ given $a_t$.

In essence, the discriminator checks whether the generator's output could plausibly have evolved from the prior volatility surface.

Both $G$ and $D$ are defined as feed-forward neural networks with their own respective parameters ($\theta_g$ and $\theta_d$).

$G$ is a two-layer neural network with a first hidden layer in $\mathbb{R}^{H}$, a second hidden layer in $\mathbb{R}^{2H}$, and a final dense layer of size $\mathbb{R}^{N_t \times N_k}$ outputting the log-implied volatility surface increment together with the simulated return of the underlying asset.

The discriminator takes the simulated return and surface increment together with $a_t$, passes them through a hidden layer of size $H$, and outputs a single probability value, denoting the probability that the input was sampled from the data rather than generated.
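A minimal PyTorch sketch of the two networks with the layer sizes described above (hidden width $H$, a surface grid flattened to $N_m \cdot N_\tau$ values plus one return); the class names, activations, and exact dimensions are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps (a_t, z_t) to a simulated return and a log-IV surface increment."""
    def __init__(self, cond_dim, noise_dim, H, n_m, n_tau):
        super().__init__()
        self.n_m, self.n_tau = n_m, n_tau
        self.net = nn.Sequential(
            nn.Linear(cond_dim + noise_dim, H), nn.ReLU(),
            nn.Linear(H, 2 * H), nn.ReLU(),
            nn.Linear(2 * H, 1 + n_m * n_tau),    # [r_hat, delta_g_hat]
        )

    def forward(self, a, z):
        out = self.net(torch.cat([a, z], dim=-1))
        r_hat = out[..., :1]
        delta_g_hat = out[..., 1:].reshape(*out.shape[:-1], self.n_m, self.n_tau)
        return r_hat, delta_g_hat

class Discriminator(nn.Module):
    """Scores how plausible the pair (r, delta_g) is given the condition a_t."""
    def __init__(self, cond_dim, H, n_m, n_tau):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(cond_dim + 1 + n_m * n_tau, H), nn.ReLU(),
            nn.Linear(H, 1), nn.Sigmoid(),        # probability in (0, 1)
        )

    def forward(self, a, r, delta_g):
        x = torch.cat([a, r, delta_g.flatten(start_dim=-2)], dim=-1)
        return self.net(x)
```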

Training Objective

The generator's loss function combines the adversarial term with smoothness penalties along both dimensions of the surface:

$$
L_m(g) = \sum_{i,j} \frac{\big(g(m_{i+1}, \tau_j) - g(m_i, \tau_j)\big)^2}{|m_{i+1} - m_i|^2} \approx \|\partial_m g\|_{L^2}^2
$$

$$
L_\tau(g) = \sum_{i,j} \frac{\big(g(m_i, \tau_{j+1}) - g(m_i, \tau_j)\big)^2}{|\tau_{j+1} - \tau_j|^2} \approx \|\partial_\tau g\|_{L^2}^2
$$

$$
J^{(G)}(\theta_d, \theta_g) = \mathbb{E}\Big[ -\tfrac{1}{2}\log D\big(a_t, G(a_t, z_t; \theta_g); \theta_d\big) + \alpha_m L_m\big(g_t(m, \tau) + G(a_t, z_t; \theta_g)|_{2:}\big) + \beta_\tau L_\tau\big(g_t(m, \tau) + G(a_t, z_t; \theta_g)|_{2:}\big) \Big]
$$

where $L_m(g)$ penalizes roughness along the moneyness dimension, $L_\tau(g)$ penalizes roughness along the time-to-maturity dimension, and $G(\cdot)|_{2:}$ denotes the surface-increment components of the generator's output (i.e. everything except the simulated return).
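A sketch of the finite-difference smoothness penalties and the resulting generator loss, assuming the moneyness and maturity grid points are given as 1-D tensors `m` and `tau` and that `D` is the discriminator module from the sketch above; all names and the exact reduction over the batch are illustrative assumptions.

```python
import torch

def smoothness_penalties(g, m, tau):
    """Discrete L_m and L_tau for a batch of log-IV surfaces g of shape
    (batch, N_m, N_tau); m and tau are the 1-D grid coordinates."""
    dm = (m[1:] - m[:-1]).view(1, -1, 1)           # moneyness grid spacing
    dtau = (tau[1:] - tau[:-1]).view(1, 1, -1)     # maturity grid spacing
    L_m = (((g[:, 1:, :] - g[:, :-1, :]) / dm) ** 2).sum(dim=(1, 2))
    L_tau = (((g[:, :, 1:] - g[:, :, :-1]) / dtau) ** 2).sum(dim=(1, 2))
    return L_m, L_tau

def generator_loss(D, a, g_t, r_hat, delta_g_hat, m, tau, alpha_m, beta_tau):
    """J^(G): adversarial term plus smoothness penalties on the simulated surface."""
    g_next = g_t + delta_g_hat                     # simulated log-IV surface
    L_m, L_tau = smoothness_penalties(g_next, m, tau)
    log_d = torch.log(D(a, r_hat, delta_g_hat) + 1e-12).squeeze(-1)
    return (-0.5 * log_d + alpha_m * L_m + beta_tau * L_tau).mean()
```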

The discriminator simply minimizes the binary cross-entropy loss.
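A short sketch of that discriminator objective, labelling real pairs 1 and generated pairs 0 (assuming the `Discriminator` module above; this is the standard conditional-GAN setup, not code from the paper).

```python
import torch
import torch.nn.functional as F

def discriminator_loss(D, a, r_real, dg_real, r_fake, dg_fake):
    """Binary cross-entropy: real pairs labelled 1, generated pairs labelled 0."""
    p_real = D(a, r_real, dg_real)
    p_fake = D(a, r_fake.detach(), dg_fake.detach())   # don't backprop into G
    loss_real = F.binary_cross_entropy(p_real, torch.ones_like(p_real))
    loss_fake = F.binary_cross_entropy(p_fake, torch.zeros_like(p_fake))
    return 0.5 * (loss_real + loss_fake)
```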

Scenario re-weighting

Assume $\mathbb{P}_0$ is the joint probability distribution of the generator's output $(r_t, \sigma(m, \tau))$.

To remove opportunities for arbitrage, the authors reweight $\mathbb{P}_0$ by applying a softmax weighting, driven by an arbitrage-penalty function $\Phi$, to the generated samples:

$$
w_i = \frac{\exp(-\beta \Phi(\hat{\sigma}_i))}{\sum_{j = 1}^N \exp(-\beta \Phi(\hat{\sigma}_j))}
$$

Each weighted scenario is then used to compute expectations of various quantities of interest under the reweighted measure $\mathbb{P}_{\beta}$.

For example, if $X$ is the return of the option, the expected return over $N$ sampled scenarios is

$$
\mathbb{E}_{\beta}[X] = \sum_{i = 1}^N w_i x_i
$$

and similarly for any other definition of $X$.
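A sketch of the reweighting step and the reweighted expectation: it assumes an arbitrage-penalty value $\Phi(\hat{\sigma}_i)$ has already been computed for each generated surface (how $\Phi$ is defined is specific to the paper and not reproduced here), and the arrays below are hypothetical.

```python
import numpy as np

def reweight(phi, beta):
    """Softmax weights w_i = exp(-beta * Phi_i) / sum_j exp(-beta * Phi_j)."""
    logits = -beta * phi
    logits -= logits.max()                      # numerical stability
    w = np.exp(logits)
    return w / w.sum()

# usage: expectation of a scenario-dependent quantity X under P_beta
phi = np.array([0.02, 0.50, 0.10, 0.00])        # penalties Phi(sigma_i) per scenario
x = np.array([0.01, -0.03, 0.02, 0.015])        # scenario values of X
w = reweight(phi, beta=10.0)
expected_x = float(w @ x)                       # E_beta[X] = sum_i w_i x_i
```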