Saturday, September 26, 2015

Charting in R - Part I

The toughest step to doing things like quantitative analysis for trading and backtesting is to understand how to use the tools available in the market. The choices available are as follows:

  1. Excel
  2. R
  3. Python
  4.  C/C++

Excel allows you to see and manipulate data easily but you cannot see trend properly. For example, if you wish to chart data in Excel, this is what you will get:



This graph looks acceptable for simple presentation. But if you want to do analysis, you would want to be able to add a complex indicator easily. In Excel, you can only do that by manipulate all the data in excel and plot another graph on it. To do complex analysis of data it will be much better to do it with scripting API available in the market. If you know your stuff, you can do things pretty fast. But it takes a hell lot of effort to even get started. To illustrate the advantage of using API, the following is created using GOOGLE prices and R.



As someone who had tried both Python and R, my opinion is that the later is a better tool to do charting. So I documented some tips to get you started on R.

What you need to install
R(Here you will find the latest version but I am using version 2.15)
Quantmod (To quickly install, follow instruction here to download and install all that you need)

Once you are done, you can quickly load data with the following sample code (modified from code taken from here) and data.

Code:

library(quantmod)
require (zoo)
dataFile_strip2 <- "testinput.csv"
dataDir <- "C:/Users/Ken/Documents/R/CSV Data"
file2 <- paste(dataDir,"/",dataFile_strip2,sep="")
zz <- read.zoo(file2,header = TRUE,sep=",",index.column=1, format = "%d/%m/%Y %H:%M")
xx<- as.xts(zz)
chartSeries(xx)


testinput.csv:
DateTime,Close,High,Low,Open,Volume
12/08/2015 21:30:00,112.53,112.61,112.5,112.5,877727

Other Reference:
https://cran.r-project.org/web/packages/zoo/vignettes/zoo-read.pdf

Friday, April 3, 2015

Using C++ and VBA in Excel - Part II

One way of doing this is as stated in my previous post. But many beginner in C++ would be daunted by the amount of basics in Computing required to understand why it work. To simplify this process, we can use XLW to implement wrapper easily. The only requirement is to know basic C++ and a simple 3 step process.

Step 1
First, you will need Microsoft Visual Studio and install XLW onto your PC. Use Template Extractor:




Choose the Visual Studio version correctly. Note that if you are using Visual Studio 2012 you will need to choose VC10 instead of VC12.

Step 2
The template you just created is a skeleton VS project which allow you to develop XLL. Once you fire up template.sln, you will notice that the project is setup nicely for you to start development.

As an example, I did a test function that return the square of a number:


As usual, for every function you create in C++ you will need to declare it in the header. But this time, you will need to declare it onto cppinterface.h. This is the most important step you need to do in C++ and you are ready to compile the project.

Step 3
The compilation will create the XLL file and you will be able to find it in the output folder. In this case, I am using debug mode and "Template.xll" will therefore appear in the debug folder. Launch "Template.xll" and you should not encounter any error, if so then there must be an error in the compilation process. If there are no error then you can create a new worksheet and start using the "helloworld" function you just created.




Tuesday, October 7, 2014

Executing an executable from VBA

For people in the software industries such as consultants, developers and even managers, they will very often need to handle different versions of software. Many a times, these versions are stored in different places. This means we will need to memorize these directories and every time we need to access them, we will need to navigate to these directories. The best way to overcome these is to tabulate the versions in Excel. But it is insufficient as it we are not able to start the software from the tables in excel natively.

By using VBA you can do this with the "Shell" command. The problem is with this is that you can only execute files that has no dependency on the folder you are executing it from. This means the environment paths and parameters needs to be setup correctly in Windows.

To overcome this limitation, I combined what I posted previously to build a table having the ability to execute a software from Excel. Instead of building your own DLL, what you need is to use the existing DLL (Refer to here for more details on how this is done in VBA):

Public Declare Function ShellExecute Lib "Shell32.dll" Alias "ShellExecuteA" _
  (ByVal hwnd As Long, _
   ByVal lpOperation As String, _
   ByVal lpFile As String, _
   ByVal lpParameters As String, _
   ByVal lpDirectory As String, _
   ByVal nShowCmd As Long) As Long

By declaring this you are effectively using the "ShellExecute" library from Shell32.dll to execute lpFile from lpDirectory.This will allow user to have more flexibility in building UI that will execute files upon clicking in VBA. An example of using the above is as follows:


RetVal = ShellExecute(0, "open", "C:\folder\client.cmd", "<arguments>", _
                        "C:\folder\", SW_SHOWMAXIMIZED)

Saturday, April 26, 2014

SABR Model - Introduction

The famous option pricing Black-Scholes framework assumes that the asset price follows the stochastic process:

$dS_{t} = rd_t+{\sigma}dw_t$
where $w_t$ is a Brownian motion and $r$ is the risk free rate.

This simply means the dynamics of asset price is a random variable following a lognormal distribution with constant volatility of ${\sigma}$. This evaluates to the option pricing formula:

$C({\sigma}, r, S_t, K) = S_t{N(d_1)} - KN(d_2)$
where $d_1 = \frac{(ln(\frac{S_t}{K})+\frac{1}{2}(r+\sigma^2)t}{\sigma\sqrt{t}}$ and $d_2 = d_1-\sigma\sqrt{t}$

However, in practice, the option price implied by the market indicates that there is different volatility for different strikes. Furthermore, stock prices has distribution with fat tail and has higher peak relative to Normal distribution. This implies that there is a mixture in distributions with different variances. This is the motivation for developing volatility models such as Local Volatility, Heston Model, SABR Model and etc. In this post, the SABR model will be introduced briefly. The SABR model is a stochastic volatility model developed by Patrick Hagan et al. It has the following dynamics:

$dF_t = F_t^{\beta} {\sigma}_t dW_t$ , $0≤{\beta}≤1$
$d{\sigma}_t = v{\sigma}_t dB_t$ , $v>0$
$d{<}W,B{>}_t = {\rho}dt$ , ${\rho}ϵ[-1,1]$
${\sigma}_0 = {\alpha} >0$

$where$
${\beta}$ : Describes the forward rate model.
$v$ : Lognormal volatility of volatility
${\rho}$ : Correlation between Forward rate and Volatility
${\alpha}$ : ATM volatility

This dynamic is clearly different from the Black Scholes framework and the market is quoted in Black's volatility for instruments such as cap/floor, swaptions and etc. What Hagan et al. did was to use this dynamics to price a European option. In the process of this evaluation, it is found that with some estimation techniques, the pricing using SABR dynamics is close to the normal model. From the normal model, it is then further derived that the pricing using SABR will approximate to Black's lognormal model.

The formula for the implied Black's volatility is as follows:

${\sigma}_{implied} (K,f)=\frac{\alpha}{((fK)^(\frac{(1-{\beta})}{2}) (1+\frac{(1-{\beta})^2}{24} log^2(\frac{f}{K})+\frac{(1-{\beta})^4}{1920} log^4((f/K)+⋯ ) ) )} \frac{z}{X(z)})$
$(1+\frac{(1-{\beta})^2}{24} \frac{{\alpha}^2}{(fK)^(1-{\beta})} + \frac{1}{4} \frac{{\alpha}{\beta}v{\rho}}{(fK)^\frac{1-{\beta}}{2}} +\frac{2-3{\rho}^2 }{24} v^2 ) t_ex+⋯)$
$where$
$z=\frac{v}{{\alpha}} (fK)^(\frac{1-{\beta}}{2})log(\frac{f}{K})$
$X(z)=log(\frac{\sqrt{1-2{\rho}z+z^2 }+z-{\rho})}{(1-{\rho})})$
This series of proving shows that Black's volatility quoted in the market can be modeled with SABR model. All that needs to be done is to calibrate the SABR parameters such that the volatility matches market quotes.

Saturday, December 28, 2013

Natural Cubic Spline interpolation (Application)

It has been a long journey from Theory to Derivation and now we are onto application. With all the theories and equations we obtained, we will now apply them and conduct interpolation.
For (n+1) points we now have 3 generic equations
$\frac{3(y_{i+2}-y_{i+1})}{(x_{i+2}-x_{i+1})^2} +\frac{3(y_{i+1}-y_i)}{(x_{i+1}-x_i)^2} = \frac{k_i}{x_{i+1}-x_i }+k_{i+1}[\frac{2}{x_{i+1}-x_i} +\frac{2}{x_{i+2}-x_{i+1}}]+\frac{k_{i+2}}{x_{i+2}-x_{i+1}}$           —(1)
$3\frac{y_{n+1}-y_n}{(x_{n+1}-x_n)^2}= \frac{2k_{n+1}}{x_{n+1}-x_n}+ \frac{k_n}{(x_{n+1}-x_n)}$ —(2)
$3\frac{y_1-y_0}{(x_1-x_0)^2} = \frac{2k_0}{x_1-x_0}+\frac{k_1}{x_1-x_0}$  —(3)
For i=[0,n-2]

The unknow here is k which is $f_i'(x_i)=c_i$, we have also seen that the rest of $a_i,b_i, d_i$ are all known as follows:
$a_i=\frac{1}{3}[\frac{(k_{i+1}-k_i)}{(x_{i+1}-x_i)^2}-\frac{2b_i}{(x_{i+1}-x_i )}]$ ---(4)
$b_i= \frac{3(y_{i+1}-y_i )}{(x_{i+1}-x_i)^2} - \frac{k_{i+1}+2k_i}{(x_{i+1}-x_i)}$    ---(5)
$c_i=k_i$    ---(6)
$d_i = y_i$    ---(7)

What is left now is to find k and it can be found by applying equation (1) - (3) and put them into matrix form and solve k using simple linear algebra.

$Wk = F$
$k_{(n+1)x1}=[k_0,k_1,...,k_n]^T$
$F_{(n+1)x1} =[ f_0,f_1,...,f_n]^T$
Where $f_i$ are the Left Hand sides of (1)-(3) and W is the (n+1)x(n+1) matrix formed by the Right Hand side of (1)-(3).

By obtaining the inverse of W,

$k = W^{-1} F$

we can solve for k.

To do interpolation for a value x, follow the following algorithm:

Step 1:
Find the interval $[x_i,x_{i+1}]$ by which x belongs.

Step 2:
Apply (4)-(7) to obtain the cubic equation coefficient.

Step 3:
Plug in the coefficients into the equation:
$y(x) = a_i(x-x_i)^3 + b_i(x-x_i)^2+c_i(x-x_i)+d_i$

Done! Now you got an estimated value of y(x)!

Tuesday, December 24, 2013

Natural Cubic Spline interpolation (Derivation)

Recap that in our previous post, we introduce how cubic spline interpolation can be applied to a n+1 points to produce a smooth curve. There will be a total of n curves each described by:
$f_i (x) = a_i(x-x_i)^3 + b_i(x-x_i)^2 + c_i (x-x_i) + d_i$

The conditions for a natural cubic spline is given by the following:

$f_i (x_i) = a_i(x-x_i)^3 + b_i(x-x_i)^2 + c_i (x-x_i)+d_i = y_i$    (1)
$f_i (x_i) = f_{i+1}(x_i)$                                                                         (2)
$f_i '(x_i) = f_{i+1}'(x_i)=c_i$                                                                (3)
$f_i ''(x_{i+1}) = f_{i+1}''(x_{i+1})=2b_i$                                                      (4)
$f_0''(x_0) = 0$                                                                                   (5)
$f_{n-1}''(x_{n-1}) = 0$                                                                            (6)

From here we will derive the solution to conduct cubic spline interpolation. The end results will be as specified in Wikipedia. The idea is the same as when you try to solve simultaneous equations and place them into systems of linear equations and solve them using simple linear algebra.

From (3) we are going to express everything in $f_i '(x_i) = f_{i+1} '(x_i) = k_{i+1}$. 
First we obtain $a_i$ in terms of $b_i$ and $k_i$

$f_{i+1}'(x_i)=c_i$     
$c_{i+1}-c_i= 3a_i (x_(i+1)-x_i )^2+2b_i (x_(i+1)-x_i )$ 
$a_i=\frac{1}{3}[\frac{(k_{i+1}-k_i)}{(x_{i+1}-x_i)^2}-\frac{2b_i}{(x_{i+1}-x_i )}]$ (7)

From (4) we will get,
$f_{i+1}''(x_{i+1})=2b_{i+1}=6a_i(x_{i+1}-x_i)+2b_i$
$b_{i+1}-b_i= 3a_i (x_{i+1}-x_i )$                                                                             (8)

Next we will substitute (7) into (8) 
$b_{i+1}-b_i= 3*\frac{1}{3}[ \frac{k_(i+1)-k_i}{(x_{i+1}-x_i)^2} - \frac{2b_i}{(x_{i+1}-x_i )}]*(x_{i+1}-x_i ) $
$b_{i+1}-b_i= \frac{(k_{i+1}-k_i)}{(x_{i+1}-x_i )}-2b_i $
$b_{i+1}+b_i= \frac{(k_{i+1}-k_i)}{(x_{i+1}-x_i )}$                                                                    — (9)

Equation (9) will be used in one of the final substitution so let's just hang on to equation (9) first and we will come back to this later. Now we try to substitute (1) and (7) into (2) to express $b_i$ in terms of $k_i$:

$y_{i+1}= a_i (x_{i+1}-x_i )^3+b_i (x_{i+1}-x_i)^2+c_i (x_{i+1}-x_i )+d_i$

$y_{i+1}-y_i= \frac{1}{3}[ \frac{(k_{i+1}-k_i)}{(x_{i+1}-x_i)^2} -\frac{2b_i}{(x_{i+1}-x_i )}](x_{i+1}-x_i )^3+b_i (x_{i+1}-x_i)^2+k_i (x_{i+1}-x_i )$

$\frac{(y_{i+1}-y_i)}{(x_{i+1}-x_i)^2} = \frac{1}{3}[\frac{(k_{i+1}-k_i)}{(x_{i+1}-x_i}-2b_i ]+b_i+\frac{k_i}{(x_{i+1}-x_i )}$

$\frac{3(y_{i+1}-y_i )}{(x_{i+1}-x_i)^2} = [\frac{(k_{i+1}-k_i)}{(x_{i+1}-x_i)}-2b_i ]+3b_i+\frac{3k_i}{x_{i+1}-x_i } $

$b_i= \frac{3(y_{i+1}-y_i )}{(x_{i+1}-x_i)^2} -  \frac{(k_{i+1}-k_i)}{x_{i+1}-x_i}-\frac{3k_i}{(x_{i+1}-x_i )}$
$b_i= \frac{3(y_{i+1}-y_i )}{(x_{i+1}-x_i)^2} - \frac{k_{i+1}+2k_i}{(x_{i+1}-x_i)}$  —(10)

Finally substitute (10) into (9) to get a equation with only ks, ys and xs.
$\frac{3(y_{i+2}-y_{i+1} )}{(x_{i+2}-x_{i+1})^2} -  \frac{(k_{i+2}+2k_{i+1})}{(x_{i+2}-x_{i+1})}+\frac{3(y_{i+1}-y_i )}{(x_{i+1}-x_i)^2} - \frac{(k_{i+1}+2k_i)}{(x_{i+1}-x_i)}= \frac{k_{i+1}-k_i}{x_{i+1}-x_i}$

$\frac{3(y_{i+2}-y_{i+1})}{(x_{i+2}-x_{i+1})^2} +\frac{3(y_{i+1}-y_i )}{(x_{i+1}-x_i)^2} = \frac{k_{i+1}-k_i}{x_{i+1}-x_i}+ \frac{k_{i+2}+2k_{i+1}}{x_{i+2}-x_{i+1}}+ \frac{k_{i+1}+2k_i}{x_{i+1}-x_i}$

$\frac{3(y_{i+2}-y_{i+1})}{(x_{i+2}-x_{i+1})^2} +\frac{3(y_{i+1}-y_i}{(x_{i+1}-x_i)^2} = \frac{k_i}{x_{i+1}-x_i }+k_{i+1}[\frac{2}{x_{i+1}-x_i} +\frac{2}{x_{i+2}-x_{i+1}}]+\frac{k_{i+2}}{x_{i+2}-x_{i+1}}$ —(11)

With this equation we have exactly (n-1) such equations for i=[0,n-2] and we have (n+1) unknown. By using the 2 other equation (5) and (6), we will obtain the other 2 equations.

Therefore, subsitute i=n and (10 )into (4) 
$0= 6a_n (x_{n+1}-x_n )+2b_n$
$0= 2[\frac{(k_{n+1}-k_n)}{(x_{n+1}-x_n)^2} -\frac{2b_n}{x_{n+1}-x_n }](x_{n+1}-x_n )+2b_n$
$b_n   =\frac{k_{n+1}-k_n}{x_{n+1}-x_n }$
$3\frac{y_{n+1}-y_n }{(x_{n+1}-x_n)^2} = \frac{k_{n+1}-k_n}{x_{n+1}-x_n }+
\frac{k_{n+1}+2k_n}{x_{n+1}-x_n}$
$3\frac{y_{n+1}-y_n}{(x_{n+1}-x_n)^2}= \frac{2k_{n+1}}{x_{n+1}-x_n}+ \frac{k_n}{(x_{n+1}-x_n)}$ —(12)

Substitute i=0 into (10),
$b_0= \frac{3(y_1-y_0 )}{(x_1-x_0)^2} - \frac{k_1+2k_0}{(x_i-x_0)}=0$
$3\frac{y_1-y_0}{(x_1-x_0)^2} = \frac{k_1+2k_0}{x_1-x_0}$
$3\frac{y_1-y_0}{(x_1-x_0)^2} = \frac{2k_0}{x_1-x_0}+\frac{k_1}{x_1-x_0}$  —(13)


Equations 11-13 is the exact equations you will see in Wikipedia. Now the obvious question is "How are we going to use all these equations we derived to do Cubic Spline Interpolation?". This shall be reviewed in the upcoming post.

Friday, December 20, 2013

Natural Cubic Spline interpolation (Theory)

I believe many would be very familiar with linear interpolation where we fix 2 points and we draw a line in between these 2 point. So the points in between are defined by a linear equation. Not many would come across and use cubic interpolation. Here is a quick guide on how it works.

In cubic spline, the idea is the same, we have 2 points and we want to find a cubic equation to describe the points in between such that we can use it to estimate the intermediate values. Now let us consider 4 points as shown below.

In this case, we have 4 points and we will need to model 3 equations; if we have n+1 points we will need n such equations  Each of these equation will look like this:

$f_i (x) = a_i (x-x_i)^3 + b_i (x-x_i)^2 + c_i (x-x_i) + d_i$

Thus there are 4n unknown we need to find. Here are the 4n equations that will help us determine these unknowns:

We know that at each equation has to fix the points hence we have the following:

$f_i (x_i) = a_i (x-x_i)_i^3 + b_i(x-x_i)_i^2 + c_i (x-x_i)_i + d_i$
where i = [0,...,n-1]

This implies that we have, 

$d_i = y_i$           ---(1a)
$f_0 (x_0) = y_0$    ---(1b)

This will give us n+1 equations.

To ensure that the equations are continuous/connecting, we need to ensure the end of the one equation will match the start of the next equation. Hence,

$f_i (x_i) = f_{i+1}(x_i)$ ; ---(2)
This will give us n-1 equations.

We also would want the slope, first derivative,  at the connecting points to be continuous or the same hence,

$f_i '(x_i) = f_{i+1}'(x_i)=c_i$ ---(3)
where i = [1,...,n-1]

This will give us n-1 equations.

We also want the second derivative  to be continuous or the same at connecting points, hence,

$f_i''(x_i) = 2b_i$
$f_i ''(x_i) = f_{i+1}''(x_i)=2b_i$ -- (4)

This will give us another n-1 equations.

Hence, now we have in total 4n-2 equations and we still need 2 more.

There are a few types of cubic spline and one of the most commonly seen is the natural cubic spline. For this we specify the final 2 equations:

$f_0''(x_0) = 0$ -- (5)
$f_{n-1}''(x_{n-1}) = 0$ -- (6)

Now we have 4n equations and we can go ahead to solve those variables. With those variables,we can now do interpolation with the equations:

$f_i (x) = a_i (x-x_i)^3 + b_i(x-x_i)^2 + c_i (x-x_i) + d_i$

Next, hang on to this theory first while I place all these into a system of linear equations.
And solve them using Matrix Algebra.