Archive for the ‘NAG Library’ Category

November 17th, 2014

Given a symmetric matrix such as

    \light \[ \left( \begin{array}{ccc} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{array} \right)\]

What’s the nearest correlation matrix? A 2002 paper by Manchester University’s Nick Higham which answered this question has turned out to be rather popular! At the time of writing, Google tells me that it’s been cited 394 times.

Last year, Nick wrote a blog post about the algorithm he used and included some MATLAB code. He also included links to applications of this algorithm and implementations of various NCM algorithms in languages such as MATLAB, R and SAS as well as details of the superb commercial implementation by The Numerical algorithms group.

I noticed that there was no Python implementation of Nick’s code so I ported it myself.

Here’s an example IPython session using the module

In [1]: from nearest_correlation import nearcorr

In [2]: import numpy as np

In [3]: A = np.array([[2, -1, 0, 0], 
   ...:               [-1, 2, -1, 0],
   ...:               [0, -1, 2, -1], 
   ...:               [0, 0, -1, 2]])

In [4]: X = nearcorr(A)

In [5]: X
array([[ 1.        , -0.8084125 ,  0.1915875 ,  0.10677505],
       [-0.8084125 ,  1.        , -0.65623269,  0.1915875 ],
       [ 0.1915875 , -0.65623269,  1.        , -0.8084125 ],
       [ 0.10677505,  0.1915875 , -0.8084125 ,  1.        ]])

This module is in the early stages and there is a lot of work to be done. For example, I’d like to include a lot more examples in the test suite, add support for the commercial routines from NAG and implement other algorithms such as the one by Qi and Sun among other things.

Hopefully, however, it is just good enough to be useful to someone. Help yourself and let me know if there are any problems. Thanks to Vedran Sego for many useful comments and suggestions.

May 15th, 2014

I’ve been a user and supporter of the commercial numerical libraries from NAG for several years now, using them in MATLAB, Fortran, C and Python among others.  They recently updated their C library to version 24 which now includes 1516 functions apparently.

NAG’s routines are fast, accurate, extremely well supported and often based on cutting edge numerical research (I know this because academics at my University, The University of Manchester, is responsible for some of said research). I often use functions from the NAG C library in MATLAB mex routines in order to help speed up researcher’s code.

Here’s some of the new functionality in Mark 24.  A full list of functions is available at

• Hypergeometric function (1f1 and 2f1)
• Nearest Correlation Matrix
• Elementwise weighted nearest correlation matrix
• Wavelet Transforms & FFTs
    Three dimensional discrete single level and multi-level wavelet transforms
    Fast Fourier Transforms (FFTs)  for two dimensional and three dimensional real data
• Matrix Functions
  Matrix square roots and general powers
  Matrix exponentials (Schur-Parlett)
  Fréchet Derivative
  Calculation of condition numbers
• Interpolation
— Interpolation for 5D and higher dimensions
• Optimization
 — Local optimization: Non-negative least squares
  Global optimization: Multi-start versions of general nonlinear programming and least squares routines
• Random Number Generatorss
— Brownian bridge and random fields
• Statistics
— Gaussian mixture model
— Best subsets of given size (branch and bound )
— Vectorized probabilities and probability density functions of distributions
— Inhomogeneous time series analysis, moving averages
— Routines that combines two sums of squares matrices to allow large datasets to be summarised

• Data fitting
 Fit of 2D scattered data by two-stage approximation (suitable for large datasets)
• Quadrature
  — 1D adaptive for badly-behaved integrals
• Sparse eigenproblem
  — Driver for real general matrix, driver for banded complex eigenproblem
  — Real and complex quadratic eigenvalue problems

• Sparse linear systems
   — block diagonal pre-conditioners and solvers
• ODE solvers
   — Threadsafe initial value ODE solvers
• Volatility
   — Heston model with term structure

December 17th, 2013

In a recent Stack Overflow query, someone asked if you could switch off the balancing step when calculating eigenvalues in Python. In the document A case where balancing is harmful, David S. Watkins describes the balancing step as ‘the input matrix A is replaced by a rescaled matrix A* = D-1AD, where D is a diagonal matrix chosen so that, for each i, the ith row and the ith column of A* have roughly the same norm.’  

Such balancing is usually very useful and so is performed by default by software such as  MATLAB or Numpy.  There are times, however, when one would like to switch it off.

In MATLAB, this is easy and the following is taken from the online MATLAB documentation

A = [ 3.0     -2.0      -0.9     2*eps;
     -2.0      4.0       1.0    -eps;
     -eps/4    eps/2    -1.0     0;
     -0.5     -0.5       0.1     1.0];

[VN,DN] = eig(A,'nobalance')
VN =

    0.6153   -0.4176   -0.0000   -0.1528
   -0.7881   -0.3261         0    0.1345
   -0.0000   -0.0000   -0.0000   -0.9781
    0.0189    0.8481   -1.0000    0.0443

DN =
    5.5616         0         0         0
         0    1.4384         0         0
         0         0    1.0000         0
         0         0         0   -1.0000

At the time of writing, it is not possible to directly do this in Numpy (as far as I know at least). Numpy’s eig command currently uses the LAPACK routine DGEEV to do the heavy lifting for double precision matrices.  We can see this by looking at the source code of numpy.linalg.eig where the relevant subsection is

lapack_routine = lapack_lite.dgeev
        wr = zeros((n,), t)
        wi = zeros((n,), t)
        vr = zeros((n, n), t)
        lwork = 1
        work = zeros((lwork,), t)
        results = lapack_routine(_N, _V, n, a, n, wr, wi,
                                  dummy, 1, vr, n, work, -1, 0)
        lwork = int(work[0])
        work = zeros((lwork,), t)
        results = lapack_routine(_N, _V, n, a, n, wr, wi,
                                  dummy, 1, vr, n, work, lwork, 0)

My plan was to figure out how to tell DGEEV not to perform the balancing step and I’d be done. Sadly, however, it turns out that this is not possible. Taking a look at the reference implementation of DGEEV, we can see that the balancing step is always performed and is not user controllable–here’s the relevant bit of Fortran

*     Balance the matrix
*     (Workspace: need N)
      IBAL = 1

So, using DGEEV is a dead-end unless we are willing to modifiy and recompile the lapack source — something that’s rarely a good idea in my experience. There is another LAPACK routine that is of use, however, in the form of DGEEVX that allows us to control balancing.  Unfortunately, this routine is not part of the numpy.linalg.lapack_lite interface provided by Numpy and I’ve yet to figure out how to add extra routines to it.

I’ve also discovered that this functionality is an open feature request in Numpy.

Enter the NAG Library

My University has a site license for the commercial Numerical Algorithms Group (NAG) library.  Among other things, NAG offers an interface to all of LAPACK along with an interface to Python.  So, I go through the installation and do

import numpy as np
from ctypes import *
from nag4py.util import Nag_RowMajor,Nag_NoBalancing,Nag_NotLeftVecs,Nag_RightVecs,Nag_RCondEigVecs,Integer,NagError,INIT_FAIL
from nag4py.f08 import nag_dgeevx

eps = np.spacing(1)

def unbalanced_eig(A):
    Compute the eigenvalues and right eigenvectors of a square array using DGEEVX via the NAG library.
    Requires the NAG C library and NAG's Python wrappers
    The balancing step that's performed in DGEEV is not performed here.
    As such, this function is the same as the MATLAB command eig(A,'nobalance')

    A : (M, M) Numpy array
        A square array of real elements.

        On exit: 
        A is overwritten and contains the real Schur form of the balanced version of the input matrix .

    w : (M,) ndarray
        The eigenvalues
    v : (M, M) ndarray
        The eigenvectors

    Author: Mike Croucher (
    Testing has been mimimal

    order = Nag_RowMajor
    balanc = Nag_NoBalancing
    jobvl = Nag_NotLeftVecs
    jobvr = Nag_RightVecs
    sense = Nag_RCondEigVecs

    n = A.shape[0]
    pda = n
    pdvl = 1

    wr = np.zeros(n)
    wi = np.zeros(n)

    pdvr = n
    vr = np.zeros(pdvr*n)


    scale = np.zeros(n)
    abnrm = c_double(0)
    rconde = np.zeros(n)
    rcondv = np.zeros(n)

    fail = NagError()

               n, A.ctypes.data_as(POINTER(c_double)), pda, wr.ctypes.data_as(POINTER(c_double)),
               vr.ctypes.data_as(POINTER(c_double)),pdvr,ilo,ihi, scale.ctypes.data_as(POINTER(c_double)),
               abnrm, rconde.ctypes.data_as(POINTER(c_double)),rcondv.ctypes.data_as(POINTER(c_double)),fail)

    if all(wi == 0.0):
            w = wr
            v = vr.reshape(n,n)
            w = wr+1j*wi
            v = array(vr, w.dtype).reshape(n,n)


Define a test matrix:

A = np.array([[3.0,-2.0,-0.9,2*eps],

Do the calculation

(w,v) = unbalanced_eig(A)

which gives

(array([ 5.5616,  1.4384,  1.    , -1.    ]),
 array([[ 0.6153, -0.4176, -0.    , -0.1528],
       [-0.7881, -0.3261,  0.    ,  0.1345],
       [-0.    , -0.    , -0.    , -0.9781],
       [ 0.0189,  0.8481, -1.    ,  0.0443]]))

This is exactly what you get by running the MATLAB command eig(A,’nobalance’).

Note that unbalanced_eig(A) changes the input matrix A to

array([[ 5.5616, -0.0662,  0.0571,  1.3399],
       [ 0.    ,  1.4384,  0.7017, -0.1561],
       [ 0.    ,  0.    ,  1.    , -0.0132],
       [ 0.    ,  0.    ,  0.    , -1.    ]])

According to the NAG documentation, this is the real Schur form of the balanced version of the input matrix.  I can’t see how to ask NAG to not do this. I guess that if it’s not what you want unbalanced_eig() to do,  you’ll need to pass a copy of the input matrix to NAG.

The IPython notebook

The code for this article is available as an IPython Notebook

The future

This blog post was written using Numpy version 1.7.1. There is an enhancement request for the functionality discussed in this article open in Numpy’s git repo and so I expect this article to become redundant pretty soon.

January 7th, 2013

The Numerical Algorithms Group (NAG) are principally known for their numerical library but they also offer products such as a MATLAB toolbox and a Fortran compiler.  My employer, The University of Manchester, has  a full site license for most of NAG’s stuff where it is heavily used by both our students and researchers.

While at a recent software conference, I saw a talk by NAG’s David Sayers where he demonstrated some of the features of the NAG Fortran Compiler.  During this talk he showed some examples of broken Fortran and asked us if we could spot how they were broken without compiler assistance.  I enjoyed the talk and so asked David if he would mind writing a guest blog post on the subject for WalkingRandomly.  He duly obliged.

 This is a guest blog post by David Sayers of NAG.

What do you want from your Fortran compiler? Some people ask for extra (non-standard) features, others require very fast execution speed. The very latest extensions to the Fortran language appeal to those who like to be up to date with their code.

I suspect that very few would put enforcement of the Fortran standard at the top of their list, yet this essential if problems are to be avoided in the future. Code written specifically for one compiler is unlikely to work when computers change, or may contain errors that appear only intermittently. Without access to at least one good checking compiler, the developer or support desk will be lacking a valuable tool in the fight against faulty code.

The NAG Fortran compiler is such a tool. It is used extensively by NAG’s own staff to validate their library code and to answer user-support queries involving user’s Fortran programs. It is available on Windows, where it has its own IDE called Fortran Builder, and on Unix platforms and Mac OS X.

Windows users also have the benefit of some Fortran Tools bundled in to the IDE. Particularly nice is the Fortran polisher which tidies up the presentation of your source files according to user-specified preferences.

The compiler includes most Fortran 2003 features, very many Fortran 2008 features and the most commonly used features of OpenMP 3.0 are supported.

The principal developer of the compiler is Malcolm Cohen, co-author of the book, Modern Fortran Explained along with Michael Metcalf and John Reid. Malcolm has been a member of the international working group on Fortran, ISO/IEC JTC1/SC22/WG5, since 1988, and the USA technical subcommittee on Fortran, J3, since 1994. He has been head of the J3 /DATA subgroup since 1998 and was responsible for the design and development of the object-oriented features in Fortran 2003. Since 2005 he has been Project Editor for the ISO/IEC Fortran standard, which has continued its evolution with the publication of the Fortran 2008 standard in 2010.

Of all people Malcolm Cohen should know Fortran and the way the standard should be enforced!

His compiler reflects that knowledge and is designed to assist the programmer to detect how programs might be faulty due to a departure from the Fortran standard or prone to trigger a run time error. In either case the diagnostics of produced by the compiler are clear and helpful and can save the developer many hours of laborious bug-tracing. Here are some particularly simple examples of faulty programs. See if you can spot the mistakes, and think how difficult these might be to detect in programs that may be thousands of times longer:

Example 1

    Program test
      Real, Pointer :: x(:, :)

      Call make_dangle
      x(10, 10) = 0
      Subroutine make_dangle
        Real, Target :: y(100, 200)

        x => y
      End Subroutine make_dangle
    End Program test

Example 2

Program dangle2
Real,Pointer :: x(:),y(:)
y => x
y = 3

Example 3

      program more
      integer n, i
      real r, s
      equivalence (n,r)
      write(6,900) i, r
900  format(' i = ', i5, '   r = ', f10.4)
      stop 'ok'

Example 4

      program trouble1
      integer n
      parameter (n=11)
      integer iarray(n) 
      integer i
      do 10 i=1,10
        iarray(i) = i
10   continue
      write(6,900) iarray
900  format(' iarray = ',11i5)
      stop 'ok'

And finally if this is all too easy …

Example 5

!   E04UCA Example Program Text
!   Mark 23 Release. NAG Copyright 2011.

    MODULE e04ucae_mod

!      E04UCA Example Program Module:
!             Parameters and User-defined Routines

!      .. Use Statements ..
       USE nag_library, ONLY : nag_wp
!      .. Implicit None Statement ..
!      .. Parameters ..
       REAL (KIND=nag_wp), PARAMETER       :: one = 1.0_nag_wp
       REAL (KIND=nag_wp), PARAMETER       :: zero = 0.0_nag_wp
       INTEGER, PARAMETER                  :: inc1 = 1, lcwsav = 1,            &
                                              liwsav = 610, llwsav = 120,      &
                                              lrwsav = 475, nin = 5, nout = 6
       SUBROUTINE objfun(mode,n,x,objf,objgrd,nstate,iuser,ruser)
!         Routine to evaluate objective function and its 1st derivatives.

!         .. Implicit None Statement ..
!         .. Scalar Arguments ..
          REAL (KIND=nag_wp), INTENT (OUT)    :: objf
          INTEGER, INTENT (INOUT)             :: mode
          INTEGER, INTENT (IN)                :: n, nstate
!         .. Array Arguments ..
          REAL (KIND=nag_wp), INTENT (INOUT)  :: objgrd(n), ruser(*)
          REAL (KIND=nag_wp), INTENT (IN)     :: x(n)
          INTEGER, INTENT (INOUT)             :: iuser(*)
!         .. Executable Statements ..
          IF (mode==0 .OR. mode==2) THEN
             objf = x(1)*x(4)*(x(1)+x(2)+x(3)) + x(3)
          END IF

          IF (mode==1 .OR. mode==2) THEN
             objgrd(1) = x(4)*(x(1)+x(1)+x(2)+x(3))
             objgrd(2) = x(1)*x(4)
             objgrd(3) = x(1)*x(4) + one
             objgrd(4) = x(1)*(x(1)+x(2)+x(3))
          END IF


       END SUBROUTINE objfun
       SUBROUTINE confun(mode,ncnln,n,ldcj,needc,x,c,cjac,nstate,iuser,ruser)
!         Routine to evaluate the nonlinear constraints and their 1st
!         derivatives.

!         .. Implicit None Statement ..
!         .. Scalar Arguments ..
          INTEGER, INTENT (IN)                :: ldcj, n, ncnln, nstate
          INTEGER, INTENT (INOUT)             :: mode
!         .. Array Arguments ..
          REAL (KIND=nag_wp), INTENT (OUT)    :: c(ncnln)
          REAL (KIND=nag_wp), INTENT (INOUT)  :: cjac(ldcj,n), ruser(*)
          REAL (KIND=nag_wp), INTENT (IN)     :: x(n)
          INTEGER, INTENT (INOUT)             :: iuser(*)
          INTEGER, INTENT (IN)                :: needc(ncnln)
!         .. Executable Statements ..
          IF (nstate==1) THEN

!            First call to CONFUN.  Set all Jacobian elements to zero.
!            Note that this will only work when 'Derivative Level = 3'
!            (the default; see Section 11.2).

             cjac(1:ncnln,1:n) = zero
          END IF

          IF (needc(1)>0) THEN

             IF (mode==0 .OR. mode==2) THEN
                c(1) = x(1)**2 + x(2)**2 + x(3)**2 + x(4)**2
             END IF

             IF (mode==1 .OR. mode==2) THEN
                cjac(1,1) = x(1) + x(1)
                cjac(1,2) = x(2) + x(2)
                cjac(1,3) = x(3) + x(3)
                cjac(1,4) = x(4) + x(4)
             END IF

          END IF

          IF (needc(2)>0) THEN

             IF (mode==0 .OR. mode==2) THEN
                c(2) = x(1)*x(2)*x(3)*x(4)
             END IF

             IF (mode==1 .OR. mode==2) THEN
                cjac(2,1) = x(2)*x(3)*x(4)
                cjac(2,2) = x(1)*x(3)*x(4)
                cjac(2,3) = x(1)*x(2)*x(4)
                cjac(2,4) = x(1)*x(2)*x(3)
             END IF

          END IF


       END SUBROUTINE confun
    END MODULE e04ucae_mod
    PROGRAM e04ucae

!      E04UCA Example Main Program

!      .. Use Statements ..
       USE nag_library, ONLY : dgemv, e04uca, e04wbf, nag_wp
       USE e04ucae_mod, ONLY : confun, inc1, lcwsav, liwsav, llwsav, lrwsav,   &
                               nin, nout, objfun, one, zero
!      .. Implicit None Statement ..
!      .. Local Scalars ..
      ! REAL (KIND=nag_wp)                  :: objf
       INTEGER                             :: i, ifail, iter, j, lda, ldcj,    &
                                              ldr, liwork, lwork, n, nclin,    &
                                              ncnln, sda, sdcjac
!      .. Local Arrays ..
       REAL (KIND=nag_wp), ALLOCATABLE     :: a(:,:), bl(:), bu(:), c(:),      &
                                              cjac(:,:), clamda(:), objgrd(:), &
                                              r(:,:), work(:), x(:)
       REAL (KIND=nag_wp)                  :: ruser(1), rwsav(lrwsav)
       INTEGER, ALLOCATABLE                :: istate(:), iwork(:)
       INTEGER                             :: iuser(1), iwsav(liwsav)
       LOGICAL                             :: lwsav(llwsav)
       CHARACTER (80)                      :: cwsav(lcwsav)
!      .. Intrinsic Functions ..
       INTRINSIC                              max
!      .. Executable Statements ..
       WRITE (nout,*) 'E04UCA Example Program Results'

!      Skip heading in data file
       READ (nin,*)

       READ (nin,*) n, nclin, ncnln
       liwork = 3*n + nclin + 2*ncnln
       lda = max(1,nclin)

       IF (nclin>0) THEN
          sda = n
          sda = 1
       END IF

       ldcj = max(1,ncnln)

       IF (ncnln>0) THEN
          sdcjac = n
          sdcjac = 1
       END IF

       ldr = n

       IF (ncnln==0 .AND. nclin>0) THEN
          lwork = 2*n**2 + 20*n + 11*nclin
       ELSE IF (ncnln>0 .AND. nclin>=0) THEN
          lwork = 2*n**2 + n*nclin + 2*n*ncnln + 20*n + 11*nclin + 21*ncnln
          lwork = 20*n
       END IF

       ALLOCATE (istate(n+nclin+ncnln),iwork(liwork),a(lda,sda), &
          bl(n+nclin+ncnln),bu(n+nclin+ncnln),c(max(1, &
          ncnln)),cjac(ldcj,sdcjac),clamda(n+nclin+ncnln),objgrd(n),r(ldr,n), &

       IF (nclin>0) THEN
          READ (nin,*) (a(i,1:sda),i=1,nclin)
       END IF

       READ (nin,*) bl(1:(n+nclin+ncnln))
       READ (nin,*) bu(1:(n+nclin+ncnln))
       READ (nin,*) x(1:n)

!      Initialise E04UCA

       ifail = 0
       CALL e04wbf('E04UCA',cwsav,lcwsav,lwsav,llwsav,iwsav,liwsav,rwsav, &

!      Solve the problem

       ifail = -1
       CALL e04uca(n,nclin,ncnln,lda,ldcj,ldr,a,bl,bu,confun,objfun,iter, &
          istate,c,cjac,clamda,objf,objgrd,r,x,iwork,liwork,work,lwork,iuser, &

       SELECT CASE (ifail)
       CASE (0:6,8)
          WRITE (nout,*)
          WRITE (nout,99999)
          WRITE (nout,*)

          DO i = 1, n
             WRITE (nout,99998) i, istate(i), x(i), clamda(i)
          END DO

          IF (nclin>0) THEN

!            A*x --> work.
!            The NAG name equivalent of dgemv is f06paf
             CALL dgemv('N',nclin,n,one,a,lda,x,inc1,zero,work,inc1)

             WRITE (nout,*)
             WRITE (nout,*)
             WRITE (nout,99997)
             WRITE (nout,*)

             DO i = n + 1, n + nclin
                j = i - n
                WRITE (nout,99996) j, istate(i), work(j), clamda(i)
             END DO

          END IF

          IF (ncnln>0) THEN
             WRITE (nout,*)
             WRITE (nout,*)
             WRITE (nout,99995)
             WRITE (nout,*)

             DO i = n + nclin + 1, n + nclin + ncnln
                j = i - n - nclin
                WRITE (nout,99994) j, istate(i), c(j), clamda(i)
             END DO

          END IF

          WRITE (nout,*)
          WRITE (nout,*)
          WRITE (nout,99993) objf

99999  FORMAT (1X,'Varbl',2X,'Istate',3X,'Value',9X,'Lagr Mult')
99998  FORMAT (1X,'V',2(1X,I3),4X,1P,G14.6,2X,1P,G12.4)
99997  FORMAT (1X,'L Con',2X,'Istate',3X,'Value',9X,'Lagr Mult')
99996  FORMAT (1X,'L',2(1X,I3),4X,1P,G14.6,2X,1P,G12.4)
99995  FORMAT (1X,'N Con',2X,'Istate',3X,'Value',9X,'Lagr Mult')
99994  FORMAT (1X,'N',2(1X,I3),4X,1P,G14.6,2X,1P,G12.4)
99993  FORMAT (1X,'Final objective value = ',1P,G15.7)
    END PROGRAM e04ucae

Answers to this particular New Year quiz will be posted in a future blog post.

July 23rd, 2012

A MATLAB user at Manchester University contacted me recently asking about Black-Scholes option pricing.  The MATLAB Financial Toolbox has a range of functions that can calculate Black-Scholes put and call option prices along with several of the sensitivities (or ‘greeks‘) such as blsprice, blsdelta and so on.

The user’s problem is that we don’t have any site-wide licenses for the Financial Toolbox.  We do, however, have a full site license for the NAG Toolbox for MATLAB which has a nice set of option pricing routines.  Even though they calculate the same things, NAG Toolbox option pricing functions look very different to the Financial Toolbox ones and so I felt that a Rosetta Stone type article might be useful.

For Black-Scholes option pricing, there are three main differences between the two systems:

  1. The Financial Toolbox has separate functions for calculating the option price and each greek (e.g. blsprice, blsgamma, blsdelta etc) whereas NAG calculates the price and all greeks simultaneously with a single function call.
  2. Where appropriate, The MATLAB functions calculate Put and Call values with one function call whereas with NAG you need to explicitly specify Call or Put.
  3. NAG calculates more greeks than MATLAB.

The following code example pretty much says it all.  Any variable calculated with the NAG Toolbox is prefixed NAG_ whereas anything calculated with the financial toolbox is prefixed MW_.  When I developed this, I was using MATLAB 2012a with NAG Toolbox Mark 22.

%Input parameters for both NAG and MATLAB.

%calculate all greeks for a put using NAG
[NAG_Put, NAG_PutDelta, NAG_Gamma, NAG_Vega, NAG_PutTheta, NAG_PutRho, NAG_PutCrho, NAG_PutVanna,...
NAG_PutCharm, NAG_PutSpeed, NAG_PutColour, NAG_PutZomma,NAG_PutVomma, ifail] =...
 s30ab('p', Strike, Price, Time, Volatility, Rate, Yield);

%calculate all greeks for a Call using NAG
[NAG_Call, NAG_CallDelta, NAG_Gamma, NAG_Vega, NAG_CallTheta, NAG_CallRho, NAG_CallCrho, NAG_CallVanna,...
 NAG_CallCharm, NAG_CallSpeed, NAG_CallColour,NAG_CallZomma, NAG_CallVomma, ifail] = ...
s30ab('c', Strike, Price, Time, Volatility, Rate, Yield);

%Calculate the Elasticity (Lambda)
NAG_CallLambda = Price/NAG_Call*NAG_CallDelta;
NAG_PutLambda = Price/NAG_Put*NAG_PutDelta;

%Calculate the same set of prices and greeks using the MATLAB Finance Toolbox
[MW_Call, MW_Put] = blsprice(Price, Strike, Rate, Time, Volatility, Yield);
[MW_CallDelta, MW_PutDelta] = blsdelta(Price, Strike, Rate, Time, Volatility, Yield);
MW_Gamma = blsgamma(Price, Strike, Rate, Time, Volatility, Yield);
MW_Vega = blsvega(Price, Strike, Rate, Time, Volatility, Yield);
[MW_CallTheta, MW_PutTheta] = blstheta(Price, Strike, Rate, Time,Volatility, Yield);
[MW_CallRho, MW_PutRho]= blsrho(Price, Strike, Rate, Time, Volatility,Yield);
[MW_CallLambda,MW_PutLambda]=blslambda(Price, Strike, Rate, Time, Volatility,Yield);

Note that NAG doesn’t output the elasticity (Lambda) directly but it is trivial to obtain it using values that it does output. Also note that as far as I can tell, NAG outputs more Greeks than the Financial Toolbox does.

I’m not going to show the entire output of the above program because there are a lot of numbers.  However, here are the Put values as calculated by NAG shown to 4 decimal places. I have checked and they agree with the Financial Toolbox to within numerical noise.

NAG_Put =0.1350
NAG_PutDelta =-0.0419
NAG_PutLambda =-15.5066
NAG_Gamma =0.0119
NAG_Vega =2.2361
NAG_PutTheta =-1.1187
NAG_PutRho =-0.5572
NAG_PutCrho = -0.5235
NAG_PutVanna =-0.4709
NAG_PutCharm =0.2229
NAG_PutSpeed =-0.0030
NAG_PutColour =-0.0275
NAG_PutZomma =0.0688
NAG_PutVomma =20.3560
April 11th, 2012

I recently installed MATLAB 2012a on a Windows machine along with a certain set of standard Mathworks toolboxes.  In addition, I also installed the excellent NAG Toolbox for MATLAB which is standard practice at my University.

I later realised that I had not installed all of the Mathworks toolboxes I needed so I fired up the MATLAB installer again and asked it to add the missing toolboxes.  This extra installation never completed, however, and gave me the error message

The application encountered an unexpected error and needed to close. You may want to try
re-installing your product(s). More infomation can be found at C:\path_to_a_log_file

I took a look at the log file mentioned which revealed a huge java error that began with

 java.util.concurrent.ExecutionException: java.lang.StringIndexOutOfBoundsException:
    String index out of range: -2
 at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
 at java.util.concurrent.FutureTask.get(Unknown Source)
 at javax.swing.SwingWorker.get(Unknown Source)
 at com.mathworks.wizard.worker.WorkerImpl.done(
 at javax.swing.SwingWorker$ Source)

A little mucking around revealed that the installer was unhappy with the pathdef.m file at C:\Program Files\MATLAB\R2012a\toolbox\local\pathdef.m

The installer for the NAG Toolbox modifies this file by adding the line

'C:\Program Files\MATLAB\R2012a\toolbox\NAG\mex.w64;' ...

near the beginning and the lines

'C:\Program Files\MATLAB\R2012a\help\toolbox\NAG;' ...
'C:\Program Files\MATLAB\R2012a\help\toolbox\NAGToolboxDemos;' ...

Near the end and it seems that the MATLAB installer really doesn’t like this. So, what you do is create a copy of this pathdef.m file (pathdef.m.old for example) and then remove the non-mathworks lines in pathdef.m. Now you can install the extra Mathworks toolboxes you want.  Once the installer has finished its work you can re-add the non-mathworks lines back into pathdef.m using your copy as a guide.

I’ll be informing both NAG and The Mathworks about this particular issue but wanted to get this post out there as soon as possible to provide a workaround since at least one other person has hit this problem at my University and I doubt that he will be the last (It’s also going to make SCCM deployment of MATLAB a pain but that’s another story).


The Mathworks technical support have sent me a better workaround than the one detailed above. What you need to do is to change

'C:\Program Files\MATLAB\R2012a\toolbox\NAG\mex.w64;' ...


'C:\Program Files\MATLAB\R2012a\toolbox\NAG\mex.w64;', ...

The Mathworks installer is unhappy about the missing comma.

October 24th, 2011

The NAG C Library is one of the largest commercial collections of numerical software currently available and I often find it very useful when writing MATLAB mex files.  “Why is that?” I hear you ask.

One of the main reasons for writing a mex file is to gain more speed over native MATLAB.  However, one of the main problems with writing mex files is that you have to do it in a low level language such as Fortran or C and so you lose much of the ease of use of MATLAB.

In particular, you lose straightforward access to most of the massive collections of MATLAB routines that you take for granted.  Technically speaking that’s a lie because you could use the mex function mexCallMATLAB to call a MATLAB routine from within your mex file but then you’ll be paying a time overhead every time you go in and out of the mex interface.  Since you are going down the mex route in order to gain speed, this doesn’t seem like the best idea in the world.  This is also the reason why you’d use the NAG C Library and not the NAG Toolbox for MATLAB when writing mex functions.

One way out that I use often is to take advantage of the NAG C library and it turns out that it is extremely easy to add the NAG C library to your mex projects on Windows.  Let’s look at a trivial example.  The following code, nag_normcdf.c, uses the NAG function nag_cumul_normal to produce a simplified version of MATLAB’s normcdf function (laziness is all that prevented me from implementing a full replacement).

/* A simplified version of normcdf that uses the NAG C library
 * Written to demonstrate how to compile MATLAB mex files that use the NAG C Library
 * Only returns a normcdf where mu=0 and sigma=1
 * October 2011 Michael Croucher

#include <math.h>
#include "mex.h"
#include "nag.h"
#include "nags.h"

void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[] )
    double *in,*out;
    int rows,cols,num_elements,i;

        mexErrMsgIdAndTxt("NAG:BadArgs","This simplified version of normcdf only takes 1 input argument");

    /*Get pointers to input matrix*/
    in = mxGetPr(prhs[0]);
    /*Get rows and columns of input matrix*/
    rows = mxGetM(prhs[0]);
    cols = mxGetN(prhs[0]);
    num_elements = rows*cols;

    /* Create output matrix */
    plhs[0] = mxCreateDoubleMatrix(rows, cols, mxREAL);
    /* Assign pointer to the output */
    out = mxGetPr(plhs[0]);

    for(i=0; i<num_elements; i++){
          out[i] = nag_cumul_normal(in[i]);

To compile this in MATLAB, just use the following command

 mex nag_normcdf.c CLW6I09DA_nag.lib

If your system is set up the same as mine then the above should ‘just work’ (see System Information at the bottom of this post).  The new function works just as you would expect it to

>> format long
>> format compact
>> nag_normcdf(1)
ans =

Compare the result to normcdf from the statistics toolbox

>> normcdf(1)
ans =

So far so good. I could stop the post here since all I really wanted to do was say ‘The NAG C library is useful for MATLAB mex functions and it’s a doddle to use – here’s a toy example and here’s the mex command to compile it’

However, out of curiosity, I looked to see if my toy version of normcdf was any faster than The Mathworks’ version.  Let there be 10 million numbers:

>> x=rand(1,10000000);

Let’s time how long it takes MATLAB to take the normcdf of those numbers

>> tic;y=normcdf(x);toc
Elapsed time is 0.445883 seconds.
>> tic;y=normcdf(x);toc
Elapsed time is 0.405764 seconds.
>> tic;y=normcdf(x);toc
Elapsed time is 0.366708 seconds.
>> tic;y=normcdf(x);toc
Elapsed time is 0.409375 seconds.

Now let’s look at my toy-version that uses NAG.

>> tic;y=nag_normcdf(x);toc
Elapsed time is 0.544642 seconds.
>> tic;y=nag_normcdf(x);toc
Elapsed time is 0.556883 seconds.
>> tic;y=nag_normcdf(x);toc
Elapsed time is 0.553920 seconds.
>> tic;y=nag_normcdf(x);toc
Elapsed time is 0.540510 seconds.

So my version is slower!  Never mind, I’ll just make my version parallel using OpenMP – Here is the code: nag_normcdf_openmp.c

/* A simplified version of normcdf that uses the NAG C library
 * Written to demonstrate how to compile MATLAB mex files that use the NAG C Library
 * Only returns a normcdf where mu=0 and sigma=1
 * October 2011 Michael Croucher

#include <math.h>
#include "mex.h"
#include "nag.h"
#include "nags.h"
#include <omp.h>

void do_calculation(double in[],double out[],int num_elements)
    int i,tid;

#pragma omp parallel for shared(in,out,num_elements) private(i,tid)
    for(i=0; i<num_elements; i++){
          out[i] = nag_cumul_normal(in[i]);

void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[] )
    double *in,*out;
    int rows,cols,num_elements;

        mexErrMsgIdAndTxt("NAG_NORMCDF:BadArgs","This simplified version of normcdf only takes 1 input argument");

    /*Get pointers to input matrix*/
    in = mxGetPr(prhs[0]);
    /*Get rows and columns of input matrix*/
    rows = mxGetM(prhs[0]);
    cols = mxGetN(prhs[0]);
    num_elements = rows*cols;

    /* Create output matrix */
    plhs[0] = mxCreateDoubleMatrix(rows, cols, mxREAL);
    /* Assign pointer to the output */
    out = mxGetPr(plhs[0]);


Compile that with

 mex  COMPFLAGS="$COMPFLAGS /openmp"  nag_normcdf_openmp.c CLW6I09DA_nag.lib

and on my quad-core machine I get the following timings

>> tic;y=nag_normcdf_openmp(x);toc
Elapsed time is 0.237925 seconds.
>> tic;y=nag_normcdf_openmp(x);toc
Elapsed time is 0.197531 seconds.
>> tic;y=nag_normcdf_openmp(x);toc
Elapsed time is 0.206511 seconds.
>> tic;y=nag_normcdf_openmp(x);toc
Elapsed time is 0.211416 seconds.

This is faster than MATLAB and so normal service is resumed :)

System Information

  • 64bit Windows 7
  • MATLAB 2011b
  • NAG C Library Mark 9 – CLW6I09DAL
  • Visual Studio 2008
  • Intel Core i7-2630QM processor
May 18th, 2011

MATLAB’s lsqcurvefit function is a very useful piece of code that will help you solve non-linear least squares curve fitting problems and it is used a lot by researchers at my workplace, The University of Manchester.  As far as we are concerned it has two problems:

  • lsqcurvefit is part of the Optimisation toolbox and, since we only have a limited number of licenses for that toolbox, the function is sometimes inaccessible.  When we run out of licenses on campus users get the following error message
??? License checkout failed.
License Manager Error -4
Maximum number of users for Optimization_Toolbox reached.
Try again later.
To see a list of current users use the lmstat utility or contact your License Administrator.
  • lsqcurvefit is written as an .m file and so isn’t as fast as it could be.

One solution to these problems is to switch to the NAG Toolbox for MATLAB.  Since we have a full site license for this product, we never run out of licenses and, since it is written in compiled Fortran, it is sometimes a lot faster than MATLAB itself.  However, the current version of the NAG Toolbox (Mark 22 at the time of writing) isn’t without its issues either:

  • There is no direct equivalent to lsqcurvefit in the NAG toolbox.  You have to use NAG’s equivalent of lsqnonlin instead (which NAG call e04fy)
  • The current version of the NAG Toolbox, Mark 22, doesn’t support function handles.
  • The NAG toolbox requires the use of the datatypes int32 and int64 depending on the architecture of your machine.  The practical upshot of this is that your MATLAB code is suddenly a lot less portable.  If you develop on a 64 bit machine then it will need modifying to run on a 32 bit machine and vice-versa.

While working on optimising someone’s code a few months ago I put together a couple of .m files in an attempt to address these issues.  My intent was to improve the functionality of these files and eventually publish them but I never seemed to find the time.  However, they have turned out to be a minor hit and I’ve sent them out to researcher after researcher with the caveat “These are a first draft, I need to tidy them up sometime” only to get the reply “Thanks for that, I no longer have any license problems and my code is executing more quickly.”  They may be simple but it seems that they are useful.

So, until I find the time to add more functionality, here are my little wrappers that allow you to do non linear least squares fitting using the NAG Toolbox for MATLAB.  They are about as minimal as you can get but they work and have proven to be useful time and time again.  They also solve all 3 of the NAG issues mentioned above.

To use them just put them both somewhere on your MATLAB path.  The syntax is identical to lsqcurvefit but this first version of my wrapper doesn’t do anything more complicated than the following


[x,resnorm] = lsqcurvefit(@myfun,x0,xdata,ydata);

NAG with my wrapper:

[x,resnorm] = nag_lsqcurvefit(@myfun,x0,xdata,ydata);

I may add extra functionality in the future but it depends upon demand.


Let’s look at the example given on the MATLAB help page for lsqcurvefit and compare it to the NAG version.  First create a file called myfun.m as follows

function F = myfun(x,xdata)
F = x(1)*exp(x(2)*xdata);

Now create the data and call the MATLAB fitting function

% Assume you determined xdata and ydata experimentally
xdata = [0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3];
ydata = [455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5];
x0 = [100; -1] % Starting guess
[x,resnorm] = lsqcurvefit(@myfun,x0,xdata,ydata);

On my system I get the following timing for MATLAB (typical over around 10 runs)

tic;[x,resnorm] = lsqcurvefit(@myfun,x0,xdata,ydata); toc
Elapsed time is 0.075685 seconds.

with the following results

x =
   1.0e+02 *
>> resnorm
resnorm =

and for NAG using my wrapper function I get the following timing (typical over around 10 runs)

tic;[x,resnorm] = nag_lsqcurvefit(@myfun,x0,xdata,ydata); toc
Elapsed time is 0.008163 seconds.

So, for this example, NAG is around 9 times faster! The results agree with MATLAB to several decimal places

x =
   1.0e+02 *
>> resnorm
resnorm =

In the real world I find that the relative timings vary enormously and have seen speed-ups that range from a factor of 14 down to none at all.  Whenever I am optimisng MATLAB code and see a lsqcurvefit function I always give the NAG version a quick try and am often impressed with the results.

My system specs if you want to compare results: 64bit Linux on a Intel Core 2 Quad Q9650  @ 3.00GHz running MATLAB 2010b and NAG Toolbox version MBL6A22DJL

More articles about NAG

November 13th, 2010

Christmas isn’t all that far away so I thought that it was high time that I wrote my Christmas list for mathematical software developers and vendors.  All I want for christmas is….



  • Merge the parallel computing toolbox with core MATLAB.  Everyone uses multicore these days but only a few can feel the full benefit in MATLAB.   The rest are essentially second class MATLAB citizens muddling by with a single core (most of the time)
  • Make the mex interface thread safe so I can more easily write parallel mex files


  • More CUDA accelerated functions please.  I was initially excited by your CUDA package but then discovered that it only accelerated one function (Matrix Multiply).  CUDA accelerated Random Number Generators would be nice along with fast Fourier transforms and a bit more linear algebra.


  • Release Mathcad Prime.
  • Mac and Linux versions of Mathcad.  Maple,Mathematica and MATLAB have versions for all 3 platforms so why don’t you?

NAG Library

  • Produce vector versions of functions like g01bk (poisson distribution function).  They might not be needed in Fortran or C code but your MATLAB toolbox desperately needs them
  • A Mac version of the MATLAB toolbox.  I’ve got users practically begging for it :)
  • A NAG version of the MATLAB gamfit command



  • A Locator control for the interact function.  I still have a bounty outstanding for the person who implements this.
  • A fully featured, native windows version.  I know about the VM solution and it isn’t suitable for what I want to do (which is to deploy it on around 5000 University windows machines to introduce students to one of the best open source maths packages)

SMath Studio

  • An Android version please.  Don’t make it free – you deserve some money for this awesome Mathcad alternative.

SpaceTime Mathematics

  • The fact that you give the Windows version away for free is awesome but registration is a pain when you are dealing with mass deployment.  I’d love to deploy this to my University’s Windows desktop image but the per-machine registration requirement makes it difficult.  Most large developers who require registration usually come up with an alternative mechanism for enterprise-wide deployment.  You ask schools with more than 5 machines to link back to you.  I want tot put it on a few thousand machines and I would happily link back to you from several locations if you’ll help me with some sort of volume license.  I’ll also give internal (and external if anyone is interested) seminars at Manchester on why I think Spacetime is useful for teaching mathematics.  Finally, I’d encourage other UK University applications specialists to evaluate the software too.
  • An Android version please.

How about you?  What would you ask for Christmas from your favourite mathematical software developers?

October 5th, 2010

Not long after publishing my article, A faster version of MATLAB’s fsolve using the NAG Toolbox for MATLAB, I was contacted by Biao Guo of Quantitative Finance Collector who asked me what I could do with the following piece of code.

function MarkowitzWeight = fsolve_markowitz_MATLAB()

RandStream.setDefaultStream (RandStream('mt19937ar','seed',1));
% simulate 500 stocks return
SimR = randn(1000,500);
r = mean(SimR); % use mean return as expected return
targetR = 0.02;
rCOV = cov(SimR); % covariance matrix
NumAsset = length(r);

startX = [1/NumAsset*ones(1, NumAsset), 1, 1];
x = fsolve(@fsolve_markowitz,startX,options, r, targetR, rCOV);
MarkowitzWeight = x(1:NumAsset);

function F = fsolve_markowitz(x, r, targetR, rCOV)
NumAsset = length(r);
F = zeros(1,NumAsset+2);
weight = x(1:NumAsset); % for asset weights
lambda = x(NumAsset+1);
mu = x(NumAsset+2);
for i = 1:NumAsset
    F(i) = weight(i)*sum(rCOV(i,:))-lambda*r(i)-mu;
F(NumAsset+1) = sum(weight.*r)-targetR;
F(NumAsset+2) = sum(weight)-1;

This is an example from financial mathematics and Biao explains it in more detail in a post over at his blog, Now, it isn’t particularly computationally challenging since it only takes 6.5 seconds to run on my 2Ghz dual core laptop. It is, however, significantly more challenging than the toy problem I discussed in my original fsolve post. So, let’s see how much we can optimise it using NAG and any other techniques that come to mind.

Before going on, I’d just like to point out that we intentionally set the seed of the random number generator with

RandStream.setDefaultStream (RandStream('mt19937ar','seed',1));

to ensure we are always operating with the same data set. This is purely for benchmarking purposes.

Optimisation trick 1 – replacing fsolve with NAG’s c05nb

The first thing I had to do to Biao’s code in order to apply the NAG toolbox was to split it into two files. We have the main program, fsolve_markowitz_NAG.m

function MarkowitzWeight = fsolve_markowitz_NAG()
global r targetR rCOV;
%seed random generator to ensure same results every time for benchmark purposes
RandStream.setDefaultStream (RandStream('mt19937ar','seed',1));

% simulate 500 stocks return
SimR = randn(1000,500);
r = mean(SimR)'; % use mean return as expected return
targetR = 0.02;
rCOV = cov(SimR); % covariance matrix
NumAsset = length(r);

startX = [1/NumAsset*ones(1,NumAsset), 1, 1];
x = c05nb('fsolve_obj_NAG',startX);
MarkowitzWeight = x(1:NumAsset);

and the objective function, fsolve_obj_NAG.m

function [F,iflag] = fsolve_obj_NAG(n,x,iflag)
global r targetR rCOV;
NumAsset = length(r);
F = zeros(1,NumAsset+2);
weight = x(1:NumAsset); % for asset weights
lambda = x(NumAsset+1);
mu = x(NumAsset+2);
for i = 1:NumAsset
    F(i) = weight(i)*sum(rCOV(i,:))-lambda*r(i)-mu;
F(NumAsset+1) = sum(weight.*r)-targetR;
F(NumAsset+2) = sum(weight)-1;

I had to put the objective function into its own .m file because because the NAG toolbox currently doesn’t accept function handles (This may change in a future release I am reliably told). This is a pain but not a show stopper.

Note also that the function header for the NAG version of the objective function is different from the pure MATLAB version as
per my original article.

Transposition problems
The next thing to notice is that I have done the following in the main program, fsolve_markowitz_NAG.m

r = mean(SimR)';

compared to the original

r = mean(SimR);

This is because the NAG routine, c05nb, does something rather bizarre. I discovered that if you pass a 1xN matrix to NAGs c05nb then it gets transposed in the objective function to Nx1. Since Biao relies on this being 1xN when he does

F(NumAsset+1) = sum(weight.*r)-targetR;

we get an error unless you take the transpose of r in the main program. I consider this to be a bug in the NAG Toolbox and it is currently with NAG technical support.

Global variables
Sadly, there is no easy way to pass extra variables to the objective function when using NAG’s c05nb. In MATLAB Biao could do this

x = fsolve(@fsolve_markowitz,startX,options, r, targetR, rCOV);

No such luck with c05nb. The only way to get around it is to use global variables (Well, a small lie, there is another way, you can use reverse communication in a different NAG function, c05nd, but that’s a bit more complicated and I’ll leave it for another time I think)

So, It’s pretty clear that the NAG function isn’t as easy to use as MATLAB’s fsolve function but is it any faster? The following is typical.

>> tic;fsolve_markowitz_NAG;toc
Elapsed time is 2.324291 seconds.

So, we have sped up our overall computation time by a factor of 3. Not bad, but nowhere near the 18 times speed-up that I demonstrated in my original post.

Optimisation trick 2 – Don’t sum so much

In his recent blog post, Biao challenged me to make his code almost 20 times faster and after applying my NAG Toolbox trick I had ‘only’ managed to make it 3 times faster. So, in order to see what else I might be able to do for him I ran his code through the MATLAB profiler and discovered that it spent the vast majority of its time in the following section of the objective function

for i = 1:NumAsset
    F(i) = weight(i)*sum(rCOV(i,:))-lambda*r(i)-mu;

It seemed to me that the original code was calculating the sum of rCOV too many times. On a small number of assets (50 say) you’ll probably not notice it but with 500 assets, as we have here, it’s very wasteful.

So, I changed the above to

for i = 1:NumAsset
    F(i) = weight(i)*sumrCOV(i)-lambda*r(i)-mu;

So,instead of rCOV, the objective function needs to be passed the following in the main program


We do this summation once and once only and make a big time saving.  This is done in the files fsolve_markowitz_NAGnew.m and fsolve_obj_NAGnew.m

With this optimisation, along with using NAG’s c05nb we now get the calculation time down to

>> tic;fsolve_markowitz_NAGnew;toc
Elapsed time is 0.203243 seconds.

Compared to the 6.5 seconds of the original code, this represents a speed-up of over 32 times. More than enough to satisfy Biao’s challenge.

Checking the answers

Let’s make sure that the results agree with each other. I don’t expect them to be exactly equal due to round off errors etc but they should be extremely close. I check that the difference between each result is less than 1e-7:

original_results = fsolve_markowitz_MATLAB;
nagver1_results = fsolve_markowitz_NAGnew;
>> all(abs(original_results - nagver1_results)<1e8)
ans =

That’ll do for me!


The current iteration of the NAG Toolbox for MATLAB (Mark 22) may not be as easy to use as native MATLAB toolbox functions (for this example at least) but it can often provide useful speed-ups and in this example, NAG gave us a factor of 3 improvement compared to using fsolve.

For this particular example, however, the more impressive speed-up came from pushing the code through the MATLAB profiler, identifying the hotspot and then doing something about it.  There are almost certainly more optimisations to make but with the code running at less than a quarter of a second on my modest little laptop I think we will leave it there for now.