## Archive for the ‘NAG Library’ Category

**HPC! To cloud or not to cloud….**

Over the course of my career I have been involved with the provision of High Performance Computing (HPC) at almost every level. As a researcher and research software engineer I have been, and continue to be, **a user** of many large scale systems. As a member of research computing support, I was involved in **service development and delivery, ****user-support, system administration** and **HPC training.** Finally, as a member of senior leadership teams, I have been involved in the **financial planning and strategic development** of institution-wide HPC services.

In recent years, the question that pops up at every level of involvement with HPC is *‘Should we do this with our own equipment or should we do this in the cloud?’ *

This is an extremely complex, multi-dimensional question where the parameters are constantly shifting. The short, answer is always** ‘well…it depends’ **and it always does ‘depend’…on many things! What do you want to do? What is the state of the software that you have available? How much money do you have? How much time do you have? What support do you have? What equipment do you currently have and who else are you working with and so on.

Today, I want to focus on just one of these factors. Cost. I’m not saying that cost is necessarily the most important consideration in making HPC-related decisions but given that most of us have fixed budgets, it is impossible to ignore.

**The value of independence**

I have been involved in many discussions of HPC costs over the years and have seen several attempts at making the cloud vs on-premise cost-comparison. Such attempts are often biased in one direction or the other. It’s difficult not to be when you have a vested interest in the outcome.

When I chose to leave academia and join industry, I decided to work with the Numerical Algorithms Group (NAG), a research computing company that has been in business for almost 50 years. One reason for choosing them was their values which strongly resonate with my own. Another was their independence. NAG are a not-for-profit (yet commercial!) company who literally cannot be bought. They solve the independence problem by collaborating with pretty much everybody in the industry.

So, NAG are more interested in helping clients solve their technical computing problems than they are in driving them to any given solution. Cloud or on-premise? They don’t mind…they just want to help you make the decision.

**Explore HPC cost models yourself**

NAG’s VP for HPC services and consulting, Andrew Jones (@hpcnotes), has been teaching seminars on Total Cost Of Ownership models for several years at events like the SC Supercomputing conference series. To support these tutorials, Andrew created a simple, generic TCO model spreadsheet to illustrate the concepts and to show how even a simple TCO model can guide decisions.

Many people asked if NAG could send them the TCO spreadsheet from the tutorial but they decided to go one better and converted it into a web-form where you can start exploring some of the concepts yourself.

- Here’s the HPC Total Cost of Ownership calculator for you to play with: https://www.nag.com/content/hpc-tco-calculator

I also made a video about it.

If you want more, get in touch with either me (@walkingrandomly) or Andrew (@hpcnotes) on twitter or email hpc@nag.com. We’ll also be teaching this type of thing at ISC 2019 so please do join us there too.

I recently wrote a blog post for my new employer, The Numerical Algorithms Group, called Exploiting Matrix Structure in the solution of linear systems. It’s a demonstration that shows how choosing the right specialist solver for your problem rather than using a general purpose one can lead to a speed up of well over 100 times! The example is written in Python but the NAG routines used can be called from a range of languages including C,C++, Fortran, MATLAB etc etc

Given a symmetric matrix such as

What’s the nearest correlation matrix? A 2002 paper by Manchester University’s Nick Higham which answered this question has turned out to be rather popular! At the time of writing, Google tells me that it’s been cited 394 times.

Last year, Nick wrote a blog post about the algorithm he used and included some MATLAB code. He also included links to applications of this algorithm and implementations of various NCM algorithms in languages such as MATLAB, R and SAS as well as details of the superb commercial implementation by The Numerical algorithms group.

I noticed that there was no Python implementation of Nick’s code so I ported it myself.

Here’s an example IPython session using the module

In [1]: from nearest_correlation import nearcorr In [2]: import numpy as np In [3]: A = np.array([[2, -1, 0, 0], ...: [-1, 2, -1, 0], ...: [0, -1, 2, -1], ...: [0, 0, -1, 2]]) In [4]: X = nearcorr(A) In [5]: X Out[5]: array([[ 1. , -0.8084125 , 0.1915875 , 0.10677505], [-0.8084125 , 1. , -0.65623269, 0.1915875 ], [ 0.1915875 , -0.65623269, 1. , -0.8084125 ], [ 0.10677505, 0.1915875 , -0.8084125 , 1. ]])

This module is in the early stages and there is a lot of work to be done. For example, I’d like to include a lot more examples in the test suite, add support for the commercial routines from NAG and implement other algorithms such as the one by Qi and Sun among other things.

Hopefully, however, it is just good enough to be useful to someone. Help yourself and let me know if there are any problems. Thanks to Vedran Sego for many useful comments and suggestions.

- github repository for the Python NCM module, nearest_correlation
- Nick Higham’s original MATLAB code.
- NAG’s commercial implementation – callable from C, Fortran, MATLAB, Python and more. A superb implementation that is significantly faster and more robust than this one!

I’ve been a user and supporter of the commercial numerical libraries from NAG for several years now, using them in MATLAB, Fortran, C and Python among others. They recently updated their C library to version 24 which now includes 1516 functions apparently.

NAG’s routines are fast, accurate, extremely well supported and often based on cutting edge numerical research (I know this because academics at my University, The University of Manchester, is responsible for some of said research). I often use functions from the NAG C library in MATLAB mex routines in order to help speed up researcher’s code.

Here’s some of the new functionality in Mark 24. A full list of functions is available at http://www.nag.co.uk/numeric/CL/nagdoc_cl24/html/FRONTMATTER/manconts.html

• Hypergeometric function (1f1 and 2f1)

• Nearest Correlation Matrix

• Elementwise weighted nearest correlation matrix

• Wavelet Transforms & FFTs

* —Three dimensional discrete single level and multi-level wavelet transforms*

*• Matrix Functions*

*—*Fast Fourier Transforms (FFTs) for two dimensional and three dimensional real data

*—*Matrix square roots and general powers*• Interpolation*

*—*Matrix exponentials (Schur-Parlett)*—*Fréchet Derivative*—*Calculation of condition numbers*— Interpolation for 5D and higher dimensions*

• Optimization

*—*Local optimization: Non-negative least squares*• Random Number Generatorss*

*—*Global optimization: Multi-start versions of general nonlinear programming and least squares routines*— Brownian bridge and random fields*

• Statistics

*— Gaussian mixture model*

*— Best subsets of given size (branch and bound )**— Vectorized probabilities and probability density functions of distributions**— Inhomogeneous time series analysis, moving averages**— Routines that combines two sums of squares matrices to allow large datasets to be summarised*• Data fitting

*—*Fit of 2D scattered data by two-stage approximation (suitable for large datasets)*• Quadrature*

*— 1D adaptive for badly-behaved integrals*

• Sparse eigenproblem

*— Driver for real general matrix, driver for banded complex eigenproblem*

*— Real and complex quadratic eigenvalue problems*• Sparse linear systems

*— block diagonal pre-conditioners and solvers*

• ODE solvers

*— Threadsafe initial value ODE solvers*

• Volatility

*— Heston model with term structure*

In a recent Stack Overflow query, someone asked if you could switch off the balancing step when calculating eigenvalues in Python. In the document A case where balancing is harmful, David S. Watkins describes the balancing step as* ‘the input matrix A is replaced by a rescaled matrix A* = D ^{-1}AD, where D is a diagonal matrix chosen so that, for each i, the ith row and the ith column of A* have roughly the same norm.’ *

Such balancing is usually very useful and so is performed by default by software such as MATLAB or Numpy. There are times, however, when one would like to switch it off.

In MATLAB, this is easy and the following is taken from the online MATLAB documentation

A = [ 3.0 -2.0 -0.9 2*eps; -2.0 4.0 1.0 -eps; -eps/4 eps/2 -1.0 0; -0.5 -0.5 0.1 1.0]; [VN,DN] = eig(A,'nobalance') VN = 0.6153 -0.4176 -0.0000 -0.1528 -0.7881 -0.3261 0 0.1345 -0.0000 -0.0000 -0.0000 -0.9781 0.0189 0.8481 -1.0000 0.0443 DN = 5.5616 0 0 0 0 1.4384 0 0 0 0 1.0000 0 0 0 0 -1.0000

At the time of writing, it is not possible to directly do this in Numpy (as far as I know at least). Numpy’s eig command currently uses the LAPACK routine DGEEV to do the heavy lifting for double precision matrices. We can see this by looking at the source code of numpy.linalg.eig where the relevant subsection is

lapack_routine = lapack_lite.dgeev wr = zeros((n,), t) wi = zeros((n,), t) vr = zeros((n, n), t) lwork = 1 work = zeros((lwork,), t) results = lapack_routine(_N, _V, n, a, n, wr, wi, dummy, 1, vr, n, work, -1, 0) lwork = int(work[0]) work = zeros((lwork,), t) results = lapack_routine(_N, _V, n, a, n, wr, wi, dummy, 1, vr, n, work, lwork, 0)

My plan was to figure out how to tell DGEEV not to perform the balancing step and I’d be done. Sadly, however, it turns out that this is not possible. Taking a look at the reference implementation of DGEEV, we can see that the balancing step is **always performed **and is not user controllable–here’s the relevant bit of Fortran

* Balance the matrix * (Workspace: need N) * IBAL = 1 CALL DGEBAL( 'B', N, A, LDA, ILO, IHI, WORK( IBAL ), IERR )

So, using DGEEV is a dead-end unless we are willing to modifiy and recompile the lapack source — something that’s rarely a good idea in my experience. There is another LAPACK routine that is of use, however, in the form of DGEEVX that allows us to control balancing. Unfortunately, this routine is not part of the **numpy.linalg.lapack_lite** interface provided by Numpy and I’ve yet to figure out how to add extra routines to it.

I’ve also discovered that this functionality is an open feature request in Numpy.

**Enter the NAG Library**

My University has a site license for the commercial Numerical Algorithms Group (NAG) library. Among other things, NAG offers an interface to all of LAPACK along with an interface to Python. So, I go through the installation and do

import numpy as np from ctypes import * from nag4py.util import Nag_RowMajor,Nag_NoBalancing,Nag_NotLeftVecs,Nag_RightVecs,Nag_RCondEigVecs,Integer,NagError,INIT_FAIL from nag4py.f08 import nag_dgeevx eps = np.spacing(1) np.set_printoptions(precision=4,suppress=True) def unbalanced_eig(A): """ Compute the eigenvalues and right eigenvectors of a square array using DGEEVX via the NAG library. Requires the NAG C library and NAG's Python wrappers http://www.nag.co.uk/python.asp The balancing step that's performed in DGEEV is not performed here. As such, this function is the same as the MATLAB command eig(A,'nobalance') Parameters ---------- A : (M, M) Numpy array A square array of real elements. On exit: A is overwritten and contains the real Schur form of the balanced version of the input matrix . Returns ------- w : (M,) ndarray The eigenvalues v : (M, M) ndarray The eigenvectors Author: Mike Croucher (www.walkingrandomly.com) Testing has been mimimal """ order = Nag_RowMajor balanc = Nag_NoBalancing jobvl = Nag_NotLeftVecs jobvr = Nag_RightVecs sense = Nag_RCondEigVecs n = A.shape[0] pda = n pdvl = 1 wr = np.zeros(n) wi = np.zeros(n) vl=np.zeros(1); pdvr = n vr = np.zeros(pdvr*n) ilo=c_long(0) ihi=c_long(0) scale = np.zeros(n) abnrm = c_double(0) rconde = np.zeros(n) rcondv = np.zeros(n) fail = NagError() INIT_FAIL(fail) nag_dgeevx(order,balanc,jobvl,jobvr,sense, n, A.ctypes.data_as(POINTER(c_double)), pda, wr.ctypes.data_as(POINTER(c_double)), wi.ctypes.data_as(POINTER(c_double)),vl.ctypes.data_as(POINTER(c_double)),pdvl, vr.ctypes.data_as(POINTER(c_double)),pdvr,ilo,ihi, scale.ctypes.data_as(POINTER(c_double)), abnrm, rconde.ctypes.data_as(POINTER(c_double)),rcondv.ctypes.data_as(POINTER(c_double)),fail) if all(wi == 0.0): w = wr v = vr.reshape(n,n) else: w = wr+1j*wi v = array(vr, w.dtype).reshape(n,n) return(w,v)

Define a test matrix:

A = np.array([[3.0,-2.0,-0.9,2*eps], [-2.0,4.0,1.0,-eps], [-eps/4,eps/2,-1.0,0], [-0.5,-0.5,0.1,1.0]])

Do the calculation

(w,v) = unbalanced_eig(A)

which gives

(array([ 5.5616, 1.4384, 1. , -1. ]), array([[ 0.6153, -0.4176, -0. , -0.1528], [-0.7881, -0.3261, 0. , 0.1345], [-0. , -0. , -0. , -0.9781], [ 0.0189, 0.8481, -1. , 0.0443]]))

This is exactly what you get by running the MATLAB command **eig(A,’nobalance’).**

Note that unbalanced_eig(A) **changes the input matri**x A to

array([[ 5.5616, -0.0662, 0.0571, 1.3399], [ 0. , 1.4384, 0.7017, -0.1561], [ 0. , 0. , 1. , -0.0132], [ 0. , 0. , 0. , -1. ]])

According to the NAG documentation, this is the real Schur form of the balanced version of the input matrix. I can’t see how to ask NAG to not do this. I guess that if it’s not what you want unbalanced_eig() to do, you’ll need to pass a copy of the input matrix to NAG.

**The IPython notebook**

The code for this article is available as an IPython Notebook

**The future**

This blog post was written using Numpy version 1.7.1. There is an enhancement request for the functionality discussed in this article open in Numpy’s git repo and so I expect this article to become redundant pretty soon.

The Numerical Algorithms Group (NAG) are principally known for their numerical library but they also offer products such as a MATLAB toolbox and a Fortran compiler. My employer, The University of Manchester, has a full site license for most of NAG’s stuff where it is heavily used by both our students and researchers.

While at a recent software conference, I saw a talk by NAG’s David Sayers where he demonstrated some of the features of the NAG Fortran Compiler. During this talk he showed some examples of broken Fortran and asked us if we could spot how they were broken without compiler assistance. I enjoyed the talk and so asked David if he would mind writing a guest blog post on the subject for WalkingRandomly. He duly obliged.

** This is a guest blog post by David Sayers of NAG.**

What do you want from your Fortran compiler? Some people ask for extra (non-standard) features, others require very fast execution speed. The very latest extensions to the Fortran language appeal to those who like to be up to date with their code.

I suspect that very few would put **enforcement of the Fortran standard** at the top of their list, yet this essential if problems are to be avoided in the future. Code written specifically for one compiler is unlikely to work when computers change, or may contain errors that appear only intermittently. Without access to at least one good checking compiler, the developer or support desk will be lacking a valuable tool in the fight against faulty code.

The NAG Fortran compiler is such a tool. It is used extensively by NAG’s own staff to validate their library code and to answer user-support queries involving user’s Fortran programs. It is available on Windows, where it has its own IDE called Fortran Builder, and on Unix platforms and Mac OS X.

Windows users also have the benefit of some Fortran Tools bundled in to the IDE. Particularly nice is the Fortran polisher which tidies up the presentation of your source files according to user-specified preferences.

The compiler includes most Fortran 2003 features, very many Fortran 2008 features and the most commonly used features of OpenMP 3.0 are supported.

The principal developer of the compiler is Malcolm Cohen, co-author of the book, Modern Fortran Explained along with Michael Metcalf and John Reid. Malcolm has been a member of the international working group on Fortran, ISO/IEC JTC1/SC22/WG5, since 1988, and the USA technical subcommittee on Fortran, J3, since 1994. He has been head of the J3 /DATA subgroup since 1998 and was responsible for the design and development of the object-oriented features in Fortran 2003. Since 2005 he has been Project Editor for the ISO/IEC Fortran standard, which has continued its evolution with the publication of the Fortran 2008 standard in 2010.

Of all people Malcolm Cohen should know Fortran and the way the standard should be enforced!

His compiler reflects that knowledge and is designed to assist the programmer to detect how programs might be faulty due to a departure from the Fortran standard or prone to trigger a run time error. In either case the diagnostics of produced by the compiler are clear and helpful and can save the developer many hours of laborious bug-tracing. Here are some particularly simple examples of faulty programs. See if you can spot the mistakes, and think how difficult these might be to detect in programs that may be thousands of times longer:

**Example 1**

Program test Real, Pointer :: x(:, :) Call make_dangle x(10, 10) = 0 Contains Subroutine make_dangle Real, Target :: y(100, 200) x => y End Subroutine make_dangle End Program test

**Example 2**

Program dangle2 Real,Pointer :: x(:),y(:) Allocate(x(100)) y => x Deallocate(x) y = 3 End

**Example 3**

program more integer n, i real r, s equivalence (n,r) i=3 r=2.5 i=n*n write(6,900) i, r 900 format(' i = ', i5, ' r = ', f10.4) stop 'ok' end

**Example 4**

program trouble1 integer n parameter (n=11) integer iarray(n) integer i do 10 i=1,10 iarray(i) = i 10 continue write(6,900) iarray 900 format(' iarray = ',11i5) stop 'ok' end

And finally if this is all too easy …

**Example 5**

! E04UCA Example Program Text ! Mark 23 Release. NAG Copyright 2011. MODULE e04ucae_mod ! E04UCA Example Program Module: ! Parameters and User-defined Routines ! .. Use Statements .. USE nag_library, ONLY : nag_wp ! .. Implicit None Statement .. IMPLICIT NONE ! .. Parameters .. REAL (KIND=nag_wp), PARAMETER :: one = 1.0_nag_wp REAL (KIND=nag_wp), PARAMETER :: zero = 0.0_nag_wp INTEGER, PARAMETER :: inc1 = 1, lcwsav = 1, & liwsav = 610, llwsav = 120, & lrwsav = 475, nin = 5, nout = 6 CONTAINS SUBROUTINE objfun(mode,n,x,objf,objgrd,nstate,iuser,ruser) ! Routine to evaluate objective function and its 1st derivatives. ! .. Implicit None Statement .. IMPLICIT NONE ! .. Scalar Arguments .. REAL (KIND=nag_wp), INTENT (OUT) :: objf INTEGER, INTENT (INOUT) :: mode INTEGER, INTENT (IN) :: n, nstate ! .. Array Arguments .. REAL (KIND=nag_wp), INTENT (INOUT) :: objgrd(n), ruser(*) REAL (KIND=nag_wp), INTENT (IN) :: x(n) INTEGER, INTENT (INOUT) :: iuser(*) ! .. Executable Statements .. IF (mode==0 .OR. mode==2) THEN objf = x(1)*x(4)*(x(1)+x(2)+x(3)) + x(3) END IF IF (mode==1 .OR. mode==2) THEN objgrd(1) = x(4)*(x(1)+x(1)+x(2)+x(3)) objgrd(2) = x(1)*x(4) objgrd(3) = x(1)*x(4) + one objgrd(4) = x(1)*(x(1)+x(2)+x(3)) END IF RETURN END SUBROUTINE objfun SUBROUTINE confun(mode,ncnln,n,ldcj,needc,x,c,cjac,nstate,iuser,ruser) ! Routine to evaluate the nonlinear constraints and their 1st ! derivatives. ! .. Implicit None Statement .. IMPLICIT NONE ! .. Scalar Arguments .. INTEGER, INTENT (IN) :: ldcj, n, ncnln, nstate INTEGER, INTENT (INOUT) :: mode ! .. Array Arguments .. REAL (KIND=nag_wp), INTENT (OUT) :: c(ncnln) REAL (KIND=nag_wp), INTENT (INOUT) :: cjac(ldcj,n), ruser(*) REAL (KIND=nag_wp), INTENT (IN) :: x(n) INTEGER, INTENT (INOUT) :: iuser(*) INTEGER, INTENT (IN) :: needc(ncnln) ! .. Executable Statements .. IF (nstate==1) THEN ! First call to CONFUN. Set all Jacobian elements to zero. ! Note that this will only work when 'Derivative Level = 3' ! (the default; see Section 11.2). cjac(1:ncnln,1:n) = zero END IF IF (needc(1)>0) THEN IF (mode==0 .OR. mode==2) THEN c(1) = x(1)**2 + x(2)**2 + x(3)**2 + x(4)**2 END IF IF (mode==1 .OR. mode==2) THEN cjac(1,1) = x(1) + x(1) cjac(1,2) = x(2) + x(2) cjac(1,3) = x(3) + x(3) cjac(1,4) = x(4) + x(4) END IF END IF IF (needc(2)>0) THEN IF (mode==0 .OR. mode==2) THEN c(2) = x(1)*x(2)*x(3)*x(4) END IF IF (mode==1 .OR. mode==2) THEN cjac(2,1) = x(2)*x(3)*x(4) cjac(2,2) = x(1)*x(3)*x(4) cjac(2,3) = x(1)*x(2)*x(4) cjac(2,4) = x(1)*x(2)*x(3) END IF END IF RETURN END SUBROUTINE confun END MODULE e04ucae_mod PROGRAM e04ucae ! E04UCA Example Main Program ! .. Use Statements .. USE nag_library, ONLY : dgemv, e04uca, e04wbf, nag_wp USE e04ucae_mod, ONLY : confun, inc1, lcwsav, liwsav, llwsav, lrwsav, & nin, nout, objfun, one, zero ! .. Implicit None Statement .. ! IMPLICIT NONE ! .. Local Scalars .. ! REAL (KIND=nag_wp) :: objf INTEGER :: i, ifail, iter, j, lda, ldcj, & ldr, liwork, lwork, n, nclin, & ncnln, sda, sdcjac ! .. Local Arrays .. REAL (KIND=nag_wp), ALLOCATABLE :: a(:,:), bl(:), bu(:), c(:), & cjac(:,:), clamda(:), objgrd(:), & r(:,:), work(:), x(:) REAL (KIND=nag_wp) :: ruser(1), rwsav(lrwsav) INTEGER, ALLOCATABLE :: istate(:), iwork(:) INTEGER :: iuser(1), iwsav(liwsav) LOGICAL :: lwsav(llwsav) CHARACTER (80) :: cwsav(lcwsav) ! .. Intrinsic Functions .. INTRINSIC max ! .. Executable Statements .. WRITE (nout,*) 'E04UCA Example Program Results' ! Skip heading in data file READ (nin,*) READ (nin,*) n, nclin, ncnln liwork = 3*n + nclin + 2*ncnln lda = max(1,nclin) IF (nclin>0) THEN sda = n ELSE sda = 1 END IF ldcj = max(1,ncnln) IF (ncnln>0) THEN sdcjac = n ELSE sdcjac = 1 END IF ldr = n IF (ncnln==0 .AND. nclin>0) THEN lwork = 2*n**2 + 20*n + 11*nclin ELSE IF (ncnln>0 .AND. nclin>=0) THEN lwork = 2*n**2 + n*nclin + 2*n*ncnln + 20*n + 11*nclin + 21*ncnln ELSE lwork = 20*n END IF ALLOCATE (istate(n+nclin+ncnln),iwork(liwork),a(lda,sda), & bl(n+nclin+ncnln),bu(n+nclin+ncnln),c(max(1, & ncnln)),cjac(ldcj,sdcjac),clamda(n+nclin+ncnln),objgrd(n),r(ldr,n), & x(n),work(lwork)) IF (nclin>0) THEN READ (nin,*) (a(i,1:sda),i=1,nclin) END IF READ (nin,*) bl(1:(n+nclin+ncnln)) READ (nin,*) bu(1:(n+nclin+ncnln)) READ (nin,*) x(1:n) ! Initialise E04UCA ifail = 0 CALL e04wbf('E04UCA',cwsav,lcwsav,lwsav,llwsav,iwsav,liwsav,rwsav, & lrwsav,ifail) ! Solve the problem ifail = -1 CALL e04uca(n,nclin,ncnln,lda,ldcj,ldr,a,bl,bu,confun,objfun,iter, & istate,c,cjac,clamda,objf,objgrd,r,x,iwork,liwork,work,lwork,iuser, & ruser,lwsav,iwsav,rwsav,ifail) SELECT CASE (ifail) CASE (0:6,8) WRITE (nout,*) WRITE (nout,99999) WRITE (nout,*) DO i = 1, n WRITE (nout,99998) i, istate(i), x(i), clamda(i) END DO IF (nclin>0) THEN ! A*x --> work. ! The NAG name equivalent of dgemv is f06paf CALL dgemv('N',nclin,n,one,a,lda,x,inc1,zero,work,inc1) WRITE (nout,*) WRITE (nout,*) WRITE (nout,99997) WRITE (nout,*) DO i = n + 1, n + nclin j = i - n WRITE (nout,99996) j, istate(i), work(j), clamda(i) END DO END IF IF (ncnln>0) THEN WRITE (nout,*) WRITE (nout,*) WRITE (nout,99995) WRITE (nout,*) DO i = n + nclin + 1, n + nclin + ncnln j = i - n - nclin WRITE (nout,99994) j, istate(i), c(j), clamda(i) END DO END IF WRITE (nout,*) WRITE (nout,*) WRITE (nout,99993) objf END SELECT 99999 FORMAT (1X,'Varbl',2X,'Istate',3X,'Value',9X,'Lagr Mult') 99998 FORMAT (1X,'V',2(1X,I3),4X,1P,G14.6,2X,1P,G12.4) 99997 FORMAT (1X,'L Con',2X,'Istate',3X,'Value',9X,'Lagr Mult') 99996 FORMAT (1X,'L',2(1X,I3),4X,1P,G14.6,2X,1P,G12.4) 99995 FORMAT (1X,'N Con',2X,'Istate',3X,'Value',9X,'Lagr Mult') 99994 FORMAT (1X,'N',2(1X,I3),4X,1P,G14.6,2X,1P,G12.4) 99993 FORMAT (1X,'Final objective value = ',1P,G15.7) END PROGRAM e04ucae

Answers to this particular New Year quiz will be posted in a future blog post.

A MATLAB user at Manchester University contacted me recently asking about Black-Scholes option pricing. The MATLAB Financial Toolbox has a range of functions that can calculate Black-Scholes put and call option prices along with several of the sensitivities (or ‘greeks‘) such as blsprice, blsdelta and so on.

The user’s problem is that we don’t have any site-wide licenses for the Financial Toolbox. We do, however, have a full site license for the NAG Toolbox for MATLAB which has a nice set of option pricing routines. Even though they calculate the same things, NAG Toolbox option pricing functions look very different to the Financial Toolbox ones and so I felt that a Rosetta Stone type article might be useful.

For Black-Scholes option pricing, there are three main differences between the two systems:

- The Financial Toolbox has separate functions for calculating the option price and each greek (e.g. blsprice, blsgamma, blsdelta etc) whereas NAG calculates the price and all greeks simultaneously with a single function call.
- Where appropriate, The MATLAB functions calculate Put and Call values with one function call whereas with NAG you need to explicitly specify Call or Put.
- NAG calculates more greeks than MATLAB.

The following code example pretty much says it all. Any variable calculated with the NAG Toolbox is prefixed NAG_ whereas anything calculated with the financial toolbox is prefixed MW_. When I developed this, I was using MATLAB 2012a with NAG Toolbox Mark 22.

%Input parameters for both NAG and MATLAB. Price=50; Strike=40; Rate=0.1; Time=0.25; Volatility=0.3; Yield=0; %calculate all greeks for a put using NAG [NAG_Put, NAG_PutDelta, NAG_Gamma, NAG_Vega, NAG_PutTheta, NAG_PutRho, NAG_PutCrho, NAG_PutVanna,... NAG_PutCharm, NAG_PutSpeed, NAG_PutColour, NAG_PutZomma,NAG_PutVomma, ifail] =... s30ab('p', Strike, Price, Time, Volatility, Rate, Yield); %calculate all greeks for a Call using NAG [NAG_Call, NAG_CallDelta, NAG_Gamma, NAG_Vega, NAG_CallTheta, NAG_CallRho, NAG_CallCrho, NAG_CallVanna,... NAG_CallCharm, NAG_CallSpeed, NAG_CallColour,NAG_CallZomma, NAG_CallVomma, ifail] = ... s30ab('c', Strike, Price, Time, Volatility, Rate, Yield); %Calculate the Elasticity (Lambda) NAG_CallLambda = Price/NAG_Call*NAG_CallDelta; NAG_PutLambda = Price/NAG_Put*NAG_PutDelta; %Calculate the same set of prices and greeks using the MATLAB Finance Toolbox [MW_Call, MW_Put] = blsprice(Price, Strike, Rate, Time, Volatility, Yield); [MW_CallDelta, MW_PutDelta] = blsdelta(Price, Strike, Rate, Time, Volatility, Yield); MW_Gamma = blsgamma(Price, Strike, Rate, Time, Volatility, Yield); MW_Vega = blsvega(Price, Strike, Rate, Time, Volatility, Yield); [MW_CallTheta, MW_PutTheta] = blstheta(Price, Strike, Rate, Time,Volatility, Yield); [MW_CallRho, MW_PutRho]= blsrho(Price, Strike, Rate, Time, Volatility,Yield); [MW_CallLambda,MW_PutLambda]=blslambda(Price, Strike, Rate, Time, Volatility,Yield);

Note that NAG doesn’t output the elasticity (Lambda) directly but it is trivial to obtain it using values that it does output. Also note that as far as I can tell, NAG outputs more Greeks than the Financial Toolbox does.

I’m not going to show the entire output of the above program because there are a lot of numbers. However, here are the Put values as calculated by NAG shown to 4 decimal places. I have checked and they agree with the Financial Toolbox to within numerical noise.

NAG_Put =0.1350 NAG_PutDelta =-0.0419 NAG_PutLambda =-15.5066 NAG_Gamma =0.0119 NAG_Vega =2.2361 NAG_PutTheta =-1.1187 NAG_PutRho =-0.5572 NAG_PutCrho = -0.5235 NAG_PutVanna =-0.4709 NAG_PutCharm =0.2229 NAG_PutSpeed =-0.0030 NAG_PutColour =-0.0275 NAG_PutZomma =0.0688 NAG_PutVomma =20.3560

I recently installed MATLAB 2012a on a Windows machine along with a certain set of standard Mathworks toolboxes. In addition, I also installed the excellent NAG Toolbox for MATLAB which is standard practice at my University.

I later realised that I had not installed all of the Mathworks toolboxes I needed so I fired up the MATLAB installer again and asked it to add the missing toolboxes. This extra installation never completed, however, and gave me the error message

The application encountered an unexpected error and needed to close. You may want to try re-installing your product(s). More infomation can be found at C:\path_to_a_log_file

I took a look at the log file mentioned which revealed a huge java error that began with

java.util.concurrent.ExecutionException: java.lang.StringIndexOutOfBoundsException: String index out of range: -2 at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source) at java.util.concurrent.FutureTask.get(Unknown Source) at javax.swing.SwingWorker.get(Unknown Source) at com.mathworks.wizard.worker.WorkerImpl.done(WorkerImpl.java:33) at javax.swing.SwingWorker$5.run(Unknown Source)

A little mucking around revealed that the installer was unhappy with the pathdef.m file at **C:\Program Files\MATLAB\R2012a\toolbox\local\pathdef.m **

The installer for the NAG Toolbox modifies this file by adding the line

'C:\Program Files\MATLAB\R2012a\toolbox\NAG\mex.w64;' ...

near the beginning and the lines

'C:\Program Files\MATLAB\R2012a\help\toolbox\NAG;' ... 'C:\Program Files\MATLAB\R2012a\help\toolbox\NAGToolboxDemos;' ...

Near the end and it seems that the MATLAB installer really doesn’t like this. So, what you do is create a copy of this pathdef.m file (pathdef.m.old for example) and then remove the non-mathworks lines in pathdef.m. Now you can install the extra Mathworks toolboxes you want. Once the installer has finished its work you can re-add the non-mathworks lines back into pathdef.m using your copy as a guide.

I’ll be informing both NAG and The Mathworks about this particular issue but wanted to get this post out there as soon as possible to provide a workaround since at least one other person has hit this problem at my University and I doubt that he will be the last (It’s also going to make SCCM deployment of MATLAB a pain but that’s another story).

**Update**

The Mathworks technical support have sent me a better workaround than the one detailed above. What you need to do is to change

'C:\Program Files\MATLAB\R2012a\toolbox\NAG\mex.w64;' ...

to

'C:\Program Files\MATLAB\R2012a\toolbox\NAG\mex.w64;', ...

The Mathworks installer is unhappy about the missing comma.

The NAG C Library is one of the largest commercial collections of numerical software currently available and I often find it very useful when writing MATLAB mex files. “*Why is that?*” I hear you ask.

One of the main reasons for writing a mex file is to gain more speed over native MATLAB. However, one of the main problems with writing mex files is that you have to do it in a low level language such as Fortran or C and so you lose much of the ease of use of MATLAB.

In particular, you lose straightforward access to most of the massive collections of MATLAB routines that you take for granted. Technically speaking that’s a lie because you could use the mex function **mexCallMATLAB** to call a MATLAB routine from within your mex file but then you’ll be paying a time overhead every time you go in and out of the mex interface. Since you are going down the mex route in order to gain speed, this doesn’t seem like the best idea in the world. This is also the reason why you’d use the NAG C Library and not the NAG Toolbox for MATLAB when writing mex functions.

One way out that I use often is to take advantage of the NAG C library and it turns out that it is extremely easy to add the NAG C library to your mex projects on Windows. Let’s look at a trivial example. The following code, nag_normcdf.c, uses the NAG function **nag_cumul_normal **to produce a simplified version of MATLAB’s normcdf function (laziness is all that prevented me from implementing a full replacement).

/* A simplified version of normcdf that uses the NAG C library * Written to demonstrate how to compile MATLAB mex files that use the NAG C Library * Only returns a normcdf where mu=0 and sigma=1 * October 2011 Michael Croucher * www.walkingrandomly.com */ #include <math.h> #include "mex.h" #include "nag.h" #include "nags.h" void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[] ) { double *in,*out; int rows,cols,num_elements,i; if(nrhs>1) { mexErrMsgIdAndTxt("NAG:BadArgs","This simplified version of normcdf only takes 1 input argument"); } /*Get pointers to input matrix*/ in = mxGetPr(prhs[0]); /*Get rows and columns of input matrix*/ rows = mxGetM(prhs[0]); cols = mxGetN(prhs[0]); num_elements = rows*cols; /* Create output matrix */ plhs[0] = mxCreateDoubleMatrix(rows, cols, mxREAL); /* Assign pointer to the output */ out = mxGetPr(plhs[0]); for(i=0; i<num_elements; i++){ out[i] = nag_cumul_normal(in[i]); } }

To compile this in MATLAB, just use the following command

mex nag_normcdf.c CLW6I09DA_nag.lib

If your system is set up the same as mine then the above should ‘just work’ (see **System Information** at the bottom of this post). The new function works just as you would expect it to

>> format long >> format compact >> nag_normcdf(1) ans = 0.841344746068543

Compare the result to normcdf from the statistics toolbox

>> normcdf(1) ans = 0.841344746068543

So far so good. I could stop the post here since all I really wanted to do was say **‘The NAG C library is useful for MATLAB mex functions and it’s a doddle to use – here’s a toy example and here’s the mex command to compile it’**

However, out of curiosity, I looked to see if my toy version of normcdf was any faster than The Mathworks’ version. Let there be 10 million numbers:

>> x=rand(1,10000000);

Let’s time how long it takes MATLAB to take the normcdf of those numbers

>> tic;y=normcdf(x);toc Elapsed time is 0.445883 seconds. >> tic;y=normcdf(x);toc Elapsed time is 0.405764 seconds. >> tic;y=normcdf(x);toc Elapsed time is 0.366708 seconds. >> tic;y=normcdf(x);toc Elapsed time is 0.409375 seconds.

Now let’s look at my toy-version that uses NAG.

>> tic;y=nag_normcdf(x);toc Elapsed time is 0.544642 seconds. >> tic;y=nag_normcdf(x);toc Elapsed time is 0.556883 seconds. >> tic;y=nag_normcdf(x);toc Elapsed time is 0.553920 seconds. >> tic;y=nag_normcdf(x);toc Elapsed time is 0.540510 seconds.

So my version is slower! Never mind, I’ll just make my version parallel using OpenMP – Here is the code: nag_normcdf_openmp.c

/* A simplified version of normcdf that uses the NAG C library * Written to demonstrate how to compile MATLAB mex files that use the NAG C Library * Only returns a normcdf where mu=0 and sigma=1 * October 2011 Michael Croucher * www.walkingrandomly.com */ #include <math.h> #include "mex.h" #include "nag.h" #include "nags.h" #include <omp.h> void do_calculation(double in[],double out[],int num_elements) { int i,tid; #pragma omp parallel for shared(in,out,num_elements) private(i,tid) for(i=0; i<num_elements; i++){ out[i] = nag_cumul_normal(in[i]); } } void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[] ) { double *in,*out; int rows,cols,num_elements; if(nrhs>1) { mexErrMsgIdAndTxt("NAG_NORMCDF:BadArgs","This simplified version of normcdf only takes 1 input argument"); } /*Get pointers to input matrix*/ in = mxGetPr(prhs[0]); /*Get rows and columns of input matrix*/ rows = mxGetM(prhs[0]); cols = mxGetN(prhs[0]); num_elements = rows*cols; /* Create output matrix */ plhs[0] = mxCreateDoubleMatrix(rows, cols, mxREAL); /* Assign pointer to the output */ out = mxGetPr(plhs[0]); do_calculation(in,out,num_elements); }

Compile that with

mex COMPFLAGS="$COMPFLAGS /openmp" nag_normcdf_openmp.c CLW6I09DA_nag.lib

and on my quad-core machine I get the following timings

>> tic;y=nag_normcdf_openmp(x);toc Elapsed time is 0.237925 seconds. >> tic;y=nag_normcdf_openmp(x);toc Elapsed time is 0.197531 seconds. >> tic;y=nag_normcdf_openmp(x);toc Elapsed time is 0.206511 seconds. >> tic;y=nag_normcdf_openmp(x);toc Elapsed time is 0.211416 seconds.

This is faster than MATLAB and so normal service is resumed :)

**System Information**

- 64bit Windows 7
- MATLAB 2011b
- NAG C Library Mark 9 – CLW6I09DAL
- Visual Studio 2008
- Intel Core i7-2630QM processor

MATLAB’s lsqcurvefit function is a very useful piece of code that will help you solve non-linear least squares curve fitting problems and it is used a lot by researchers at my workplace, The University of Manchester. As far as we are concerned it has two problems:

- lsqcurvefit is part of the Optimisation toolbox and, since we only have a limited number of licenses for that toolbox, the function is sometimes inaccessible. When we run out of licenses on campus users get the following error message

??? License checkout failed. License Manager Error -4 Maximum number of users for Optimization_Toolbox reached. Try again later. To see a list of current users use the lmstat utility or contact your License Administrator.

- lsqcurvefit is written as an .m file and so isn’t as fast as it could be.

One solution to these problems is to switch to the NAG Toolbox for MATLAB. Since we have a full site license for this product, we never run out of licenses and, since it is written in compiled Fortran, it is sometimes a lot faster than MATLAB itself. However, the current version of the NAG Toolbox (Mark 22 at the time of writing) isn’t without its issues either:

- There is no direct equivalent to lsqcurvefit in the NAG toolbox. You have to use NAG’s equivalent of lsqnonlin instead (which NAG call e04fy)
- The current version of the NAG Toolbox, Mark 22, doesn’t support function handles.
- The NAG toolbox requires the use of the datatypes
**int32**and**int64**depending on the architecture of your machine. The practical upshot of this is that your MATLAB code is suddenly a lot less portable. If you develop on a 64 bit machine then it will need modifying to run on a 32 bit machine and vice-versa.

While working on optimising someone’s code a few months ago I put together a couple of .m files in an attempt to address these issues. My intent was to improve the functionality of these files and eventually publish them but I never seemed to find the time. However, they have turned out to be a minor hit and I’ve sent them out to researcher after researcher with the caveat **“These are a first draft, I need to tidy them up sometime”** only to get the reply “Thanks for that, I no longer have any license problems and my code is executing more quickly.” They may be simple but it seems that they are useful.

So, until I find the time to add more functionality, here are my little wrappers that allow you to do non linear least squares fitting using the NAG Toolbox for MATLAB. They are about as minimal as you can get but they work and have proven to be useful time and time again. They also solve all 3 of the NAG issues mentioned above.

To use them just put them **both** somewhere on your MATLAB path. The syntax is identical to lsqcurvefit but this first version of my wrapper doesn’t do anything more complicated than the following

MATLAB:

[x,resnorm] = lsqcurvefit(@myfun,x0,xdata,ydata);

NAG with my wrapper:

[x,resnorm] = nag_lsqcurvefit(@myfun,x0,xdata,ydata);

I may add extra functionality in the future but it depends upon demand.

** **

**Performance**

Let’s look at the example given on the MATLAB help page for lsqcurvefit and compare it to the NAG version. First create a file called myfun.m as follows

function F = myfun(x,xdata) F = x(1)*exp(x(2)*xdata);

Now create the data and call the MATLAB fitting function

% Assume you determined xdata and ydata experimentally xdata = [0.9 1.5 13.8 19.8 24.1 28.2 35.2 60.3 74.6 81.3]; ydata = [455.2 428.6 124.1 67.3 43.2 28.1 13.1 -0.4 -1.3 -1.5]; x0 = [100; -1] % Starting guess [x,resnorm] = lsqcurvefit(@myfun,x0,xdata,ydata);

On my system I get the following timing for MATLAB (typical over around 10 runs)

tic;[x,resnorm] = lsqcurvefit(@myfun,x0,xdata,ydata); toc Elapsed time is 0.075685 seconds.

with the following results

x = 1.0e+02 * 4.988308584891165 -0.001012568612537 >> resnorm resnorm = 9.504886892389219

and for NAG using my wrapper function I get the following timing (typical over around 10 runs)

tic;[x,resnorm] = nag_lsqcurvefit(@myfun,x0,xdata,ydata); toc Elapsed time is 0.008163 seconds.

So, for this example, **NAG is around 9 times faster!** The results agree with MATLAB to several decimal places

x = 1.0e+02 * 4.988308605396028 -0.001012568632465 >> resnorm resnorm = 9.504886892366873

In the real world I find that the relative timings vary enormously and have seen speed-ups that range from a factor of 14 down to none at all. Whenever I am optimisng MATLAB code and see a lsqcurvefit function I always give the NAG version a quick try and am often impressed with the results.

My system specs if you want to compare results: 64bit Linux on a Intel Core 2 Quad Q9650 @ 3.00GHz running MATLAB 2010b and NAG Toolbox version MBL6A22DJL

**More articles about NAG
**