Using NAG Toolbox to speed up MATLAB’s fsolve part 2

October 5th, 2010 | Categories: matlab, NAG Library, programming | Tags:

Not long after publishing my article, A faster version of MATLAB’s fsolve using the NAG Toolbox for MATLAB, I was contacted by Biao Guo of Quantitative Finance Collector who asked me what I could do with the following piece of code.

function MarkowitzWeight = fsolve_markowitz_MATLAB()

RandStream.setDefaultStream (RandStream('mt19937ar','seed',1));
% simulate 500 stocks return
SimR = randn(1000,500);
r = mean(SimR); % use mean return as expected return
targetR = 0.02;
rCOV = cov(SimR); % covariance matrix
NumAsset = length(r);

options=optimset('Display','off');
startX = [1/NumAsset*ones(1, NumAsset), 1, 1];
x = fsolve(@fsolve_markowitz,startX,options, r, targetR, rCOV);
MarkowitzWeight = x(1:NumAsset);
end

function F = fsolve_markowitz(x, r, targetR, rCOV)
NumAsset = length(r);
F = zeros(1,NumAsset+2);
weight = x(1:NumAsset); % for asset weights
lambda = x(NumAsset+1);
mu = x(NumAsset+2);
for i = 1:NumAsset
    F(i) = weight(i)*sum(rCOV(i,:))-lambda*r(i)-mu;
end
F(NumAsset+1) = sum(weight.*r)-targetR;
F(NumAsset+2) = sum(weight)-1;
end

This is an example from financial mathematics and Biao explains it in more detail in a post over at his blog, mathfinance.cn. Now, it isn’t particularly computationally challenging since it only takes 6.5 seconds to run on my 2Ghz dual core laptop. It is, however, significantly more challenging than the toy problem I discussed in my original fsolve post. So, let’s see how much we can optimise it using NAG and any other techniques that come to mind.

Before going on, I’d just like to point out that we intentionally set the seed of the random number generator with

RandStream.setDefaultStream (RandStream('mt19937ar','seed',1));

to ensure we are always operating with the same data set. This is purely for benchmarking purposes.

Optimisation trick 1 – replacing fsolve with NAG’s c05nb

The first thing I had to do to Biao’s code in order to apply the NAG toolbox was to split it into two files. We have the main program, fsolve_markowitz_NAG.m

function MarkowitzWeight = fsolve_markowitz_NAG()
global r targetR rCOV;
%seed random generator to ensure same results every time for benchmark purposes
RandStream.setDefaultStream (RandStream('mt19937ar','seed',1));

% simulate 500 stocks return
SimR = randn(1000,500);
r = mean(SimR)'; % use mean return as expected return
targetR = 0.02;
rCOV = cov(SimR); % covariance matrix
NumAsset = length(r);

startX = [1/NumAsset*ones(1,NumAsset), 1, 1];
tic
x = c05nb('fsolve_obj_NAG',startX);
toc
MarkowitzWeight = x(1:NumAsset);
end

and the objective function, fsolve_obj_NAG.m

function [F,iflag] = fsolve_obj_NAG(n,x,iflag)
global r targetR rCOV;
NumAsset = length(r);
F = zeros(1,NumAsset+2);
weight = x(1:NumAsset); % for asset weights
lambda = x(NumAsset+1);
mu = x(NumAsset+2);
for i = 1:NumAsset
    F(i) = weight(i)*sum(rCOV(i,:))-lambda*r(i)-mu;
end
F(NumAsset+1) = sum(weight.*r)-targetR;
F(NumAsset+2) = sum(weight)-1;
end

I had to put the objective function into its own .m file because because the NAG toolbox currently doesn’t accept function handles (This may change in a future release I am reliably told). This is a pain but not a show stopper.

Note also that the function header for the NAG version of the objective function is different from the pure MATLAB version as
per my original article.

Transposition problems
The next thing to notice is that I have done the following in the main program, fsolve_markowitz_NAG.m

r = mean(SimR)';

compared to the original

r = mean(SimR);

This is because the NAG routine, c05nb, does something rather bizarre. I discovered that if you pass a 1xN matrix to NAGs c05nb then it gets transposed in the objective function to Nx1. Since Biao relies on this being 1xN when he does

F(NumAsset+1) = sum(weight.*r)-targetR;

we get an error unless you take the transpose of r in the main program. I consider this to be a bug in the NAG Toolbox and it is currently with NAG technical support.

Global variables
Sadly, there is no easy way to pass extra variables to the objective function when using NAG’s c05nb. In MATLAB Biao could do this

x = fsolve(@fsolve_markowitz,startX,options, r, targetR, rCOV);

No such luck with c05nb. The only way to get around it is to use global variables (Well, a small lie, there is another way, you can use reverse communication in a different NAG function, c05nd, but that’s a bit more complicated and I’ll leave it for another time I think)

So, It’s pretty clear that the NAG function isn’t as easy to use as MATLAB’s fsolve function but is it any faster? The following is typical.

>> tic;fsolve_markowitz_NAG;toc
Elapsed time is 2.324291 seconds.

So, we have sped up our overall computation time by a factor of 3. Not bad, but nowhere near the 18 times speed-up that I demonstrated in my original post.

Optimisation trick 2 – Don’t sum so much

In his recent blog post, Biao challenged me to make his code almost 20 times faster and after applying my NAG Toolbox trick I had ‘only’ managed to make it 3 times faster. So, in order to see what else I might be able to do for him I ran his code through the MATLAB profiler and discovered that it spent the vast majority of its time in the following section of the objective function

for i = 1:NumAsset
    F(i) = weight(i)*sum(rCOV(i,:))-lambda*r(i)-mu;
end

It seemed to me that the original code was calculating the sum of rCOV too many times. On a small number of assets (50 say) you’ll probably not notice it but with 500 assets, as we have here, it’s very wasteful.

So, I changed the above to

for i = 1:NumAsset
    F(i) = weight(i)*sumrCOV(i)-lambda*r(i)-mu;
end

So,instead of rCOV, the objective function needs to be passed the following in the main program

sumrCOV=sum(rCOV)

We do this summation once and once only and make a big time saving.  This is done in the files fsolve_markowitz_NAGnew.m and fsolve_obj_NAGnew.m

With this optimisation, along with using NAG’s c05nb we now get the calculation time down to

>> tic;fsolve_markowitz_NAGnew;toc
Elapsed time is 0.203243 seconds.

Compared to the 6.5 seconds of the original code, this represents a speed-up of over 32 times. More than enough to satisfy Biao’s challenge.

Checking the answers

Let’s make sure that the results agree with each other. I don’t expect them to be exactly equal due to round off errors etc but they should be extremely close. I check that the difference between each result is less than 1e-7:

original_results = fsolve_markowitz_MATLAB;
nagver1_results = fsolve_markowitz_NAGnew;
>> all(abs(original_results - nagver1_results)<1e8)
ans =
     1

That’ll do for me!

Summary

The current iteration of the NAG Toolbox for MATLAB (Mark 22) may not be as easy to use as native MATLAB toolbox functions (for this example at least) but it can often provide useful speed-ups and in this example, NAG gave us a factor of 3 improvement compared to using fsolve.

For this particular example, however, the more impressive speed-up came from pushing the code through the MATLAB profiler, identifying the hotspot and then doing something about it.  There are almost certainly more optimisations to make but with the code running at less than a quarter of a second on my modest little laptop I think we will leave it there for now.

  1. October 6th, 2010 at 11:34
    Reply | Quote | #1

    Nice post Mike! I definitely agree with the importance of using the right algorithm before choosing a particular toolbox/language/editor.
    The original problem seems like one that can be solved using backslash directly, see below:

    function MarkowitzWeight = backslash_markowitz_MATLAB()

    RandStream.setDefaultStream (RandStream(‘mt19937ar’,’seed’,1));
    % simulate 500 stocks return
    SimR = randn(1000,500);
    r = mean(SimR); % use mean return as expected return
    targetR = 0.02;
    rCOV = cov(SimR); % covariance matrix
    NumAsset = length(r);

    A = [[rCOV, -r’, -ones(NumAsset,1)];…
    [r 0 0];…
    [ones(1,NumAsset) 0 0]];
    b = [zeros(NumAsset,1); targetR; 1];
    x = A\b;

    MarkowitzWeight = x(1:NumAsset)’;
    % Display results
    returnVariance = MarkowitzWeight*rCOV*MarkowitzWeight’;
    returnMean = MarkowitzWeight*r’;
    weightSum = sum(MarkowitzWeight);
    fprintf(‘Return variance: %f\nReturn mean: %f\nTotal weights: %f\n’,…
    returnVariance,returnMean,weightSum)
    end

    This seems to give much lower variance than the original fsolve version, 0.001329 vs 0.701577 for my runs, and runtime of 0.066192s vs 3.508711s (53x speedup).
    I’m probably missing something though, so it would be good to check that we’re solving the same problems :)

    Also, with no short-selling allowed (all weights between [0,1]) the problem can be solved using quadprog:

    options=optimset(‘Display’,’off’,’LargeScale’,’off’);
    x = quadprog(2*rCOV,zeros(NumAsset,1),-r,-targetR,…
    ones(1,NumAsset),1,…
    zeros(NumAsset,1),ones(NumAsset,1),…
    [],options);

    This gives a return variance of 0.001577, all weights positive, but takes 13s.

    Cheers,
    Matt

  2. October 6th, 2010 at 12:13
    Reply | Quote | #2

    Hi Matt

    Thanks for the comment and code. I’m very much a finance newbie so I’ll leave it for Biao (or anyone else who knows about it) to comment on whether or not your code solves the same problem as ours.

    The original point of this article was to simply give a beefier example for the fsolve vs NAG/c05nb functions and to show that you don’t always get the 18x speedup I demonstrated in my original post. So, we NEEDED to use fsolve ;)

    It has started to become, however, an education (for me) in Markowitz theory and how best to work with it. So, I’m grateful for your contribution.

    Cheers,
    Mike

  3. October 6th, 2010 at 13:35
    Reply | Quote | #3

    Hi Mike,
    Thanks again for the great post, it is a really good example of the things you need to consider when switching between different implementations, eg matrix shapes, passing in parameters, vectorisation etc. And the 3x speedup is still pretty good before optimising the algorithm itself!
    Just for interest, I’m in the middle of reading ‘Convex Optimization’ by Boyd and Vandenberghe, it’s available online at http://www.stanford.edu/~boyd/cvxbook/ and discusses Markowitz portfolio optimization on is section 4.4.1, p169. Randomly, this was the page I was up to when I saw your post.

    Cheers,
    Matt

  4. October 7th, 2010 at 19:49
    Reply | Quote | #4

    Hello again Matt

    As you already know from Biao’s blog, Biao and I made a mistake so your code gives the correct result. For all other readers, the line in the original code that said

    F(i) = weight(i)*sum(rCOV(i,:))-lambda*r(i)-mu;
    

    should have been

    F(i) = sum(weight.*rCOV(i,:))-lambda*r(i)-mu; 
    

    This doesn’t affect our fsolve/NAG comparison but it does make a mess of my ‘2nd optimisation trick’ Ho hum!

    This does demonstrate one interesting point though and that is ‘Blog about your code so that your readers can help you debug it’ :)

    I’m not going to modify the original code in order to ensure that the original post and the subsequent discussion make sense to the reader. I will, however, post a corrected version of the .zip file at some point over the next few days.

  5. Mark
    August 23rd, 2013 at 06:28
    Reply | Quote | #5

    Hi Mike. I’m having the same problem with my code. It runs alright, and produces expected results but it takes an awfully long time, around 2 minutes. I think, like you’ve explained here, it has to do with additions and subtractions in the function file cos it’s a 400 x 400 matrix it has to evaluate. I’d really want it to work a lot faster cos I’m building the code for more complex situations (600 x 600 matrix, with a lot more difficult functions). How can I reach you by email and have you take a look at it? I tried your firstname.lastname manchester address but it returned the mail. If you’d be kind enough to reach me, I’d appreciate it. Thanks!