I found these links a while ago and forgot to post them here. Some interesting insights.

A lot of people don’t seem to know this….and they should. When working with floating point arithmetic, it is not necessarily true that a+(b+c) = (a+b)+c. Here is a demo using MATLAB

>> x=0.1+(0.2+0.3); >> y=(0.1+0.2)+0.3; >> % are they equal? >> x==y ans = 0 >> % lets look >> sprintf('%.17f',x) ans = 0.59999999999999998 >> sprintf('%.17f',y) ans = 0.60000000000000009

These results have nothing to do with the fact that I am using MATLAB. Here’s the same thing in Python

>>> x=(0.1+0.2)+0.3 >>> y=0.1+(0.2+0.3) >>> x==y False >>> print('%.17f' %x) 0.60000000000000009 >>> print('%.17f' %y) 0.59999999999999998

If this upsets you, or if you don’t understand why, I suggest you read the following

Does anyone else out there have suggestions for similar resources on this topic?

Consider the following code

function testpow() %function to compare integer powers with repeated multiplication rng default a=rand(1,10000000)*10; disp('speed of ^4 using pow') tic;test4p = a.^4;toc disp('speed of ^4 using multiply') tic;test4m = a.*a.*a.*a;toc disp('maximum difference in results'); max_diff = max(abs(test4p-test4m)) end

On running it on my late 2013 Macbook Air with a Haswell Intel i5 using MATLAB 2013b, I get the following results

speed of ^4 using pow Elapsed time is 0.527485 seconds. speed of ^4 using multiply Elapsed time is 0.025474 seconds. maximum difference in results max_diff = 1.8190e-12

In this case (derived from a real-world case), using repeated multiplication is around twenty times faster than using integer powers in MATLAB. This leads to some questions:-

- Why the huge speed difference?
- Would a similar speed difference be seen in other systems–R, Python, Julia etc?
- Would we see the same speed difference on other operating systems/CPUs?
- Are there any numerical reasons why using repeated multiplication instead of integer powers is a bad idea?

I occasionally get emails from researchers saying something like this

*‘My MATLAB code takes a week to run and the cleaner/cat/my husband keeps switching off my machine before it’s completed — could you help me make the code go faster please so that I can get my results in between these events’*

While I am more than happy to try to optimise the code in question, what these users really need is some sort of checkpointing scheme. Checkpointing is also important for users of high performance computing systems that limit the length of each individual job.

**The solution – Checkpointing (or ‘Assume that your job will frequently be killed’)**

The basic idea behind checkpointing is to periodically save your program’s state so that, if it is interrupted, it can start again where it left off rather than from the beginning. In order to demonstrate some of the principals involved, I’m going to need some code that’s sufficiently simple that it doesn’t cloud what I want to discuss. Let’s add up some numbers using a for-loop.

%addup.m %This is not the recommended way to sum integers in MATLAB -- we only use it here to keep things simple %This version does NOT use checkpointing mysum=0; for count=1:100 mysum = mysum + count; pause(1); %Let's pretend that this is a complicated calculation fprintf('Completed iteration %d \n',count); end fprintf('The sum is %f \n',mysum);

*Using a for-loop to perform an addition like this is something that I’d never usually suggest in MATLAB* but I’m using it here because it is so simple that it won’t get in the way of understanding the checkpointing code.

If you run this program in MATLAB, it will take about 100 seconds thanks to that pause statement which is acting as a proxy for some real work. Try interrupting it by pressing CTRL-C and then restart it. As you might expect, it will always start from the beginning:

>> addup Completed iteration 1 Completed iteration 2 Completed iteration 3 Operation terminated by user during addup (line 6) >> addup Completed iteration 1 Completed iteration 2 Completed iteration 3 Operation terminated by user during addup (line 6)

This is no big deal when your calculation only takes 100 seconds but is going to be a major problem when the calculation represented by that pause statement becomes something like an hour rather than a second.

Let’s now look at a version of the above that makes use of checkpointing.

%addup_checkpoint.m if exist( 'checkpoint.mat','file' ) % If a checkpoint file exists, load it fprintf('Checkpoint file found - Loading\n'); load('checkpoint.mat') else %otherwise, start from the beginning fprintf('No checkpoint file found - starting from beginning\n'); mysum=0; countmin=1; end for count = countmin:100 mysum = mysum + count; pause(1); %Let's pretend that this is a complicated calculation %save checkpoint countmin = count+1; %If we load this checkpoint, we want to start on the next iteration fprintf('Saving checkpoint\n'); save('checkpoint.mat'); fprintf('Completed iteration %d \n',count); end fprintf('The sum is %f \n',mysum);

Before you run the above code, the checkpoint file **checkpoint.mat** does not exist and so the calculation starts from the beginning. After every iteration, a checkpoint file is created which contains every variable in the MATLAB workspace. If the program is restarted, it will find the checkpoint file and continue where it left off. Our code now deals with interruptions a lot more gracefully.

>> addup_checkpoint No checkpoint file found - starting from beginning Saving checkpoint Completed iteration 1 Saving checkpoint Completed iteration 2 Saving checkpoint Completed iteration 3 Operation terminated by user during addup_checkpoint (line 16) >> addup_checkpoint Checkpoint file found - Loading Saving checkpoint Completed iteration 4 Saving checkpoint Completed iteration 5 Saving checkpoint Completed iteration 6 Operation terminated by user during addup_checkpoint (line 16)

Note that we’ve had to change the program logic slightly. Our original loop counter was

for count = 1:100

In the check-pointed example, however, we’ve had to introduce the variable **countmin**

for count = countmin:100

This allows us to start the loop from whatever value of countmin was in our last checkpoint file. Such minor modifications are often necessary when converting code to use checkpointing and you should carefully check that the introduction of checkpointing does not introduce bugs in your code.

**Don’t checkpoint too often**

The creation of even a small checkpoint file is a time consuming process. Consider our original addup code but without the pause command.

%addup_nopause.m %This version does NOT use checkpointing mysum=0; for count=1:100 mysum = mysum + count; fprintf('Completed iteration %d \n',count); end fprintf('The sum is %f \n',mysum);

On my machine, this code takes 0.0046 seconds to execute. Compare this to the checkpointed version, again with the pause statement removed.

%addup_checkpoint_nopause.m if exist( 'checkpoint.mat','file' ) % If a checkpoint file exists, load it fprintf('Checkpoint file found - Loading\n'); load('checkpoint.mat') else %otherwise, start from the beginning fprintf('No checkpoint file found - starting from beginning\n'); mysum=0; countmin=1; end for count = countmin:100 mysum = mysum + count; %save checkpoint countmin = count+1; %If we load this checkpoint, we want to start on the next iteration fprintf('Saving checkpoint\n'); save('checkpoint.mat'); fprintf('Completed iteration %d \n',count); end fprintf('The sum is %f \n',mysum);

This checkpointed version takes 0.85 seconds to execute on the same machine — Over 180 times slower than the original! The problem is that the time it takes to checkpoint is long compared to the calculation time.

If we make a modification so that we only checkpoint every 25 iterations, code execution time comes down to 0.05 seconds:

%Checkpoint every 25 iterations if exist( 'checkpoint.mat','file' ) % If a checkpoint file exists, load it fprintf('Checkpoint file found - Loading\n'); load('checkpoint.mat') else %otherwise, start from the beginning fprintf('No checkpoint file found - starting from beginning\n'); mysum=0; countmin=1; end for count = countmin:100 mysum = mysum + count; countmin = count+1; %If we load this checkpoint, we want to start on the next iteration if mod(count,25)==0 %save checkpoint fprintf('Saving checkpoint\n'); save('checkpoint.mat'); end fprintf('Completed iteration %d \n',count); end fprintf('The sum is %f \n',mysum);

Of course, the issue now is that we might lose more work if our program is interrupted between checkpoints. Additionally, in this particular case, the mod command used to decide whether or not to checkpoint is more expensive than simply performing the calculation but hopefully that isn’t going to be the case when working with real world calculations.

In practice, we have to work out a balance such that we checkpoint often enough so that we don’t stand to lose too much work but not so often that our program runs too slowly.

**Checkpointing code that involves random numbers**

Extra care needs to be taken when running code that involves random numbers. Consider a modification of our checkpointed adding program that creates a sum of random numbers.

%addup_checkpoint_rand.m %Adding random numbers the slow way, in order to demo checkpointing %This version has a bug if exist( 'checkpoint.mat','file' ) % If a checkpoint file exists, load it fprintf('Checkpoint file found - Loading\n'); load('checkpoint.mat') else %otherwise, start from the beginning fprintf('No checkpoint file found - starting from beginning\n'); mysum=0; countmin=1; rng(0); %Seed the random number generator for reproducible results end for count = countmin:100 mysum = mysum + rand(); countmin = count+1; %If we load this checkpoint, we want to start on the next iteration pause(1); %pretend this is a complicated calculation %save checkpoint fprintf('Saving checkpoint\n'); save('checkpoint.mat'); fprintf('Completed iteration %d \n',count); end fprintf('The sum is %f \n',mysum);

In the above, we set the seed of the random number generator to 0 at the beginning of the calculation. This ensures that we always get the same set of random numbers and allows us to get reproducible results. As such, the sum should always come out to be 52.799447 to the number of decimal places used in the program.

The above code has a subtle bug that you won’t find if your testing is confined to interrupting using CTRL-C and then restarting in an interactive session of MATLAB. Proceed that way, and you’ll get exactly the sum you’ll expect : 52.799447. If, on the other hand, you test your code by doing the following

- Run for a few iterations
- Interrupt with CTRL-C
- Restart MATLAB
- Run the code again, ensuring that it starts from the checkpoint

You’ll get a different result. This is not what we want!

The root cause of this problem is that we are not saving the state of the random number generator in our checkpoint file. Thus, when we restart MATLAB, all information concerning this state is lost. If we don’t restart MATLAB between interruptions, the state of the random number generator is safely tucked away behind the scenes.

Assume, for example, that you stop the calculation running after the third iteration. The random numbers you’d have consumed would be (to 4 d.p.)

0.8147

0.9058

0.1270

Your checkpoint file will contain the variables **mysum**, **count** and **countmin** but will contain nothing about the state of the random number generator. In English, this state is something like *‘The next random number will be the 4th one in the sequence defined by a starting seed of 0.’*

When we restart MATLAB, the default seed is 0 so we’ll be using the right sequence (since we explicitly set it to be 0 in our code) but we’ll be starting right from the beginning again. That is, the 4th,5th and 6th iterations of the summation will contain the first 3 numbers in the stream, thus double counting them, and so our checkpointing procedure will alter the results of the calculation.

In order to fix this, we need to additionally save the state of the random number generator when we save a checkpoint and also make correct use of this on restarting. Here’s the code

%addup_checkpoint_rand_correct.m %Adding random numbers the slow way, in order to demo checkpointing if exist( 'checkpoint.mat','file' ) % If a checkpoint file exists, load it fprintf('Checkpoint file found - Loading\n'); load('checkpoint.mat') %use the saved RNG state stream = RandStream.getGlobalStream; stream.State = savedState; else % otherwise, start from the beginning fprintf('No checkpoint file found - starting from beginning\n'); mysum=0; countmin=1; rng(0); %Seed the random number generator for reproducible results end for count = countmin:100 mysum = mysum + rand(); countmin = count+1; %If we load this checkpoint, we want to start on the next iteration pause(1); %pretend this is a complicated calculation %save the state of the random number genertor stream = RandStream.getGlobalStream; savedState = stream.State; %save checkpoint fprintf('Saving checkpoint\n'); save('checkpoint.mat'); fprintf('Completed iteration %d \n',count); end fprintf('The sum is %f \n',mysum);

**Ensuring that the checkpoint save completes**

Events that terminate our code can occur extremely quickly — a powercut for example. There is a risk that the machine was switched off while our check-point file was being written. How can we ensure that the file is complete?

The solution, which I found on the MATLAB checkpointing page of the Liverpool University Condor Pool site is to first write a temporary file and then rename it. That is, instead of

save('checkpoint.mat')/pre>

we do

if strcmp(computer,'PCWIN64') || strcmp(computer,'PCWIN') %We are running on a windows machine system( 'move /y checkpoint_tmp.mat checkpoint.mat' ); else %We are running on Linux or Mac system( 'mv checkpoint_tmp.mat checkpoint.mat' ); end

As the author of that page explains *‘The operating system should guarantee that the move command is “atomic” (in the sense that it is indivisible i.e. it succeeds completely or not at all) so that there is no danger of receiving a corrupt “half-written” checkpoint file from the job.’*

**Only checkpoint what is necessary**

So far, we’ve been saving the entire MATLAB workspace in our checkpoint files and this hasn’t been a problem since our workspace hasn’t contained much. In general, however, the workspace might contain all manner of intermediate variables that we simply don’t need in order to restart where we left off. Saving the stuff that we might not need can be expensive.

For the sake of illustration, let’s skip 100 million random numbers before adding one to our sum. For reasons only known to ourselves, we store these numbers in an intermediate variable which we never do anything with. This array isn’t particularly large at 763 Megabytes but its existence slows down our checkpointing somewhat. The correct result of this variation of the calculation is 41.251376 if we set the starting seed to 0; something we can use to test our new checkpoint strategy.

Here’s the code

% A demo of how slow checkpointing can be if you include large intermediate variables if exist( 'checkpoint.mat','file' ) % If a checkpoint file exists, load it fprintf('Checkpoint file found - Loading\n'); load('checkpoint.mat') %use the saved RNG state stream = RandStream.getGlobalStream; stream.State = savedState; else %otherwise, start from the beginning fprintf('No checkpoint file found - starting from beginning\n'); mysum=0; countmin=1; rng(0); %Seed the random number generator for reproducible results end for count = countmin:100 %Create and store 100 million random numbers for no particular reason randoms = rand(10000); mysum = mysum + rand(); countmin = count+1; %If we load this checkpoint, we want to start on the next iteration fprintf('Completed iteration %d \n',count); if mod(count,25)==0 %save the state of the random number generator stream = RandStream.getGlobalStream; savedState = stream.State; %save and time checkpoint tic save('checkpoint_tmp.mat'); if strcmp(computer,'PCWIN64') || strcmp(computer,'PCWIN') %We are running on a windows machine system( 'move /y checkpoint_tmp.mat checkpoint.mat' ); else %We are running on Linux or Mac system( 'mv checkpoint_tmp.mat checkpoint.mat' ); end timing = toc; fprintf('Checkpoint save took %f seconds\n',timing); end end fprintf('The sum is %f \n',mysum);

On my Windows 7 Desktop, each checkpoint save takes around 17 seconds:

Completed iteration 25 1 file(s) moved. Checkpoint save took 17.269897 seconds

It is not necessary to include that huge random matrix in a checkpoint file. If we are explicit in what we require, we can reduce the time taken to checkpoint significantly. Here, we change

save('checkpoint_tmp.mat');

to

save('checkpoint_tmp.mat','mysum','countmin','savedState');

This has a dramatic effect on check-pointing time:

Completed iteration 25 1 file(s) moved. Checkpoint save took 0.033576 seconds

Here’s the final piece of code that uses everything discussed in this article

%Final checkpointing demo if exist( 'checkpoint.mat','file' ) % If a checkpoint file exists, load it fprintf('Checkpoint file found - Loading\n'); load('checkpoint.mat') %use the saved RNG state stream = RandStream.getGlobalStream; stream.State = savedState; else %otherwise, start from the beginning fprintf('No checkpoint file found - starting from beginning\n'); mysum=0; countmin=1; rng(0); %Seed the random number generator for reproducible results end for count = countmin:100 %Create and store 100 million random numbers for no particular reason randoms = rand(10000); mysum = mysum + rand(); countmin = count+1; %If we load this checkpoint, we want to start on the next iteration fprintf('Completed iteration %d \n',count); if mod(count,25)==0 %checkpoint every 25th iteration %save the state of the random number generator stream = RandStream.getGlobalStream; savedState = stream.State; %save and time checkpoint tic %only save the variables that are strictly necessary save('checkpoint_tmp.mat','mysum','countmin','savedState'); %Ensure that the save completed if strcmp(computer,'PCWIN64') || strcmp(computer,'PCWIN') %We are running on a windows machine system( 'move /y checkpoint_tmp.mat checkpoint.mat' ); else %We are running on Linux or Mac system( 'mv checkpoint_tmp.mat checkpoint.mat' ); end timing = toc; fprintf('Checkpoint save took %f seconds\n',timing); end end fprintf('The sum is %f \n',mysum);

**Parallel checkpointing**

If your code includes parallel regions using constructs such as parfor or spmd, you might have to do more work to checkpoint correctly. I haven’t considered any of the potential issues that may arise in such code in this article

**Checkpointing checklist**

Here’s a reminder of everything you need to consider

- Test to ensure that the introduction of checkpointing doesn’t alter results
- Don’t checkpoint too often
- Take care when checkpointing code that involves random numbers – you need to explicitly save the state of the random number generator.
- Take measures to ensure that the checkpoint save is completed
- Only checkpoint what is necessary
- Code that includes parallel regions might require extra care

Octave is a free, open source language for numerical computing that is mostly compatible with MATLAB. For years, the official Octave project has been strictly command-line only which puts many users off — particularly those who were used to the graphical user interface (GUI) of MATLAB. Unofficial GUIs have come and gone over the years but none of them were fully satisfactory for one reason or another.

As of the 3.8 release of Octave on 31st December 2013, this all changed when Octave started shipping with its own, official GUI. It is currently considered as ‘experimental’ by the Octave team and is obviously rough around the edges here and there but it is already quite usable.

The system includes

- An editor with syntax highlighting and code folding
- A debugger
- A file browser and workspace viewer
- The ability to hide and move different elements of the GUI around (e.g. you could swap the positions of the workspace and File Browser, or tear-off the editor into its own Window)
- A documentation browser
- The Octave command window

I’ve spent an hour or so playing with it today and like it a lot!

Thanks to Júlio Hoffimann Mendes who let me know about this new release.

**What is Software Carpentry?**

Software Carpentry boot-camps aim to ensure that researchers have a working knowledge of several useful technologies from the world of software development. Concepts such as version control, unit-testing, task-automation and modular programming are bread and butter to full time software-developers but are often completely unknown to many researchers; researchers who are nevertheless expected to develop computer code as part of their research output.

An in-depth education in all of these technologies can take a lot of time; time that many researchers simply can’t spare. Fortunately, however, it is possible to learn just enough to completely transform the quality of your workflow in a relatively short amount of time. Software Carpentry boot camps aim to lay the foundations in around two days.

**Software Carpentry…but in MATLAB**

Traditionally, these bootcamps are taught using open-source languages such as Python or R but many proprietary languages are also used in academia such as MATLAB, IDL, Mathematica and Maple to name but a few. In the most recent version of MATLAB, Mathworks introduced a unit testing framework and so now seemed like a good time to try out Software Carpentry in MATLAB.

On Tuesday 14th January I hosted the first ever MATLAB-based Software Carpentry bootcamp in conjunction with The Software Sustainability Institute (of which I am a 2013 fellow) and Mathworks. Held at The University of Manchester, this two-day event gave free software development instruction to researchers from a wide variety of disciplines and, if the feedback forms are to be believed, was a resounding success.

Shoaib Sufi and Aleksandra Pawlik of the Software Sustainability Institute taught material on bash shell scripting and git version control respectively with Ken Deeley of Mathworks providing MATLAB instruction. I, along with Mathworks’ Jos Martin and Juan Martinez, acted as the less-than-glamourous but ever-helpful classroom assistants.

**Fun and games with BYOD – Licensing**

For the decade or so that I’ve been teaching programming, I’ve done so in fully equipped teaching labs containing row upon row of identical computers where each have all the required software pre-installed. For this event, we decided to try a Bring Your Own Device (BYOD) approach..i.e. students bring their own, personal laptops and we provided a list of required software that needed to be installed for the event. Since Manchester’s MATLAB site license is network-based only and does not allow students to install on personally owned equipment, Mathworks kindly supplied standalone trial licenses for all course attendees.

BYOD has a number of benefits for the student and, in my mind at least, the most important of these benefits is that students are left with a fully working development environment after the course. This means that they can start applying their new skills immediately after leaving the course which hopefully leads to them being used ‘in anger’ on their research projects. Since the students were only supplied with trial licenses, this was not to be the case with MATLAB. They will have to switch to using their on-campus machines or purchase their own copy of ‘standalone’ MATLAB in order to continue working.

Availability of software is not something that usually concerns an organiser of a Software Carpentry boot camp since the likes of Python and R are available everywhere for free. When using a proprietary language such as MATLAB, however, it’s very much of an issue and I’d advise any future organiser of a MATLAB boot camp to carefully consider this before proceeding. Of course this isn’t just true of MATLAB, it’s true of any proprietary language one may choose. It will also be less of a problem if your institution has an all you can eat, ‘Total Academic Headcount’ unlimited site license.

**Fun and games with BYOD – switches and glitches**

All three major desktop operating systems were represented in the laptops of the 20 or so students — something that occasionally made for fun times. Here is a list of some of the minor issues that arose over the two days

- Teaching bash scripting to Windows users always makes me wince a little. Environments such as Cygwin and git bash are great and a little bit of bash never hurt anyone but I can’t help but wonder if we should be teaching a native scripting language instead such as PowerShell. After all, Linux users would be surprised if they came to a scripting seminar and we made them use pash! Of course, if the student ever gets access to a HPC system, it is highly probable that it will be running some variant of Linux and so perhaps everyone should learn at least a little bash.
- One Mac user had some fancy graphical overlay program which jazzed up his desktop. Looked great but it turned out that it sometimes caused MATLAB to crash.
- Some of the MATLAB commands that interacted with the file system worked slightly differently across the three operating systems…something we hadn’t appreciated ahead of time. This caused delays while we figured out a platform-independent way to proceed.
- Some of the Linux machines exposed a bug in TLS (thread local storage) in Intel’s MKL which caused errors in MATLAB.
- MATLAB’s keyboard shortcuts are different across operating systems. It is possible to change the behaviour but MATLABers who’ve been around a while didn’t want to. This sometimes led to confusion if the instructor mentioned an explicit keyboard shortcut that happens to be different on the students machine.

None of these issues were particularly major but the need to consider and resolve them did take up the time of instructors and demonstrators. In a lab-setting, this extra work would not be necessary. Obviously we’ll fix some of the above issues in the next iteration of the course but I believe that BYOD sessions will always require a higher number of demonstrators/glamorous assistants than more traditional lab-sessions simply because of the inevitable variation of software and hardware.

**No toolboxes….almost!**

I can see my Mathworks friends rolling their eyes already but I have to get this off my chest. Regular readers of this blog will know that I have got some issues with Mathworks’ toolbox system in MATLAB. I *really* wanted a software carpentry course that only included pure MATLAB–no toolboxes at all. In the event, we decided on a set of course materials that required the use of the statistics toolbox because it made certain things so much easier.

For example, MATLAB uses NaNs to represent missing data — a design choice that quickly leads to the desire for functions such as nanmean and nanmedian. Unfortunately, despite their simplicity, these functions are in the statistics toolbox and so using them leads to less accessible software (since your users need to have that extra toolbox). Of course you could code up your own versions, or use free implementations easily enough but then you are not using idiomatic MATLAB. All very frustrating. This wasn’t the only reason why we added the statistics toolbox to the list of requirements of course but hopefully makes my point.

OK, moan over, I’m done….for now.

**Course Material**

The main course page is at http://apawlik.github.io/2014-01-14-manchester/ and the course material for version control using git and shell scripting using bash are already available. We hope to have the MATLAB material (which was developed by Mathworks) available in the near future once we’ve sorted out some legal issues.

The tutorials were very interactive with the instructors demonstrating commands while students followed along on their own machines–barely a slide to be seen, just how I like it. There were regular exercises with a high ratio of demonstrators to students. This was very much a hands-on course which, in my opinion, is the best way to learn programming.

One of the advantages of instructor-led courses is that students can go off-piste if they wish and learn all sorts of extra material. One student, for example, asked senior Mathworks developer, Jos Martin, for a quick code-review of her research simulation at the end of the first day. Listening to Jos’ critique of the code was instructive for several of us. Jos is also the lead developer of MATLAB’s parallel computing toolbox — something I took advantage of by asking him questions relevant to a code-optimization project I’m currently working on.

It’s very difficult to replicate this kind of interaction in online courses!

**Useful teaching technology**

The instructors used a couple of pieces of software that I believe significantly enhanced the teaching experience and I plan to use them in future courses of my own.

- ZoomIt - This is a free presentation tool that allows you to zoom into any area of the screen and subsequently annotate it. This might not sound like much but it can really improve a presentation. Here’s a video of some of its functionality.
- Etherpad – Etherpad is a web based collaborative document editor which we used to post code snippets, links and anything else that anyone felt was useful on the day. We also used it as a chat room which was sometimes useful.
- PostIt notes – Very low tech but very effective. Every student had red and green sticky notes. If a student had no sticky note on the back of their laptop, it meant they were working and would prefer to be left alone. Green means that they had completed the exercise and red (or in our case, orange) means ‘I want help’.

**Feedback and the future**

Both Mathworks and the Software Sustainability Institute asked students to fill out their own course feedback forms. Since the students had had such a great experience, they were all happy to do so but whittling those two feedback forms down to one that satisfies both institutions is something we should definitely aim for! The feedback was extremely positive with almost everyone agreeing that they had got a lot out of the course.

Of course, it wasn’t perfect and there are several things we could do to make it better for next time:-

- Some of the ‘Fun and Games with BYOD’ described earlier could have been avoided by modifying the course material a little.
- The individual sections on bash, git and MATLAB were great but I felt that they needed to be tied together better. They felt too much like self-contained mini-courses and it wouldn’t take much extra effort on our part to link them together a little more.
- The course was free and places were strictly limited. A couple of people cancelled less than 24 hours before the first day which didn’t give us enough time to offer the places out to others. Worse still, a few more people simply didn’t bother to turn up and didn’t offer any explanation. I find that this often happens with free courses and I’ve yet to figure out a way to improve the situation.
- We need to get all of the course material on line!

All told, the course was a great success and I hope to be running more of them in the future. A huge thanks to the instructors, Shoaib Sufi, Alexandra Pawlik and Ken Deeley who did the lion’s share of the hard work on the day but also to Mathworks’ Jos Martin and Juan Martinez who did a great job of helping out students with the exercises, answering questions and generally ensuring that everything ran as smoothly as possible. I’d also like to thank Mathworks’ Stasi Revel and Tanya Morton who helped out with sorting out trial licenses and, finally, thanks to Software Carpentry’s Greg Wilson who gave support and advice in the weeks leading to the event.

In a recent Stack Overflow query, someone asked if you could switch off the balancing step when calculating eigenvalues in Python. In the document A case where balancing is harmful, David S. Watkins describes the balancing step as* ‘the input matrix A is replaced by a rescaled matrix A* = D ^{-1}AD, where D is a diagonal matrix chosen so that, for each i, the ith row and the ith column of A* have roughly the same norm.’ *

Such balancing is usually very useful and so is performed by default by software such as MATLAB or Numpy. There are times, however, when one would like to switch it off.

In MATLAB, this is easy and the following is taken from the online MATLAB documentation

A = [ 3.0 -2.0 -0.9 2*eps; -2.0 4.0 1.0 -eps; -eps/4 eps/2 -1.0 0; -0.5 -0.5 0.1 1.0]; [VN,DN] = eig(A,'nobalance') VN = 0.6153 -0.4176 -0.0000 -0.1528 -0.7881 -0.3261 0 0.1345 -0.0000 -0.0000 -0.0000 -0.9781 0.0189 0.8481 -1.0000 0.0443 DN = 5.5616 0 0 0 0 1.4384 0 0 0 0 1.0000 0 0 0 0 -1.0000

At the time of writing, it is not possible to directly do this in Numpy (as far as I know at least). Numpy’s eig command currently uses the LAPACK routine DGEEV to do the heavy lifting for double precision matrices. We can see this by looking at the source code of numpy.linalg.eig where the relevant subsection is

lapack_routine = lapack_lite.dgeev wr = zeros((n,), t) wi = zeros((n,), t) vr = zeros((n, n), t) lwork = 1 work = zeros((lwork,), t) results = lapack_routine(_N, _V, n, a, n, wr, wi, dummy, 1, vr, n, work, -1, 0) lwork = int(work[0]) work = zeros((lwork,), t) results = lapack_routine(_N, _V, n, a, n, wr, wi, dummy, 1, vr, n, work, lwork, 0)

My plan was to figure out how to tell DGEEV not to perform the balancing step and I’d be done. Sadly, however, it turns out that this is not possible. Taking a look at the reference implementation of DGEEV, we can see that the balancing step is **always performed **and is not user controllable–here’s the relevant bit of Fortran

* Balance the matrix * (Workspace: need N) * IBAL = 1 CALL DGEBAL( 'B', N, A, LDA, ILO, IHI, WORK( IBAL ), IERR )

So, using DGEEV is a dead-end unless we are willing to modifiy and recompile the lapack source — something that’s rarely a good idea in my experience. There is another LAPACK routine that is of use, however, in the form of DGEEVX that allows us to control balancing. Unfortunately, this routine is not part of the **numpy.linalg.lapack_lite** interface provided by Numpy and I’ve yet to figure out how to add extra routines to it.

I’ve also discovered that this functionality is an open feature request in Numpy.

**Enter the NAG Library**

My University has a site license for the commercial Numerical Algorithms Group (NAG) library. Among other things, NAG offers an interface to all of LAPACK along with an interface to Python. So, I go through the installation and do

import numpy as np from ctypes import * from nag4py.util import Nag_RowMajor,Nag_NoBalancing,Nag_NotLeftVecs,Nag_RightVecs,Nag_RCondEigVecs,Integer,NagError,INIT_FAIL from nag4py.f08 import nag_dgeevx eps = np.spacing(1) np.set_printoptions(precision=4,suppress=True) def unbalanced_eig(A): """ Compute the eigenvalues and right eigenvectors of a square array using DGEEVX via the NAG library. Requires the NAG C library and NAG's Python wrappers http://www.nag.co.uk/python.asp The balancing step that's performed in DGEEV is not performed here. As such, this function is the same as the MATLAB command eig(A,'nobalance') Parameters ---------- A : (M, M) Numpy array A square array of real elements. On exit: A is overwritten and contains the real Schur form of the balanced version of the input matrix . Returns ------- w : (M,) ndarray The eigenvalues v : (M, M) ndarray The eigenvectors Author: Mike Croucher (www.walkingrandomly.com) Testing has been mimimal """ order = Nag_RowMajor balanc = Nag_NoBalancing jobvl = Nag_NotLeftVecs jobvr = Nag_RightVecs sense = Nag_RCondEigVecs n = A.shape[0] pda = n pdvl = 1 wr = np.zeros(n) wi = np.zeros(n) vl=np.zeros(1); pdvr = n vr = np.zeros(pdvr*n) ilo=c_long(0) ihi=c_long(0) scale = np.zeros(n) abnrm = c_double(0) rconde = np.zeros(n) rcondv = np.zeros(n) fail = NagError() INIT_FAIL(fail) nag_dgeevx(order,balanc,jobvl,jobvr,sense, n, A.ctypes.data_as(POINTER(c_double)), pda, wr.ctypes.data_as(POINTER(c_double)), wi.ctypes.data_as(POINTER(c_double)),vl.ctypes.data_as(POINTER(c_double)),pdvl, vr.ctypes.data_as(POINTER(c_double)),pdvr,ilo,ihi, scale.ctypes.data_as(POINTER(c_double)), abnrm, rconde.ctypes.data_as(POINTER(c_double)),rcondv.ctypes.data_as(POINTER(c_double)),fail) if all(wi == 0.0): w = wr v = vr.reshape(n,n) else: w = wr+1j*wi v = array(vr, w.dtype).reshape(n,n) return(w,v)

Define a test matrix:

A = np.array([[3.0,-2.0,-0.9,2*eps], [-2.0,4.0,1.0,-eps], [-eps/4,eps/2,-1.0,0], [-0.5,-0.5,0.1,1.0]])

Do the calculation

(w,v) = unbalanced_eig(A)

which gives

(array([ 5.5616, 1.4384, 1. , -1. ]), array([[ 0.6153, -0.4176, -0. , -0.1528], [-0.7881, -0.3261, 0. , 0.1345], [-0. , -0. , -0. , -0.9781], [ 0.0189, 0.8481, -1. , 0.0443]]))

This is exactly what you get by running the MATLAB command **eig(A,’nobalance’).**

Note that unbalanced_eig(A) **changes the input matri**x A to

array([[ 5.5616, -0.0662, 0.0571, 1.3399], [ 0. , 1.4384, 0.7017, -0.1561], [ 0. , 0. , 1. , -0.0132], [ 0. , 0. , 0. , -1. ]])

According to the NAG documentation, this is the real Schur form of the balanced version of the input matrix. I can’t see how to ask NAG to not do this. I guess that if it’s not what you want unbalanced_eig() to do, you’ll need to pass a copy of the input matrix to NAG.

**The IPython notebook**

The code for this article is available as an IPython Notebook

**The future**

This blog post was written using Numpy version 1.7.1. There is an enhancement request for the functionality discussed in this article open in Numpy’s git repo and so I expect this article to become redundant pretty soon.

Back in 2008, I featured various mathematical Christmas cards generated from mathematical software. Here are the original links (including full source code for each ‘card’) along with a few of the images. Contact me if you’d like to add something new.

The R code used for this example comes from Barry Rowlingson, so huge thanks to him.

A question I get asked a lot is ‘How can I do nonlinear least squares curve fitting in X?’ where X might be MATLAB, Mathematica or a whole host of alternatives. Since this is such a common query, I thought I’d write up how to do it for a very simple problem in several systems that I’m interested in

This is the R version. For other versions,see the list below

- Simple nonlinear least squares curve fitting in Julia
- Simple nonlinear least squares curve fitting in Maple
- Simple nonlinear least squares curve fitting in Mathematica
- Simple nonlinear least squares curve fitting in MATLAB
- Simple nonlinear least squares curve fitting in Python

**The problem**

xdata = -2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9 ydata = 0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001

and you’d like to fit the function

using nonlinear least squares. You’re starting guesses for the parameters are p1=1 and P2=0.2

For now, we are primarily interested in the following results:

- The fit parameters
- Sum of squared residuals
- Parameter confidence intervals

Future updates of these posts will show how to get other results. Let me know what you are most interested in.

**Solution in R**

# construct the data vectors using c() xdata = c(-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9) ydata = c(0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001) # look at it plot(xdata,ydata) # some starting values p1 = 1 p2 = 0.2 # do the fit fit = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), start=list(p1=p1,p2=p2)) # summarise summary(fit)

This gives

Formula: ydata ~ p1 * cos(p2 * xdata) + p2 * sin(p1 * xdata) Parameters: Estimate Std. Error t value Pr(>|t|) p1 1.881851 0.027430 68.61 2.27e-12 *** p2 0.700230 0.009153 76.51 9.50e-13 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.08202 on 8 degrees of freedom Number of iterations to convergence: 7 Achieved convergence tolerance: 2.189e-06

Draw the fit on the plot by getting the prediction from the fit at 200 x-coordinates across the range of xdata

new = data.frame(xdata = seq(min(xdata),max(xdata),len=200)) lines(new$xdata,predict(fit,newdata=new))

Getting the sum of squared residuals is easy enough:

sum(resid(fit)^2)

Which gives

[1] 0.0538127

Finally, lets get the parameter confidence intervals.

confint(fit)

Which gives

Waiting for profiling to be done... 2.5% 97.5% p1 1.8206081 1.9442365 p2 0.6794193 0.7209843

A question I get asked a lot is ‘How can I do nonlinear least squares curve fitting in X?’ where X might be MATLAB, Mathematica or a whole host of alternatives. Since this is such a common query, I thought I’d write up how to do it for a very simple problem in several systems that I’m interested in

This is the MATLAB version. For other versions,see the list below

- Simple nonlinear least squares curve fitting in Julia
- Simple nonlinear least squares curve fitting in Maple
- Simple nonlinear least squares curve fitting in Mathematica
- Simple nonlinear least squares curve fitting in Python
- Simple nonlinear least squares curve fitting in R

**The problem**

xdata = -2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9 ydata = 0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001

and you’d like to fit the function

using nonlinear least squares. You’re starting guesses for the parameters are p1=1 and P2=0.2

For now, we are primarily interested in the following results:

- The fit parameters
- Sum of squared residuals

Future updates of these posts will show how to get other results such as confidence intervals. Let me know what you are most interested in.

How you proceed depends on which toolboxes you have. Contrary to popular belief, you don’t need the Curve Fitting toolbox to do curve fitting…particularly when the fit in question is as basic as this. Out of the 90+ toolboxes sold by The Mathworks, I’ve only been able to look through the subset I have access to so I may have missed some alternative solutions.

**Pure MATLAB solution (No toolboxes)**

In order to perform nonlinear least squares curve fitting, you need to minimise the squares of the residuals. This means you need a minimisation routine. Basic MATLAB comes with the fminsearch function which is based on the Nelder-Mead simplex method. For this particular problem, it works OK but will not be suitable for more complex fitting problems. Here’s the code

format compact format long xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]; ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]; %Function to calculate the sum of residuals for a given p1 and p2 fun = @(p) sum((ydata - (p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata))).^2); %starting guess pguess = [1,0.2]; %optimise [p,fminres] = fminsearch(fun,pguess)

This gives the following results

p = 1.881831115804464 0.700242006994123 fminres = 0.053812720914713

All we get here are the parameters and the sum of squares of the residuals. If you want more information such as 95% confidence intervals, you’ll have a lot more hand-coding to do. Although fminsearch works fine in this instance, it soon runs out of steam for more complex problems.

**MATLAB with optimisation toolbox**

With respect to this problem, the optimisation toolbox gives you two main advantages over pure MATLAB. The first is that better optimisation routines are available so more complex problems (such as those with constraints) can be solved and in less time. The second is the provision of the lsqcurvefit function which is specifically designed to solve curve fitting problems. Here’s the code

format long format compact xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]; ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]; %Note that we don't have to explicitly calculate residuals fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata); %starting guess pguess = [1,0.2]; [p,fminres] = lsqcurvefit(fun,pguess,xdata,ydata)

This gives the results

p = 1.881848414551983 0.700229137656802 fminres = 0.053812696487326

**MATLAB with statistics toolbox**

There are two interfaces I know of in the stats toolbox and both of them give a lot of information about the fit. The problem set up is the same in both cases

%set up for both fit commands in the stats toolbox xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]; ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]; fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata); pguess = [1,0.2];

Method 1 makes use of **NonLinearModel.fit**

mdl = NonLinearModel.fit(xdata,ydata,fun,pguess)

The returned object is a **NonLinearModel** object

class(mdl) ans = NonLinearModel

which contains all sorts of useful stuff

mdl = Nonlinear regression model: y ~ p1*cos(p2*xdata) + p2*sin(p1*xdata) Estimated Coefficients: Estimate SE p1 1.8818508110535 0.027430139389359 p2 0.700229815076442 0.00915260662357553 tStat pValue p1 68.6052223191956 2.26832562501304e-12 p2 76.5060538352836 9.49546284187105e-13 Number of observations: 10, Error degrees of freedom: 8 Root Mean Squared Error: 0.082 R-Squared: 0.996, Adjusted R-Squared 0.995 F-statistic vs. zero model: 1.43e+03, p-value = 6.04e-11

If you don’t need such heavyweight infrastructure, you can make use of the statistic toolbox’s **nlinfit** function

[p,R,J,CovB,MSE,ErrorModelInfo] = nlinfit(xdata,ydata,fun,pguess);

Along with our parameters (p) this also provides the residuals (R), Jacobian (J), Estimated variance-covariance matrix (CovB), Mean Squared Error (MSE) and a structure containing information about the error model fit (ErrorModelInfo).

Both **nlinfit** and **NonLinearModel.fit** use the same minimisation algorithm which is based on Levenberg-Marquardt

**MATLAB with Symbolic Toolbox**

MATLAB’s symbolic toolbox provides a completely separate computer algebra system called Mupad which can handle nonlinear least squares fitting via its stats::reg function. Here’s how to solve our problem in this environment.

First, you need to start up the mupad application. You can do this by entering

mupad

into MATLAB. Once you are in mupad, the code looks like this

xdata := [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]: ydata := [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]: stats::reg(xdata,ydata,p1*cos(p2*x)+p2*sin(p1*x),[x],[p1,p2],StartingValues=[1,0.2])

The result returned is

[[1.88185085, 0.7002298172], 0.05381269642]

These are our fitted parameters,p1 and p2, along with the sum of squared residuals. The documentation tells us that the optimisation algorithm is Levenberg-Marquardt– this is rather better than the simplex algorithm used by basic MATLAB’s fminsearch.

**MATLAB with the NAG Toolbox**

The NAG Toolbox for MATLAB is a commercial product offered by the UK based Numerical Algorithms Group. Their main products are their C and Fortran libraries but they also have a comprehensive MATLAB toolbox that contains something like 1500+ functions. My University has a site license for pretty much everything they put out and we make great use of it all. One of the benefits of the NAG toolbox over those offered by The Mathworks is speed. NAG is often (but not always) faster since its based on highly optimized, compiled Fortran. One of the problems with the NAG toolbox is that it is difficult to use compared to Mathworks toolboxes.

In an earlier blog post, I showed how to create wrappers for the NAG toolbox to create an easy to use interface for basic nonlinear curve fitting. Here’s how to solve our problem using those wrappers.

format long format compact xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]; ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]; %Note that we don't have to explicitly calculate residuals fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata); start = [1,0.2]; [p,fminres]=nag_lsqcurvefit(fun,start,xdata,ydata)

which gives

Warning: nag_opt_lsq_uncon_mod_func_easy (e04fy) returned a warning indicator (5) p = 1.881850904268710 0.700229557886739 fminres = 0.053812696425390

For convenience, here’s the two files you’ll need to run the above (you’ll also need the NAG Toolbox for MATLAB of course)

**MATLAB with curve fitting toolbox**

One would expect the curve fitting toolbox to be able to fit such a simple curve and one would be right :)

xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]; ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]; opt = fitoptions('Method','NonlinearLeastSquares',... 'Startpoint',[1,0.2]); f = fittype('p1*cos(p2*x)+p2*sin(p1*x)','options',opt); fitobject = fit(xdata',ydata',f)

Note that, unlike every other Mathworks method shown here, xdata and ydata have to be column vectors. The result looks like this

fitobject = General model: fitobject(x) = p1*cos(p2*x)+p2*sin(p1*x) Coefficients (with 95% confidence bounds): p1 = 1.882 (1.819, 1.945) p2 = 0.7002 (0.6791, 0.7213)

fitobject is of type cfit:

class(fitobject) ans = cfit

In this case it contains two fields, p1 and p2, which are the parameters we are looking for

>> fieldnames(fitobject) ans = 'p1' 'p2' >> fitobject.p1 ans = 1.881848414551983 >> fitobject.p2 ans = 0.700229137656802

For maximum information, call the fit command like this:

[fitobject,gof,output] = fit(xdata',ydata',f) fitobject = General model: fitobject(x) = p1*cos(p2*x)+p2*sin(p1*x) Coefficients (with 95% confidence bounds): p1 = 1.882 (1.819, 1.945) p2 = 0.7002 (0.6791, 0.7213) gof = sse: 0.053812696487326 rsquare: 0.995722238905101 dfe: 8 adjrsquare: 0.995187518768239 rmse: 0.082015773244637 output = numobs: 10 numparam: 2 residuals: [10x1 double] Jacobian: [10x2 double] exitflag: 3 firstorderopt: 3.582047395989108e-05 iterations: 6 funcCount: 21 cgiterations: 0 algorithm: 'trust-region-reflective' message: [1x86 char]