## Simple nonlinear least squares curve fitting in MATLAB

December 6th, 2013

A question I get asked a lot is ‘How can I do nonlinear least squares curve fitting in X?’ where X might be MATLAB, Mathematica or a whole host of alternatives.  Since this is such a common query, I thought I’d write up how to do it for a very simple problem in several systems that I’m interested in

This is the MATLAB version. For other versions,see the list below

The problem

xdata = -2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9
ydata = 0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001

and you’d like to fit the function

using nonlinear least squares.  You’re starting guesses for the parameters are p1=1 and P2=0.2

For now, we are primarily interested in the following results:

• The fit parameters
• Sum of squared residuals

Future updates of these posts will show how to get other results such as confidence intervals. Let me know what you are most interested in.

How you proceed depends on which toolboxes you have. Contrary to popular belief, you don’t need the Curve Fitting toolbox to do curve fitting…particularly when the fit in question is as basic as this. Out of the 90+ toolboxes sold by The Mathworks, I’ve only been able to look through the subset I have access to so I may have missed some alternative solutions.

Pure MATLAB solution (No toolboxes)

In order to perform nonlinear least squares curve fitting, you need to minimise the squares of the residuals. This means you need a minimisation routine.  Basic MATLAB comes with the fminsearch function which is based on the Nelder-Mead simplex method.  For this particular problem, it works OK but will not be suitable for more complex fitting problems.  Here’s the code

format compact
format long
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

%Function to calculate the sum of residuals for a given p1 and p2
fun = @(p) sum((ydata - (p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata))).^2);

%starting guess
pguess = [1,0.2];

%optimise
[p,fminres] = fminsearch(fun,pguess)

This gives the following results

p =
1.881831115804464 0.700242006994123
fminres =
0.053812720914713

All we get here are the parameters and the sum of squares of the residuals.  If you want more information such as 95% confidence intervals, you’ll have a lot more hand-coding to do. Although fminsearch works fine in this instance, it soon runs out of steam for more complex problems.

MATLAB with optimisation toolbox

With respect to this problem, the optimisation toolbox gives you two main advantages over pure MATLAB.  The first is that better optimisation routines are available so more complex problems (such as those with constraints) can be solved and in less time.  The second is the provision of the lsqcurvefit function which is specifically designed to solve curve fitting problems.  Here’s the code

format long
format compact
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

%Note that we don't have to explicitly calculate residuals
fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata);

%starting guess
pguess = [1,0.2];

[p,fminres] = lsqcurvefit(fun,pguess,xdata,ydata)

This gives the results

p =
1.881848414551983   0.700229137656802
fminres =
0.053812696487326

MATLAB with statistics toolbox

There are two interfaces I know of in the stats toolbox and both of them give a lot of information about the fit. The problem set up is the same in both cases

%set up for both fit commands in the stats toolbox
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata);

pguess = [1,0.2];

Method 1 makes use of NonLinearModel.fit

mdl = NonLinearModel.fit(xdata,ydata,fun,pguess)

The returned object is a NonLinearModel object

class(mdl)
ans =
NonLinearModel

which contains all sorts of useful stuff

mdl =
Nonlinear regression model:
y ~ p1*cos(p2*xdata) + p2*sin(p1*xdata)

Estimated Coefficients:
Estimate             SE
p1      1.8818508110535      0.027430139389359
p2    0.700229815076442    0.00915260662357553

tStat               pValue
p1    68.6052223191956    2.26832562501304e-12
p2    76.5060538352836    9.49546284187105e-13

Number of observations: 10, Error degrees of freedom: 8
Root Mean Squared Error: 0.082
R-Squared: 0.996,  Adjusted R-Squared 0.995
F-statistic vs. zero model: 1.43e+03, p-value = 6.04e-11

If you don’t need such heavyweight infrastructure, you can make use of the statistic toolbox’s nlinfit function

[p,R,J,CovB,MSE,ErrorModelInfo] = nlinfit(xdata,ydata,fun,pguess);

Along with our parameters (p) this also provides the residuals (R), Jacobian (J), Estimated variance-covariance matrix (CovB), Mean Squared Error (MSE) and a structure containing information about the error model fit (ErrorModelInfo).

Both nlinfit and NonLinearModel.fit use the same minimisation algorithm which is based on Levenberg-Marquardt

MATLAB with Symbolic Toolbox

MATLAB’s symbolic toolbox provides a completely separate computer algebra system called Mupad which can handle nonlinear least squares fitting via its stats::reg function.  Here’s how to solve our problem in this environment.

First, you need to start up the mupad application.  You can do this by entering

mupad

into MATLAB. Once you are in mupad, the code looks like this

xdata := [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]:
ydata := [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]:
stats::reg(xdata,ydata,p1*cos(p2*x)+p2*sin(p1*x),[x],[p1,p2],StartingValues=[1,0.2])

The result returned is

[[1.88185085, 0.7002298172], 0.05381269642]

These are our fitted parameters,p1 and p2, along with the sum of squared residuals. The documentation tells us that the optimisation algorithm is Levenberg-Marquardt– this is rather better than the simplex algorithm used by basic MATLAB’s fminsearch.

MATLAB with the NAG Toolbox

The NAG Toolbox for MATLAB is a commercial product offered by the UK based Numerical Algorithms Group.  Their main products are their C and Fortran libraries but they also have a comprehensive MATLAB toolbox that contains something like 1500+ functions.  My University has a site license for pretty much everything they put out and we make great use of it all.  One of the benefits of the NAG toolbox over those offered by The Mathworks is speed.  NAG is often (but not always) faster since its based on highly optimized, compiled Fortran.  One of the problems with the NAG toolbox is that it is difficult to use compared to Mathworks toolboxes.

In an earlier blog post, I showed how to create wrappers for the NAG toolbox to create an easy to use interface for basic nonlinear curve fitting.  Here’s how to solve our problem using those wrappers.

format long
format compact
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

%Note that we don't have to explicitly calculate residuals
fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata);
start = [1,0.2];
[p,fminres]=nag_lsqcurvefit(fun,start,xdata,ydata)

which gives

Warning: nag_opt_lsq_uncon_mod_func_easy (e04fy) returned a warning indicator (5)
p =
1.881850904268710   0.700229557886739
fminres =
0.053812696425390

For convenience, here’s the two files you’ll need to run the above (you’ll also need the NAG Toolbox for MATLAB of course)

MATLAB with curve fitting toolbox

One would expect the curve fitting toolbox to be able to fit such a simple curve and one would be right :)

xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

opt = fitoptions('Method','NonlinearLeastSquares',...
'Startpoint',[1,0.2]);
f = fittype('p1*cos(p2*x)+p2*sin(p1*x)','options',opt);

fitobject = fit(xdata',ydata',f)

Note that, unlike every other Mathworks method shown here, xdata and ydata have to be column vectors. The result looks like this

fitobject =

General model:
fitobject(x) = p1*cos(p2*x)+p2*sin(p1*x)
Coefficients (with 95% confidence bounds):
p1 =       1.882  (1.819, 1.945)
p2 =      0.7002  (0.6791, 0.7213)

fitobject is of type cfit:

 class(fitobject)

ans =
cfit

In this case it contains two fields, p1 and p2, which are the parameters we are looking for

>> fieldnames(fitobject)
ans =
'p1'
'p2'
>> fitobject.p1
ans =
1.881848414551983
>> fitobject.p2
ans =
0.700229137656802

For maximum information, call the fit command like this:

[fitobject,gof,output] = fit(xdata',ydata',f)

fitobject =

General model:
fitobject(x) = p1*cos(p2*x)+p2*sin(p1*x)
Coefficients (with 95% confidence bounds):
p1 =       1.882  (1.819, 1.945)
p2 =      0.7002  (0.6791, 0.7213)

gof =

sse: 0.053812696487326
rsquare: 0.995722238905101
dfe: 8
rmse: 0.082015773244637

output =

numobs: 10
numparam: 2
residuals: [10x1 double]
Jacobian: [10x2 double]
exitflag: 3
firstorderopt: 3.582047395989108e-05
iterations: 6
funcCount: 21
cgiterations: 0
algorithm: 'trust-region-reflective'
message: [1x86 char]

## Python division by zero is not IEEE compliant

October 10th, 2013

From the wikipedia page on Division by Zero: “The IEEE 754 standard specifies that every floating point arithmetic operation, including division by zero, has a well-defined result”.

MATLAB supports this fully:

>> 1/0
ans =
Inf
>> 1/(-0)
ans =
-Inf
>> 0/0
ans =
NaN

Julia is almost there, but doesn’t handled the signed 0 correctly (This is using Version 0.2.0-prerelease+3768 on Windows)

julia> 1/0
Inf

julia> 1/(-0)
Inf

julia> 0/0
NaN

Python throws an exception. (Python 2.7.5 using IPython shell)

In [4]: 1.0/0.0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
in ()
----> 1 1.0/0.0

ZeroDivisionError: float division by zero

In [5]: 1.0/(-0.0)
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
in ()
----> 1 1.0/(-0.0)

ZeroDivisionError: float division by zero

In [6]: 0.0/0.0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
in ()
----> 1 0.0/0.0

ZeroDivisionError: float division by zero

Update:
Julia does do things correctly, provided I make it clear that I am working with floating point numbers:

julia> 1.0/0.0
Inf

julia> 1.0/(-0.0)
-Inf

julia> 0.0/0.0
NaN

## Lamenting the lack of multiple assignment in MATLAB

September 25th, 2013

I support scientific applications at The University of Manchester (see my LinkedIn profile if you’re interested in the details) and part of my job involves working on code written by researchers in a variety of languages.  When I say ‘variety’ I really mean it – MATLAB, Mathematica, Python, C, Fortran, Julia, Maple, Visual Basic and PowerShell are some languages I’ve worked with this month for instance.

Having to juggle the semantics of so many languages in my head sometimes leads to momentary confusion when working on someone’s program.  For example, I’ve been doing a lot of Python work recently but this morning I was hacking away on someone’s MATLAB code.  Buried deep within the program, it would have been very sensible to be able to do the equivalent of this:

a=rand(3,3)

a =
0.8147    0.9134    0.2785
0.9058    0.6324    0.5469
0.1270    0.0975    0.9575

>> [x,y,z]=a(:,1)

Indexing cannot yield multiple results.

That is, I want to be able to take the first column of the matrix a and broadcast it out to the variables x,y and z. The code I’m working on uses MUCH bigger matrices and this kind of assignment is occasionally useful since the variable names x,y,z have slightly more meaning than a(1,3), a(2,3), a(3,3).

The only concise way I’ve been able to do something like this using native MATLAB commands is to first convert to a cell. In MATLAB 2013a for instance:

>> temp=num2cell(a(:,1));
>> [x y z] = temp{:}

x =
0.8147

y =
0.9058

z =
0.1270

This works but I think it looks ugly and introduces conversion overheads. The problem I had for a short time is that I subconsciously expected multiple assignment to ‘Just Work’ in MATLAB since the concept makes sense in several other languages I use regularly.

from pylab import rand
a=rand(3,3)
[a,b,c]=a[:,0]
a = RandomReal[1, {3, 3}]
{x,y,z}=a[[All,1]]
a=rand(3,3);
(x,y,z)=a[:,1]

I’ll admit that I don’t often need this construct in MATLAB but it would definitely be occasionally useful. I wonder what other opinions there are out there? Do you think multiple assignment is useful (in any language)?

## Numpy, MATLAB and singular matrices

September 16th, 2013

Last week I gave a live demo of the IPython notebook to a group of numerical analysts and one of the computations we attempted to do was to solve the following linear system using Numpy’s solve command.

Now, the matrix shown above is singular and so we expect that we might have problems. Before looking at how Numpy deals with this computation, lets take a look at what happens if you ask MATLAB to do it

>> A=[1 2 3;4 5 6;7 8 9];
>> b=[15;15;15];
>> x=A\b
Warning: Matrix is close to singular
or badly scaled. Results may be
inaccurate. RCOND =  1.541976e-18.
x =
-39.0000
63.0000
-24.0000

MATLAB gives us a warning that the input matrix is close to being singular (note that it didn’t actually recognize that it is singular) along with an estimate of the reciprocal of the condition number. It tells us that the results may be inaccurate and we’d do well to check. So, lets check:

>> A*x

ans =
15.0000
15.0000
15.0000

>> norm(A*x-b)

ans =
2.8422e-14

We seem to have dodged the bullet since, despite the singular nature of our matrix, MATLAB has able to find a valid solution. MATLAB was right to have warned us though…in other cases we might not have been so lucky.

Let’s see how Numpy deals with this using the IPython notebook:

In [1]:
import numpy
from numpy import array
from numpy.linalg import solve
A=array([[1,2,3],[4,5,6],[7,8,9]])
b=array([15,15,15])
solve(A,b)

Out[1]:

array([-39.,  63., -24.])

It gave the same result as MATLAB [See note 1], presumably because it’s using the exact same LAPACK routine, but there was no warning of the singular nature of the matrix.  During my demo, it was generally felt by everyone in the room that a warning should have been given, particularly when working in an interactive setting.

If you look at the documentation for Numpy’s solve command you’ll see that it is supposed to throw an exception when the matrix is singular but it clearly didn’t do so here. The exception is sometimes thrown though:

In [4]:

C=array([[1,1,1],[1,1,1],[1,1,1]])
x=solve(C,b)

---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
in ()
1 C=array([[1,1,1],[1,1,1],[1,1,1]])
----> 2 x=solve(C,b)

C:\Python32\lib\site-packages\numpy\linalg\linalg.py in solve(a, b)
326     results = lapack_routine(n_eq, n_rhs, a, n_eq, pivots, b, n_eq, 0)
327     if results['info'] > 0:
--> 328         raise LinAlgError('Singular matrix')
329     if one_eq:
330         return wrap(b.ravel().astype(result_t))

LinAlgError: Singular matrix

It seems that Numpy is somehow checking for exact singularity but this will rarely be detected due to rounding errors. Those I’ve spoken to consider that MATLAB’s approach of estimating the condition number and warning when that is high would be better behavior since it alerts the user to the fact that the matrix is badly conditioned.

Thanks to Nick Higham and David Silvester for useful discussions regarding this post.

Notes
[1] – The results really are identical which you can see by rerunning the calculation after evaluating format long in MATLAB and numpy.set_printoptions(precision=15) in Python

## Generate a vector of powers more quickly using cumprod in MATLAB

August 5th, 2013

While working on someone’s MATLAB code today there came a point when it was necessary to generate a vector of powers.  For example, [a a^2 a^3....a^10000] where a=0.999

a=0.9999;
y=a.^(1:10000);

This isn’t the only way one could form such a vector and I was curious whether or not an alternative method might be faster. On my current machine we have:

>> tic;y=a.^(1:10000);toc
Elapsed time is 0.001526 seconds.
>> tic;y=a.^(1:10000);toc
Elapsed time is 0.001529 seconds.
>> tic;y=a.^(1:10000);toc
Elapsed time is 0.001716 seconds.

Let’s look at the last result in the vector y

>> y(end)
ans =
0.367861046432970

So, 0.0015-ish seconds to beat.

>> tic;y1=cumprod(ones(1,10000)*a);toc
Elapsed time is 0.000075 seconds.
>> tic;y1=cumprod(ones(1,10000)*a);toc
Elapsed time is 0.000075 seconds.
>> tic;y1=cumprod(ones(1,10000)*a);toc
Elapsed time is 0.000075 seconds.

soundly beaten! More than a factor of 20 in fact. Let’s check out that last result

>> y1(end)
ans =
0.367861046432969

Only a difference in the 15th decimal place–I’m happy with that. What I’m wondering now, however, is will my faster method ever cause me grief?

This is only an academic exercise since this is not exactly a hot spot in the code!

## Software for the teaching of (and tinkering with) Control Theory and Control Systems

May 3rd, 2013

I recently picked up a few control theory books from the University library to support a project I am involved with right now and was interested in the seemingly total dominance of MATLAB in this subject area.  Since I’m not an expert in control systems, I’m not sure if this is because MATLAB is genuinely the best tool for the job or if it’s simply because it’s been around for a very long time and so has become entrenched.  Comments from anyone who works in relevant fields would be most welcome.

On its own, MATLAB is insufficient to teach introductory control systems courses — you also need the control systems toolbox as a bare minimum but most books and courses also seem to require Simulink and the symbolic math toolbox.  All of these are included in the student edition of MATLAB which is very reasonably priced.

If you are not a registered student, however, and don’t work for someone who can provide you with MATLAB it’s going to be very expensive!  As far as I can tell, your only option would be to purchase commercial licenses which are very expensive (as in thousands of dollars/pounds for MATLAB and a few toolboxes).

What else is out there?

I have a strong interest in mathematical software and so I know that there are several products that have support for control theory. Here are some that I know of and have access to myself

• Mathematica – Its symbolic math support far exceeds that of MATLAB and it is on an equal footing numerically but its control systems support is much more recent and I don’t know of a textbook that utilizes it.  One benefit of Mathematica is that it doesn’t separate functionality out into toolboxes – everything is just built in.  Another benefit to tinkerers is the home edition which gives you the full product at a much lower price than commercial licenses.
• Maple – This also has very strong symbolic and numeric math support.  It also comes with some Control Systems support built in.  Like Mathematica, it has a home edition for non-commercial tinkering and learning.
• Labview - A graphical programming language that I’m only just starting to get used to.  It has lots of users and advocates in my employers electrical and mechanical engineering departments.  There is no support for symbolic computing as far as I know.
• Python – Python is a superb general purpose scripting language that’s also completely free.  Numerics are taken care of by Numpy, symbolics by Sympy and there is a control theory module, the development of which is coordinated by Richard Murray of Caltech (The same Richard Murray that co-wrote the book Feedback Systems: An Introduction for Scientists and Engineers).
• Octave – Octave is a free implementation of the MATLAB .m language.  It also has a free control package.
• Scilab – Scilab is a free numerical environment that also has a free control package.

I haven’t mentioned Simulink alternatives in this post since I’ve discussed them before.

Questions

Some questions that arise are

• Are there any other alternatives to those listed above?
• Do these alternatives have sufficient functionality to support undergraduate courses in control systems and control theory?
• What would be the best language to use if you were teaching control systems as a Massively Open Online Course (MOOC)?
• Does it matter to employers which computational language you learned your control systems in as an undergraduate?

I find that the final point is very divisive among people I discuss it with.  On the one hand you have those who say ‘It’s the concepts that matter, the language you choose to implement them in is much less important’ and on the other hand you have those who say ‘It’s gotta be MATLAB, my father used MATLAB and his grandfather before him. Industry uses MATLAB, I only know MATLAB, we must teach MATLAB.’

## Matrix multiplication speed-up trick on MATLAB

April 23rd, 2013

I was recently working on some MATLAB code with Manchester University’s David McCormick.  Buried deep within this code was a function that was called many,many times…taking up a significant amount of overall run time.  We managed to speed up an important part of this function by almost a factor of two (on his machine) simply by inserting two brackets….a new personal record in overall application performance improvement per number of keystrokes.

The code in question is hugely complex, but the trick we used is really very simple.  Consider the following MATLAB code

>> a=rand(4000);
>> c=12.3;
>> tic;res1=c*a*a';toc
Elapsed time is 1.472930 seconds.

With the insertion of just two brackets, this runs quite a bit faster on my Ivy Bridge quad-core desktop.

>> tic;res2=c*(a*a');toc
Elapsed time is 0.907086 seconds.

So, what’s going on? Well, we think that in the first version of the code, MATLAB first calculates c*a to form a temporary matrix (let’s call it temp here) and then goes on to find temp*a’.  However, in the second version, we think that MATLAB calculates a*a’ first and in doing so it takes advantage of the fact that the result of multiplying a matrix by its transpose will be symmetric which is where we get the speedup.

Another demonstration of this phenomena can be seen as follows

>> a=rand(4000);
>> b=rand(4000);
>> tic;a*a';toc
Elapsed time is 0.887524 seconds.
>> tic;a*b;toc
Elapsed time is 1.473208 seconds.
>> tic;b*b';toc
Elapsed time is 0.966085 seconds.

Note that the symmetric matrix-matrix multiplications are faster than the general, non-symmetric one.

## Faster GPU Random Number Generators in MATLAB 2012b

March 10th, 2013

Ever since I took a look at GPU accelerating simple Monte Carlo Simulations using MATLAB, I’ve been disappointed with the performance of its GPU random number generator. In MATLAB 2012a, for example, it’s not much faster than the CPU implementation on my GPU hardware.  Consider the following code

function gpuRandTest2012a(n)

mydev=gpuDevice();
disp('CPU - Mersenne Twister');
tic
CPU = rand(n);
toc

sg = parallel.gpu.RandStream('mrg32k3a','Seed',1);
parallel.gpu.RandStream.setGlobalStream(sg);
disp('GPU - mrg32k3a');
tic
Rg = parallel.gpu.GPUArray.rand(n);
wait(mydev);
toc

Running this on MATLAB 2012a on my laptop gives me the following typical times (If you try this out yourself, the first run will always be slower for various reasons I’ll not go into here)

>> gpuRandTest2012a(10000)
CPU - Mersenne Twister
Elapsed time is 1.330505 seconds.
GPU - mrg32k3a
Elapsed time is 1.006842 seconds.

Running the same code on MATLAB 2012b, however, gives a very pleasant surprise with typical run times looking like this

CPU - Mersenne Twister
Elapsed time is 1.590764 seconds.
GPU - mrg32k3a
Elapsed time is 0.185686 seconds.

So, generation of random numbers using the GPU is now over 7 times faster than CPU generation on my laptop hardware–a significant improvment on the previous implementation.

New generators in 2012b

The MATLAB developers went a little further in 2012b though.  Not only have they significantly improved performance of the mrg32k3a combined multiple recursive generator, they have also implemented two new GPU random number generators based on the Random123 library.  Here are the timings for the generation of 100 million random numbers in MATLAB 2012b

CPU - Mersenne Twister
Elapsed time is 1.370252 seconds.
GPU - mrg32k3a
Elapsed time is 0.186152 seconds.
GPU - Threefry4x64-20
Elapsed time is 0.145144 seconds.
GPU - Philox4x32-10
Elapsed time is 0.129030 seconds.

Bear in mind that I am running this on the relatively weak GPU of my laptop!  If anyone runs it on something stronger, I’d love to hear of your results.

• Laptop model: Dell XPS L702X
• CPU: Intel Core i7-2630QM @2Ghz software overclockable to 2.9Ghz. 4 physical cores but total 8 virtual cores due to Hyperthreading.
• GPU: GeForce GT 555M with 144 CUDA Cores.  Graphics clock: 590Mhz.  Processor Clock:1180 Mhz. 3072 Mb DDR3 Memeory
• RAM: 8 Gb
• OS: Windows 7 Home Premium 64 bit.
• MATLAB: 2012a/2012b

## Fun with Mathematica’s Reduce and floating point arithmetic

February 8th, 2013

I was at a seminar yesterday where we were playing with Mathematica and wanted to solve the following equation

1.0609^t == 1.5

You punch this into Mathematica as follows:

Solve[1.0609^t == 1.5]

Mathematica returns the following

During evaluation of In[1]:= Solve::ifun: Inverse functions are being used by Solve,
so some solutions may not be found; use Reduce for complete solution information. >>

Out[1]= {{t -> 6.85862}}

I have got the solution I expect but Mathematica suggests that I’ll get more information if I use Reduce. So, lets do that.

In[2]:= Reduce[1.0609^t == 1.5, t]
During evaluation of In[2]:= Reduce::ratnz: Reduce was unable to solve the system with inexact
coefficients. The answer was obtained by solving a corresponding exact system and numericizing the result. >>
Out[20]= C[1] \[Element] Integers &&
t == -16.9154 (-0.405465 + (0. + 6.28319 I) C[1])

Looks complex and a little complicated! To understand why complex numbers have appeared in the mix you need to know that Mathematica always considers variables to be complex unless you tell it otherwise. So, it has given you the infinite number of complex values of t that would satisfy the original equation. No problem, let’s just tell Mathematica that we are only interested in real values of p.

In[3]:= Reduce[1.0609^t == 1.5, t, Reals]

During evaluation of In[3]:= Reduce::ratnz: Reduce was unable to solve the system with inexact
coefficients. The answer was obtained by solving a corresponding exact system and numericizing the result. >>

Out[3]= t == 6.85862

Again, I get the solution I expect. However, Mathematica still feels that it is necessary to give me a warning message. Time was ticking on so I posed the question on Mathematica Stack Exchange and we moved on.

http://mathematica.stackexchange.com/questions/19235/why-does-mathematica-struggle-with-solving-this-equation

At the time of writing, the lead answer says that ‘Mathematica is not “struggling” with your equation. The message is simply FYI — to tell you that, for this equation, it prefers to work with exact quantities rather than inexact quantities (reals)’

I’d accept this as an explanation except for one thing; the message say’s that it is UNABLE to solve the equation as I originally posed it and so needs to proceed by solving a corresponding exact system. That implies that it has struggled a great deal, given up and tried something else.

This would come as a surprise to anyone with a calculator who would simply do the following manipulations

1.0609^t == 1.5

Log[1.0609^t] == Log[1.5]

T*Log[1.0609] == Log[1.5]

T= Log[1.5]/Log[1.0609]

Mathematica evalutes this as

T=6.8586187084788275

This appears to solve the equation exactly.  Plug it back into Mathematica (or my calculator) to get

In[4]:= 1.0609^(6.8586187084788275)

Out[4]= 1.5

I had no trouble dealing with inexact quantities, and I didn’t need to ‘solve a corresponding exact system and numericize the result’.  This appears to be a simple problem. So, why does Mathematica bang on about it so much?

Over to MATLAB for a while

Mathematica is complaining that we have asked it to work with inexact quantities.  How could this be? 1.0609, 6.8586187084788275 and 1.5 look pretty exact to me! However, it turns out that as far as the computer is concerned some of these numbers are far from exact.

When you input a number such as 1.0609 into Mathematica, it considers it to be a double precision number and 1.0609 cannot be exactly represented as such.  The closest Mathematica, or indeed any numerical system that uses 64bit IEEE arithmetic, can get is 4777868844677359/4503599627370496 which evaluates to 1.0608999999999999541699935434735380113124847412109375. I wondered if this is why Mathematica was complaining about my problem.

At this point, I switch tools and use MATLAB for a while in order to investigate further.  I do this for no reason other than I know the related functions in MATLAB a little better.  MATLAB’s sym command can give us the rational number that exactly represents what is actually stored in memory (Thanks to Nick Higham for pointing this out to me).

>> sym(1.0609,'f')

ans =
4777868844677359/4503599627370496

We can then evaluate this fraction to whatever precision we like using the vpa command:

>> vpa(sym(1.0609,'f'),100)
ans =
1.0608999999999999541699935434735380113124847412109375

>> vpa(sym(1.5,'f'),100)
ans =
1.5

>> vpa(sym( 6.8586187084788275,'f'),100)
ans =
6.8586187084788274859192824806086719036102294921875

So, 1.5 can be represented exactly in 64bit double precision arithmetic but 1.0609 and 6.8586187075 cannot.  Mathematica is unhappy with this state of affairs and so chooses to solve an exact problem instead.  I guess if I am working in a system that cannot even represent the numbers in my problem (e.g. 1.0609) how can I expect to solve it?

Which Exact Problem?
So, which exact equation might Reduce be choosing to solve?  It could solve the equation that I mean:

(10609/10000)^t == 1500/1000

which does have an exact solution and so Reduce can find it.

(Log[2] - Log[3])/(2*(2*Log[2] + 2*Log[5] - Log[103]))

Evaluating this gives 6.858618708478698:

(Log[2] - Log[3])/(2*(2*Log[2] + 2*Log[5] - Log[103])) // N // FullForm

6.858618708478698

Alternatively, Mathematica could convert the double precision number 1.0609 to the fraction that exactly represents what’s actually in memory and solve that.

(4777868844677359/4503599627370496)^t == 1500/1000

This also has an exact solution:

(Log[2] - Log[3])/(52*Log[2] - Log[643] - Log[2833] - Log[18251] - Log[143711])

which evaluates to 6.858618708478904:

(Log[2] - Log[3])/(52*Log[2] - Log[643] - Log[2833] - Log[18251] - Log[143711]) // N // FullForm

6.858618708478904

Let’s take a look at the exact number Reduce is giving:

Quiet@Reduce[1.0609^t == 1.5, t, Reals] // N // FullForm

Equal[t, 6.858618708478698]

So, it looks like Mathematica is solving the equation I meant to solve and evaluating this solution it at the end.

Summary of solutions
Here I summarize the solutions I’ve found for this problem:

• 6.8586187084788275 – Pen,Paper+Mathematica for final evaluation.
• 6.858618708478698 – Mathematica solving the exact problem I mean and evaluating to double precision at the end.
• 6.858618708478904 – Mathematica solving the exact problem derived from what I really asked it and evaluating at the end.

What I find fun is that my simple minded pen and paper solution seems to satisfy the original equation better than the solutions arrived at by more complicated means.  Using MATLAB’s vpa again I plug in the three solutions above, evaluate to 50 digits and see which one is closest to 1.5:

>> vpa('1.5 - 1.0609^6.8586187084788275',50)
ans =
-0.00000000000000045535780896732093552784442911868195148156712149736
>> vpa('1.5 - 1.0609^6.858618708478698',50)
ans =
0.000000000000011028236861872639054542278886208515813177349441555
>> vpa('1.5 - 1.0609^6.858618708478904',50)
ans =
-0.0000000000000072391029234017787617507985059241243968693347431197

So, ‘my’ solution is better. Of course, this advantage goes away if I evaluate (Log[2] – Log[3])/(2*(2*Log[2] + 2*Log[5] – Log[103])) to 50 decimal places and plug that in.

>> sol=vpa('(log(2) - log(3))/(2*(2*log(2) + 2*log(5) - log(103)))',50)
sol =
6.858618708478822364949699852597847078441119153527
>> vpa(1.5 - 1.0609^sol',50)
ans =
0.0000000000000000000000000000000000000014693679385278593849609206715278070972733319459651


## Explorations in overriding MATLAB functions

January 30th, 2013

In a recent blog post, I demonstrated how to use the MATLAB 2012a Symbolic Toolbox to perform Variable precision QR decomposition in MATLAB.  The result was a function called vpa_qr which did the necessary work.

>> a=vpa([2 1 3;-1 0 7; 0 -1 -1]);
>> [Q R]=vpa_qr(a);

I’ve suppressed the output because it’s so large but it definitely works. When I triumphantly presented this function to the user who requested it he was almost completely happy.  What he really wanted, however, was for this to work:

>> a=vpa([2 1 3;-1 0 7; 0 -1 -1]);
>> [Q R]=qr(a);

In other words he wants to override the qr function such that it accepts variable precision types. MATLAB 2012a does not allow this:

>> a=vpa([2 1 3;-1 0 7; 0 -1 -1]);
>> [Q R]=qr(a)
Undefined function 'qr' for input arguments of type 'sym'.

I put something together that did the job for him but felt that it was unsatisfactory.  So, I sent my code to The MathWorks and asked them if what I had done was sensible and if there were any better options.  A MathWorks engineer called Hugo Carr sent me such a great, detailed reply that I asked if I could write it up as a blog post.  Here is the result:

Approach 1:  Define a new qr function, with a different name (such as vpa_qr).  This is probably the safest and simplest option and was the method I used in the original blog post.

• Pros: The new function will not interfere with your MATLAB namespace
• Cons: MATLAB will only use this function if you explicitly define that you wish to use it in a given function.  You would have to find all prior references to the qr algorithm and make a decision about which to use.

Approach 2: Define a new qr function and use the ‘isa’ function to catch instances of ‘sym’. This is the approach I took in the code I sent to The MathWorks.

function varargout = qr( varargin )

if nargin == 1 && isa( varargin{1}, 'sym' )
[varargout{1:nargout}] = vpa_qr( varargin{:} );
else
[varargout{1:nargout}] = builtin( 'qr', varargin{:} );
end
• Pros: qr will always select the correct code when executed on sym objects
• Cons: This code only works for shadowing built-ins and will produce a warning reminding you of this fact. If you wish to extend this pattern for other class types, you’ll require a switch statement (or nested if-then-else block), which could lead to a complex comparison each time qr is invoked (and subsequent performance hit). Note that switch statements in conjunction with calls to ‘isa’ are usually indicators that an object oriented approach is a better way forward.

Approach 3: The MathWorks do not recommend that you modify your MATLAB install. However for completeness, it is possible to add a new ‘method’ to the sym class by dropping your function into the sym class folder.  For MATLAB 2012a on Windows, this folder is at

C:\Program Files\MATLAB\R2012a\toolbox\symbolic\symbolic\@sym

For the sake of illustration, here is a simplified implementation. Call it qr.m

function result = qr( this )
result = feval(symengine,'linalg::factorQR', this);
end

Pros: Functions saved to a class folder take precedence over built in functionality, which means that MATLAB will always use your qr method for sym objects.

Cons: If you share code which uses this functionality, it won’t run on someone’s computer unless they update their sym class folder with your qr code. Additionally, if a new method is added to a class it may shadow the behaviour of other MATLAB functionality and lead to unexpected behaviour in Symbolic Toolbox.

Approach 4: For more of an object-oriented approach it is possible to sub-class the sym class, and add a new qr method.

classdef mySym < sym

methods
function this = mySym(arg)
this = this@sym(arg);
end

function result = qr( this )
result = feval(symengine,'linalg::factorQR', this);
end
end

end

Pros: Your change can be shipped with your code and it will work on a client’s computer without having to change the sym class.

Cons: When calling superclass methods on your mySym objects (such as sin(mySym1)), the result will be returned as the superclass unless you explicitly redefine the method to return the subclass.

N.B. There is a lot of literature which discusses why inheritance (subclassing) to augment a class’s behaviour is a bad idea. For example, if Symbolic Toolbox developers decide to add their own qr method to the sym API, overriding that function with your own code could break the system. You would need to update your subclass every time the superclass is updated. This violates encapsulation, as the subclass implementation depends on the superclass. You can avoid problems like these by using composition instead of inheritance.

Approach 5: You can create a new sym class by using composition, but it takes a little longer than the other approaches. Essentially, this involves creating a wrapper which provides the functionality of the original class, as well as any new functions you are interested in.

classdef mySymComp

properties
SymProp
end

methods
function this = mySymComp(symInput)
this.SymProp = symInput;
end

function result = qr( this )
result = feval(symengine,'linalg::factorQR', this.SymProp);
end
end

end

Note that in this example we did not add any of the original sym functions to the mySymComp class, however this can be done for as many as you like. For example, I might like to use the sin method from the original sym class, so I can just delegate to the methods of the sym object that I passed in during construction:

classdef mySymComp

properties
SymProp
end

methods
function this = mySymComp(symInput)
this.SymProp = symInput;
end

function result = qr( this )
result = feval(symengine,'linalg::factorQR', this.SymProp);
end

function G = sin(this)
G = mySymComp(sin(this.SymProp));
end
end

end

Pros: The change is totally encapsulated, and cannot be broken save for a significant change to the sym api (for example, the MathWorks adding a qr method to sym would not break your code).

Cons: The wrapper can be time consuming to write, and the resulting object is not a ‘sym’, meaning that if you pass a mySymComp object ‘a’ into the following code:

isa(a, 'sym')`

MATLAB will return ‘false’ by default.