
December 11th, 2013

Back in 2008, I featured various mathematical Christmas cards generated from mathematical software.  Here are the original links (including full source code for each ‘card’) along with a few of the images. Contact me if you’d like to add something new.

MATLAB Christmas card

SAGE Christmas greetings

Fern Fractal Christmas card

December 6th, 2013

A question I get asked a lot is ‘How can I do nonlinear least squares curve fitting in X?’ where X might be MATLAB, Mathematica or a whole host of alternatives.  Since this is such a common query, I thought I’d write up how to do it for a very simple problem in several systems that I’m interested in.

This is the MATLAB version. For other versions, see the list below.

The problem

You have the following data

xdata = -2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9
ydata = 0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001

and you’d like to fit the function

 F(p_1,p_2,x) = p_1 \cos(p_2 x)+p_2 \sin(p_1 x)

using nonlinear least squares.  Your starting guesses for the parameters are p1=1 and p2=0.2.
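That is, ‘nonlinear least squares’ here means choosing p_1 and p_2 to minimise the sum of squared residuals over the ten data points:

 S(p_1,p_2) = \sum_{i=1}^{10} \left( y_i - F(p_1,p_2,x_i) \right)^2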

For now, we are primarily interested in the following results:

  • The fit parameters
  • Sum of squared residuals

Future updates of these posts will show how to get other results such as confidence intervals. Let me know what you are most interested in.

How you proceed depends on which toolboxes you have. Contrary to popular belief, you don’t need the Curve Fitting toolbox to do curve fitting…particularly when the fit in question is as basic as this. Out of the 90+ toolboxes sold by The Mathworks, I’ve only been able to look through the subset I have access to so I may have missed some alternative solutions.

Pure MATLAB solution (No toolboxes)

In order to perform nonlinear least squares curve fitting, you need to minimise the sum of the squares of the residuals. This means you need a minimisation routine.  Basic MATLAB comes with the fminsearch function which is based on the Nelder-Mead simplex method.  For this particular problem it works OK but it will not be suitable for more complex fitting problems.  Here’s the code

format compact
format long
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

%Function to calculate the sum of squared residuals for a given p1 and p2
fun = @(p) sum((ydata - (p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata))).^2);

%starting guess
pguess = [1,0.2];

%optimise
[p,fminres] = fminsearch(fun,pguess)

This gives the following results

p =
1.881831115804464 0.700242006994123
fminres =
0.053812720914713

All we get here are the parameters and the sum of squares of the residuals.  If you want more information such as 95% confidence intervals, you’ll have a lot more hand-coding to do. Although fminsearch works fine in this instance, it soon runs out of steam for more complex problems.
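To give a flavour of that hand-coding, here is a minimal sketch (my addition rather than part of the original recipe) of one way to estimate standard errors after the fminsearch fit above: build a finite-difference Jacobian at the fitted parameters and use the usual linearised covariance formula. I use a normal-approximation factor of 1.96 for the 95% intervals since tinv lives in the Statistics Toolbox; with only 8 degrees of freedom this understates the interval width a little compared to a proper t-based calculation.

%Sketch only: approximate 95% confidence intervals after fminsearch
model = @(p,x) p(1)*cos(p(2)*x) + p(2)*sin(p(1)*x);
r = ydata - model(p,xdata);          %residuals at the fitted parameters
n = numel(xdata); k = numel(p);
J = zeros(n,k);                      %finite-difference Jacobian
h = 1e-6;
for i = 1:k
    dp = zeros(1,k); dp(i) = h;
    J(:,i) = ((model(p+dp,xdata) - model(p-dp,xdata))/(2*h))';
end
mse  = sum(r.^2)/(n-k);              %residual variance estimate
covp = mse*inv(J'*J);                %linearised parameter covariance
se   = sqrt(diag(covp));             %standard errors
ci   = [p'-1.96*se, p'+1.96*se]      %approximate 95% intervals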

MATLAB with optimisation toolbox

With respect to this problem, the optimisation toolbox gives you two main advantages over pure MATLAB.  The first is that better optimisation routines are available so more complex problems (such as those with constraints) can be solved and in less time.  The second is the provision of the lsqcurvefit function which is specifically designed to solve curve fitting problems.  Here’s the code

format long
format compact
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

%Note that we don't have to explicitly calculate residuals
fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata);

%starting guess
pguess = [1,0.2];

[p,fminres] = lsqcurvefit(fun,pguess,xdata,ydata)

This gives the results

p =
   1.881848414551983   0.700229137656802
fminres =
   0.053812696487326
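If you ask for them, lsqcurvefit will also return the residuals, an exit flag, solver diagnostics and the Jacobian via its full calling syntax:

[p,resnorm,residual,exitflag,output,lambda,jacobian] = lsqcurvefit(fun,pguess,xdata,ydata);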

MATLAB with statistics toolbox

There are two interfaces I know of in the stats toolbox and both of them give a lot of information about the fit. The problem set-up is the same in both cases:

%set up for both fit commands in the stats toolbox
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata);

pguess = [1,0.2];

Method 1 makes use of NonLinearModel.fit

mdl = NonLinearModel.fit(xdata,ydata,fun,pguess)

The returned object is a NonLinearModel object

class(mdl)
ans =
NonLinearModel

which contains all sorts of useful stuff

mdl = 
Nonlinear regression model:
    y ~ p1*cos(p2*xdata) + p2*sin(p1*xdata)

Estimated Coefficients:
          Estimate             SE                     tStat               pValue
    p1      1.8818508110535      0.027430139389359    68.6052223191956    2.26832562501304e-12
    p2    0.700229815076442    0.00915260662357553    76.5060538352836    9.49546284187105e-13

Number of observations: 10, Error degrees of freedom: 8
Root Mean Squared Error: 0.082
R-Squared: 0.996,  Adjusted R-Squared 0.995
F-statistic vs. zero model: 1.43e+03, p-value = 6.04e-11
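For example, 95% confidence intervals for the parameters can be pulled straight out of the model object with coefCI

ci = coefCI(mdl)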

If you don’t need such heavyweight infrastructure, you can make use of the statistics toolbox’s nlinfit function

[p,R,J,CovB,MSE,ErrorModelInfo] = nlinfit(xdata,ydata,fun,pguess);

Along with our parameters (p) this also provides the residuals (R), Jacobian (J), Estimated variance-covariance matrix (CovB), Mean Squared Error (MSE) and a structure containing information about the error model fit (ErrorModelInfo).
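These outputs slot straight into other stats toolbox functions. For instance, nlparci will compute 95% confidence intervals for the parameters from the residuals and covariance matrix:

ci = nlparci(p,R,'covar',CovB)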

Both nlinfit and NonLinearModel.fit use the same minimisation algorithm, which is based on Levenberg-Marquardt.

MATLAB with Symbolic Toolbox

MATLAB’s symbolic toolbox provides a completely separate computer algebra system called MuPAD which can handle nonlinear least squares fitting via its stats::reg function.  Here’s how to solve our problem in this environment.

First, you need to start up the MuPAD application.  You can do this by entering

mupad

into MATLAB. Once you are in MuPAD, the code looks like this

xdata := [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]:
ydata := [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]:
stats::reg(xdata,ydata,p1*cos(p2*x)+p2*sin(p1*x),[x],[p1,p2],StartingValues=[1,0.2])

The result returned is

[[1.88185085, 0.7002298172], 0.05381269642]

These are our fitted parameters, p1 and p2, along with the sum of squared residuals.  The documentation tells us that the optimisation algorithm is Levenberg-Marquardt, which is rather better than the simplex algorithm used by basic MATLAB’s fminsearch.
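As an aside, here is an untested sketch of an alternative route: you can evaluate MuPAD code directly from the MATLAB prompt, without opening the notebook interface, by passing a command string to evalin(symengine,...)

evalin(symengine, ['stats::reg([-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9],', ...
 '[0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001],', ...
 'p1*cos(p2*x)+p2*sin(p1*x),[x],[p1,p2],StartingValues=[1,0.2])'])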

MATLAB with the NAG Toolbox

The NAG Toolbox for MATLAB is a commercial product offered by the UK-based Numerical Algorithms Group.  Their main products are their C and Fortran libraries but they also have a comprehensive MATLAB toolbox that contains something like 1500+ functions.  My University has a site license for pretty much everything they put out and we make great use of it all.  One of the benefits of the NAG toolbox over those offered by The Mathworks is speed.  NAG is often (but not always) faster since it’s based on highly optimized, compiled Fortran.  One of the drawbacks of the NAG toolbox is that it is more difficult to use than The Mathworks’ offerings.

In an earlier blog post, I showed how to create wrappers for the NAG toolbox to create an easy to use interface for basic nonlinear curve fitting.  Here’s how to solve our problem using those wrappers.

format long
format compact
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

%Note that we don't have to explicitly calculate residuals
fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata);
start = [1,0.2];
[p,fminres]=nag_lsqcurvefit(fun,start,xdata,ydata)

which gives

Warning: nag_opt_lsq_uncon_mod_func_easy (e04fy) returned a warning indicator (5) 
p =
   1.881850904268710   0.700229557886739
fminres =
   0.053812696425390

For convenience, here are the two files you’ll need to run the above (you’ll also need the NAG Toolbox for MATLAB, of course).

MATLAB with curve fitting toolbox

One would expect the curve fitting toolbox to be able to fit such a simple curve and one would be right :)

xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

opt = fitoptions('Method','NonlinearLeastSquares',...
               'Startpoint',[1,0.2]);
f = fittype('p1*cos(p2*x)+p2*sin(p1*x)','options',opt);

fitobject = fit(xdata',ydata',f)

Note that, unlike every other Mathworks method shown here, xdata and ydata have to be column vectors, hence the transposes in the call above. The result looks like this

fitobject = 

     General model:
     fitobject(x) = p1*cos(p2*x)+p2*sin(p1*x)
     Coefficients (with 95% confidence bounds):
       p1 =       1.882  (1.819, 1.945)
       p2 =      0.7002  (0.6791, 0.7213)

fitobject is of type cfit:

 class(fitobject)

ans =
cfit

In this case it contains two fields, p1 and p2, which are the parameters we are looking for:

>> fieldnames(fitobject)
ans = 
    'p1'
    'p2'
>> fitobject.p1
ans =
   1.881848414551983
>> fitobject.p2
ans =
    0.700229137656802
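The 95% confidence bounds shown above can also be extracted programmatically with the curve fitting toolbox’s confint function

ci = confint(fitobject)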

For maximum information, call the fit command like this:

[fitobject,gof,output] = fit(xdata',ydata',f)

fitobject = 

     General model:
     fitobject(x) = p1*cos(p2*x)+p2*sin(p1*x)
     Coefficients (with 95% confidence bounds):
       p1 =       1.882  (1.819, 1.945)
       p2 =      0.7002  (0.6791, 0.7213)

gof = 

           sse: 0.053812696487326
       rsquare: 0.995722238905101
           dfe: 8
    adjrsquare: 0.995187518768239
          rmse: 0.082015773244637

output = 

           numobs: 10
         numparam: 2
        residuals: [10x1 double]
         Jacobian: [10x2 double]
         exitflag: 3
    firstorderopt: 3.582047395989108e-05
       iterations: 6
        funcCount: 21
     cgiterations: 0
        algorithm: 'trust-region-reflective'
          message: [1x86 char]

November 26th, 2013

As soon as I heard the news that Mathematica was being made available completely free on the Raspberry Pi, I just had to get myself a Pi and have a play.  So, I bought the Raspberry Pi Advanced Kit from my local Maplin Electronics store, plugged it into the kitchen telly and booted it up.  The exercise made me miss my father because the last time I plugged a computer into the kitchen telly was when I was 8 years old; it was Christmas morning and dad and I took our first steps into a new world with my Sinclair Spectrum 48K.

How to install Mathematica on the Raspberry Pi

Future Raspberry Pis will have Mathematica installed by default but mine wasn’t new enough so I just typed the following at the command line

sudo apt-get update && sudo apt-get install wolfram-engine

On my machine, I was told

The following extra packages will be installed:
  oracle-java7-jdk
The following NEW packages will be installed:
  oracle-java7-jdk wolfram-engine
0 upgraded, 2 newly installed, 0 to remove and 1 not upgraded.
Need to get 278 MB of archives.
After this operation, 588 MB of additional disk space will be used.

So, it seems that Mathematica needs Oracle’s Java and that’s being installed for me as well. The combination of the two is going to use up 588MB of disk space which makes me glad that I have an 8GB SD card in my pi.

Mathematica version 10!

On starting Mathematica on the pi, my first big surprise was the version number.  I am the administrator of an unlimited academic site license for Mathematica at The University of Manchester and the latest version we can get for our PCs at the time of writing is 9.0.1.  My free pi version is at version 10!  The first clue is the installation directory:

/opt/Wolfram/WolframEngine/10.0

and the next clue is given by evaluating $Version in Mathematica itself

In[2]:= $Version

Out[2]= "10.0 for Linux ARM (32-bit) (November 19, 2013)"

To get an idea of what’s new in 10, I evaluated the following command on Mathematica on the Pi

Export["PiFuncs.dat",Names["System`*"]]

This creates a PiFuncs.dat file which tells me the list of functions in the System context on the version of Mathematica on the pi. Transfer this over to my Windows PC and import into Mathematica 9.0.1 with

pifuncs = Flatten[Import["PiFuncs.dat"]];

Get the list of functions from version 9.0.1 on Windows:

winVer9funcs = Names["System`*"];

Finally, find out what’s in pifuncs but not winVer9funcs

In[16]:= Complement[pifuncs, winVer9funcs]

Out[16]= {"Activate", "AffineStateSpaceModel", "AllowIncomplete", \
"AlternatingFactorial", "AntihermitianMatrixQ", \
"AntisymmetricMatrixQ", "APIFunction", "ArcCurvature", "ARCHProcess", \
"ArcLength", "Association", "AsymptoticOutputTracker", \
"AutocorrelationTest", "BarcodeImage", "BarcodeRecognize", \
"BoxObject", "CalendarConvert", "CanonicalName", "CantorStaircase", \
"ChromaticityPlot", "ClassifierFunction", "Classify", \
"ClipPlanesStyle", "CloudConnect", "CloudDeploy", "CloudDisconnect", \
"CloudEvaluate", "CloudFunction", "CloudGet", "CloudObject", \
"CloudPut", "CloudSave", "ColorCoverage", "ColorDistance", "Combine", \
"CommonName", "CompositeQ", "Computed", "ConformImages", "ConformsQ", \
"ConicHullRegion", "ConicHullRegion3DBox", "ConicHullRegionBox", \
"ConstantImage", "CountBy", "CountedBy", "CreateUUID", \
"CurrencyConvert", "DataAssembly", "DatedUnit", "DateFormat", \
"DateObject", "DateObjectQ", "DefaultParameterType", \
"DefaultReturnType", "DefaultView", "DeviceClose", "DeviceConfigure", \
"DeviceDriverRepository", "DeviceExecute", "DeviceInformation", \
"DeviceInputStream", "DeviceObject", "DeviceOpen", "DeviceOpenQ", \
"DeviceOutputStream", "DeviceRead", "DeviceReadAsynchronous", \
"DeviceReadBuffer", "DeviceReadBufferAsynchronous", \
"DeviceReadTimeSeries", "Devices", "DeviceWrite", \
"DeviceWriteAsynchronous", "DeviceWriteBuffer", \
"DeviceWriteBufferAsynchronous", "DiagonalizableMatrixQ", \
"DirichletBeta", "DirichletEta", "DirichletLambda", "DSolveValue", \
"Entity", "EntityProperties", "EntityProperty", "EntityValue", \
"Enum", "EvaluationBox", "EventSeries", "ExcludedPhysicalQuantities", \
"ExportForm", "FareySequence", "FeedbackLinearize", "Fibonorial", \
"FileTemplate", "FileTemplateApply", "FindAllPaths", "FindDevices", \
"FindEdgeIndependentPaths", "FindFundamentalCycles", \
"FindHiddenMarkovStates", "FindSpanningTree", \
"FindVertexIndependentPaths", "Flattened", "ForeignKey", \
"FormatName", "FormFunction", "FormulaData", "FormulaLookup", \
"FractionalGaussianNoiseProcess", "FrenetSerretSystem", "FresnelF", \
"FresnelG", "FullInformationOutputRegulator", "FunctionDomain", \
"FunctionRange", "GARCHProcess", "GeoArrow", "GeoBackground", \
"GeoBoundaryBox", "GeoCircle", "GeodesicArrow", "GeodesicLine", \
"GeoDisk", "GeoElevationData", "GeoGraphics", "GeoGridLines", \
"GeoGridLinesStyle", "GeoLine", "GeoMarker", "GeoPoint", \
"GeoPolygon", "GeoProjection", "GeoRange", "GeoRangePadding", \
"GeoRectangle", "GeoRhumbLine", "GeoStyle", "Graph3D", "GroupBy", \
"GroupedBy", "GrowCutBinarize", "HalfLine", "HalfPlane", \
"HiddenMarkovProcess", "", "", "", "", "", "", \
"", "", "", "", "", "", "", "", "", "", \
"", "", "", "", "", "", "", "", "", "", \
"", "", "", "", "", "", "", "", "", "", \
"ï ¯", "ï \.b2", "ï \.b3", "IgnoringInactive", "ImageApplyIndexed", \
"ImageCollage", "ImageSaliencyFilter", "Inactivate", "Inactive", \
"IncludeAlphaChannel", "IncludeWindowTimes", "IndefiniteMatrixQ", \
"IndexedBy", "IndexType", "InduceType", "InferType", "InfiniteLine", \
"InfinitePlane", "InflationAdjust", "InflationMethod", \
"IntervalSlider", "ï ¨", "ï ¢", "ï ©", "ï ¤", "ï \[Degree]", "ï ­", \
"ï ¡", "ï «", "ï ®", "ï §", "ï £", "ï ¥", "ï \[PlusMinus]", \
"ï \[Not]", "JuliaSetIterationCount", "JuliaSetPlot", \
"JuliaSetPoints", "KEdgeConnectedGraphQ", "Key", "KeyDrop", \
"KeyExistsQ", "KeyIntersection", "Keys", "KeySelect", "KeySort", \
"KeySortBy", "KeyTake", "KeyUnion", "KillProcess", \
"KVertexConnectedGraphQ", "LABColor", "LinearGradientImage", \
"LinearizingTransformationData", "ListType", "LocalAdaptiveBinarize", \
"LocalizeDefinitions", "LogisticSigmoid", "Lookup", "LUVColor", \
"MandelbrotSetIterationCount", "MandelbrotSetMemberQ", \
"MandelbrotSetPlot", "MinColorDistance", "MinimumTimeIncrement", \
"MinIntervalSize", "MinkowskiQuestionMark", "MovingMap", \
"NegativeDefiniteMatrixQ", "NegativeSemidefiniteMatrixQ", \
"NonlinearStateSpaceModel", "Normalized", "NormalizeType", \
"NormalMatrixQ", "NotebookTemplate", "NumberLinePlot", "OperableQ", \
"OrthogonalMatrixQ", "OverwriteTarget", "PartSpecification", \
"PlotRangeClipPlanesStyle", "PositionIndex", \
"PositiveSemidefiniteMatrixQ", "Predict", "PredictorFunction", \
"PrimitiveRootList", "ProcessConnection", "ProcessInformation", \
"ProcessObject", "ProcessStatus", "Qualifiers", "QuantityVariable", \
"QuantityVariableCanonicalUnit", "QuantityVariableDimensions", \
"QuantityVariableIdentifier", "QuantityVariablePhysicalQuantity", \
"RadialGradientImage", "RandomColor", "RegularlySampledQ", \
"RemoveBackground", "RequiredPhysicalQuantities", "ResamplingMethod", \
"RiemannXi", "RSolveValue", "RunProcess", "SavitzkyGolayMatrix", \
"ScalarType", "ScorerGi", "ScorerGiPrime", "ScorerHi", \
"ScorerHiPrime", "ScriptForm", "Selected", "SendMessage", \
"ServiceConnect", "ServiceDisconnect", "ServiceExecute", \
"ServiceObject", "ShowWhitePoint", "SourceEntityType", \
"SquareMatrixQ", "Stacked", "StartDeviceHandler", "StartProcess", \
"StateTransformationLinearize", "StringTemplate", "StructType", \
"SystemGet", "SystemsModelMerge", "SystemsModelVectorRelativeOrder", \
"TemplateApply", "TemplateBlock", "TemplateExpression", "TemplateIf", \
"TemplateObject", "TemplateSequence", "TemplateSlot", "TemplateWith", \
"TemporalRegularity", "ThermodynamicData", "ThreadDepth", \
"TimeObject", "TimeSeries", "TimeSeriesAggregate", \
"TimeSeriesInsert", "TimeSeriesMap", "TimeSeriesMapThread", \
"TimeSeriesModel", "TimeSeriesModelFit", "TimeSeriesResample", \
"TimeSeriesRescale", "TimeSeriesShift", "TimeSeriesThread", \
"TimeSeriesWindow", "TimeZoneConvert", "TouchPosition", \
"TransformedProcess", "TrapSelection", "TupleType", "TypeChecksQ", \
"TypeName", "TypeQ", "UnitaryMatrixQ", "URLBuild", "URLDecode", \
"URLEncode", "URLExistsQ", "URLExpand", "URLParse", "URLQueryDecode", \
"URLQueryEncode", "URLShorten", "ValidTypeQ", "ValueDimensions", \
"Values", "WhiteNoiseProcess", "XMLTemplate", "XYZColor", \
"ZoomLevel", "$CloudBase", "$CloudConnected", "$CloudDirectory", \
"$CloudEvaluation", "$CloudRootDirectory", "$EvaluationEnvironment", \
"$GeoLocationCity", "$GeoLocationCountry", "$GeoLocationPrecision", \
"$GeoLocationSource", "$RegisteredDeviceClasses", \
"$RequesterAddress", "$RequesterWolframID", "$RequesterWolframUUID", \
"$UserAgentLanguages", "$UserAgentMachine", "$UserAgentName", \
"$UserAgentOperatingSystem", "$UserAgentString", "$UserAgentVersion", \
"$WolframID", "$WolframUUID"}

There we have it, a preview of the list of functions that might be coming in desktop version 10 of Mathematica courtesy of the free Pi version.

No local documentation

On a desktop version of Mathematica, all of the Mathematica documentation is available on your local machine by clicking on Help->Documentation Center in the Mathematica notebook interface.  On the pi version, it seems that there is no local documentation, presumably to keep the installation size down.  You get to the documentation via the notebook interface by clicking on Help->OnlineDocumentation which takes you to http://reference.wolfram.com/language/?src=raspi

Speed vs my laptop

I am used to running Mathematica on high specification machines and so, naturally, the pi version felt very sluggish, particularly when using the notebook interface.  With that said, I found it very usable for general playing around.  I was very curious, however, about the speed of the pi version compared to the version on my home laptop and so created a small benchmark notebook that did three things:

  • Calculate pi to 1,000,000 decimal places.
  • Multiply two 1000 x 1000 random matrices together
  • Integrate sin(x)^2*tan(x) with respect to x

The comparison is going to be against my Windows 7 laptop which has a quad-core Intel Core i7-2630QM. The procedure I followed was:

  • Start a fresh version of the Mathematica notebook and open pi_bench.nb
  • Click on Evaluation->Evaluate Notebook and record the times
  • Click on Evaluation->Evaluate Notebook again and record the new times.

Note that I use the AbsoluteTiming function instead of Timing (detailed reason given here) and I clear the system cache (detailed reason given here).   You can download the notebook I used here.  Alternatively, copy and paste the code below

(*Clear Cache*)
ClearSystemCache[]

(*Calculate pi to 1 million decimal places and store the result*)
AbsoluteTiming[pi = N[Pi, 1000000];]

(*Multiply two random 1000x1000 matrices together and store the result*)
a = RandomReal[1, {1000, 1000}];
b = RandomReal[1, {1000, 1000}];

AbsoluteTiming[prod = Dot[a, b];]

(*calculate an integral and store the result*)
AbsoluteTiming[res = Integrate[Sin[x]^2*Tan[x], x];]

Here are the results. All timings in seconds.

Test                   Laptop Run 1   Laptop Run 2   RaspPi Run 1   RaspPi Run 2   Best Pi/Best Laptop
Million digits of Pi       0.994057       1.007058      14.101360      13.860549               13.9434
Matrix product             0.108006       0.074004      85.076986      85.526180               1149.63
Symbolic integral          0.035002       0.008000       0.980086       0.448804                  56.1

From these tests, we see that Mathematica on the pi is around 14 to 1149 times slower than on my laptop. The huge difference between the pi and laptop for the matrix product stems from the fact that, on the laptop, Mathematica is using Intel’s Math Kernel Library (MKL).  The MKL is extremely well optimised for Intel processors and will be using all 4 of the laptop’s CPU cores along with extra tricks such as AVX operations.  I am not sure what is being used on the pi for this operation.

I also ran the standard BenchmarkReport[] on the Raspberry Pi.  The results are available here.

Speed vs Python

Comparing Mathematica on the pi to Mathematica on my laptop might have been a fun exercise for me but it’s not really fair on the pi which wasn’t designed to perform against expensive laptops. So, let’s move on to a more meaningful speed comparison: Mathematica on pi versus Python on pi.

When it comes to benchmarking on Python, I usually turn to the timeit module.  This time, however, I’m not going to use it and that’s because of something odd that’s happening with sympy and caching.  I’m using sympy to calculate pi to 1 million places and for the symbolic calculus.  Check out this ipython session on the pi

pi@raspberrypi ~ $ SYMPY_USE_CACHE=no
pi@raspberrypi ~ $ ipython
Python 2.7.3 (default, Jan 13 2013, 11:20:46)
Type "copyright", "credits" or "license" for more information.

IPython 0.13.1 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import sympy
In [2]: pi=sympy.pi.evalf(100000) #Takes a few seconds
In [3]: %timeit pi=sympy.pi.evalf(100000)
100 loops, best of 3: 2.35 ms per loop

In short, I have asked sympy not to use caching (I think! Note that SYMPY_USE_CACHE=no was set without export, so the variable never reaches the subsequently launched ipython session, which may explain it) and yet it is caching the result.  I don’t want to time how quickly sympy can get a result from the cache, so I can’t use timeit until I figure out what’s going on here. Since I wanted to publish this post sooner rather than later, I just did this:

import sympy
import time
import numpy

start = time.time()
pi=sympy.pi.evalf(1000000)
elapsed = (time.time() - start)
print('1 million pi digits: %f seconds' % elapsed)

a = numpy.random.uniform(0,1,(1000,1000))
b = numpy.random.uniform(0,1,(1000,1000))
start = time.time()
c=numpy.dot(a,b)
elapsed = (time.time() - start)
print('Matrix Multiply: %f seconds' % elapsed)

x=sympy.Symbol('x')
start = time.time()
res=sympy.integrate(sympy.sin(x)**2*sympy.tan(x),x)
elapsed = (time.time() - start)
print('Symbolic Integration: %f seconds' % elapsed)

Usually, I’d use time.clock() to measure things like this but something *very* strange is happening with time.clock() on my pi, something I’ll write up later.  In short, it didn’t work properly and so I had to resort to time.time().

Here are the results:

1 million pi digits: 5535.621769 seconds
Matrix Multiply: 77.938481 seconds
Symbolic Integration: 1654.666123 seconds

The result that really surprised me here was the symbolic integration since the problem I posed didn’t look very difficult. Sympy on pi was thousands of times slower than Mathematica on pi for this calculation!  On my laptop, the calculation times between Mathematica and sympy were about the same for this operation.

That Mathematica beats sympy for 1 million digits of pi doesn’t surprise me too much since I recall attending a seminar a few years ago where Wolfram Research described how they had optimized the living daylights out of that particular operation. Nice to see Python beating Mathematica by a little bit in the linear algebra though.


July 2nd, 2013

Welcome to my monthly round-up of news from the world of mathematical software.  As always, thanks to all contributors.  If you have any mathematical software news (releases, blog articles and so on), feel free to contact me.  For details on how to follow Walking Randomly now that Google Reader has died, see this post.

Numerics for .NET

Faster, faster, FASTER!

Sparsification

  • Version 1.0 of TxSSA has been released.  According to the project’s website, “TxSSA is an acronym for Tech-X Corporation Sparse Spectral Approximation. It is a library that has interfaces in C++, C, and MATLAB. It is an implementation of a matrix sparsification algorithm that takes a general real or complex matrix as input and produces a sparse output matrix of the same size. The non-zero entries are chosen to minimize changes to the singular values and singular vectors corresponding to the near null-space. The output matrix is constrained to preserve left and right null-spaces exactly. The sparsity pattern of the output matrix is automatically determined or can be given as input.”

Data and Statistics

  • Stata is a commercial statistics package that’s been around since 1985.  It’s now at version 13 and has a bucket load of new functions and improvements.
  • Exelis Visual Information Solutions has released a minor update to IDL (Interactive Data Language).  Details of version 8.2.3 can be found here.

Molten Mathematics

Some Freebies

  • SMath Studio is a free Mathcad clone developed by Andrey Ivashov.  Recently, Andrey has been releasing new versions of SMath much more regularly and it is now at version 0.96.4909.  To see what’s been added in June, see the forum posts here and here.
  • Rene Grothmann’s very frequently updated Euler Math Toolbox is now at version 22.8.  As always, Rene’s detailed version log tells you what’s new.

Safe and reliable numerical computing

  • Sollya 4.0 was released earlier this month and is available for download at: http://sollya.gforge.inria.fr/.  According to the developers, “Sollya is both a tool environment and a library for safe floating-point code development. It offers a convenient way to perform computations with multiple precision interval arithmetic. It is particularly targeted to the automatized implementation of mathematical floating-point libraries (libm).”
  • The INTLAB toolbox for MATLAB is now at version 7.1. INTLAB is the MATLAB toolbox for reliable computing and self-validating algorithms.

Pythonic Mathematics

  • Version 5.10 of Sage, the free Computer Algebra System based on Python, was released on 17th June.  Martin Albrecht discusses some of his favourite new goodies over at his blog and the full list of new stuff is on the Sage changelog.
  • Pandas is Python’s main data analysis library and version 0.12 is out.  Take a look at the newness here.

April 3rd, 2013

Welcome to the latest edition of A Month of Math Software where I look back over the last month and report on all that is new and shiny in the world of mathematical software.  I’ve recently restarted work after the Easter break and so it seems fitting that I offer you all Easter Eggs courtesy of Lijia Yu and R.  Enjoy!

General purpose mathematical systems

MATLAB add-ons

  • The multiprecision MATLAB toolbox from Advanpix has been upgraded to version 3.4.3.3431 with the addition of multidimensional arrays.
  • The superb, free chebfun project has now been extended to 2 dimensions with the release of chebfun2.

GPU accelerated computation

Statistics and visualisation 

Finite elements

  • Version 7.3 of deal.II is now available.  deal.II is a C++ program library targeted at the computational solution of partial differential equations using adaptive finite elements.


March 25th, 2013

I’m working on a presentation involving Mathematica 9 at the moment and found myself wanting a gallery of all built-in plots using default settings.  Since I couldn’t find such a gallery, I made one myself.  The notebook is available here and includes 99 built-in plots, charts and gauges generated using default settings.  If you hover your mouse over one of the plots in the Mathematica notebook, it will display a ToolTip showing usage notes for the function that generated it.

The gallery only includes functions that are fully integrated with Mathematica so doesn’t include things from add-on packages such as StatisticalPlots.

A screenshot of the gallery is below.  I haven’t made an in-browser interactive version due to size.

Mathematica 9 charts

March 4th, 2013

Welcome to the latest Month of Math Software here at WalkingRandomly.  If you have any mathematical software news or blogposts that you’d like to share with a larger audience, feel free to contact me.  Thanks to everyone who contributed news items this month, I couldn’t do it without you.

The NAG Library for Java

MATLAB-a-likes

  • Version 3.6.4 of Octave, the free, open-source MATLAB clone, has been released.  This version contains some minor bug fixes.  To see everything that’s new since version 3.6, take a look at the NEWS file.  If you like MATLAB syntax but don’t like the price, Octave may well be for you.
  • The frequently updated Euler Math Toolbox is now at version 20.98 with a full list of changes in the log.  Scanning through the recent changes log, I came across the very nice iterate function which works as follows
    >iterate("cos(x)",1,100,till="abs(cos(x)-x)<0.001")
    
    [ 1  0.540302305868  0.857553215846  0.654289790498  0.793480358743
    0.701368773623  0.763959682901  0.722102425027  0.750417761764
    0.731404042423  0.744237354901  0.735604740436  0.74142508661
    0.737506890513  0.740147335568  0.738369204122  0.739567202212 ]

Mathematical and Scientific Python

  • The Python-based computer algebra system, SAGE, has been updated to version 5.7.  The full list of changes is at http://www.sagemath.org/mirror/src/changelogs/sage-5.7.txt
  • Numpy is the fundamental package required for numerical computing with Python.  Numpy is now at version 1.7 and you can see what’s new by taking a look at the release notes

Spreadsheet news

R and stuff

This and that

  • The commercial computer algebra system, Magma, has seen another incremental update in version 2.19-3.
  • The NCAR Command Language was updated to version 6.1.2.
  • IDL was updated to version 8.2.2.  Since I’m currently obsessed with random number generators, I’ll point out that in this release IDL finally moves away from an old Numerical Recipes generator and now uses the Mersenne Twister like almost everybody else.

From the blogs

February 8th, 2013

I was at a seminar yesterday where we were playing with Mathematica and wanted to solve the following equation

1.0609^t == 1.5

You punch this into Mathematica as follows:

Solve[1.0609^t == 1.5]

Mathematica returns the following

During evaluation of In[1]:= Solve::ifun: Inverse functions are being used by Solve, 
so some solutions may not be found; use Reduce for complete solution information. >>

Out[1]= {{t -> 6.85862}}

I have got the solution I expect but Mathematica suggests that I’ll get more information if I use Reduce. So, let’s do that.

In[2]:= Reduce[1.0609^t == 1.5, t]
During evaluation of In[2]:= Reduce::ratnz: Reduce was unable to solve the system with inexact 
coefficients. The answer was obtained by solving a corresponding exact system and numericizing the result. >>
Out[20]= C[1] \[Element] Integers && 
  t == -16.9154 (-0.405465 + (0. + 6.28319 I) C[1])

Looks complex and a little complicated! To understand why complex numbers have appeared in the mix you need to know that Mathematica always considers variables to be complex unless you tell it otherwise. So, it has given you the infinite number of complex values of t that would satisfy the original equation. No problem, let’s just tell Mathematica that we are only interested in real values of t.
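To unpack that output a little (my aside): it is just the general complex logarithm. Writing the equation as e^{t \ln 1.0609} = 1.5, its complex solutions are

 t = \frac{\ln 1.5 + 2\pi i k}{\ln 1.0609}, \quad k \in \mathbb{Z}

and, numerically, \ln 1.5 \approx 0.405465 while 1/\ln 1.0609 \approx 16.9154, which is where the constants in Mathematica’s answer come from.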

In[3]:= Reduce[1.0609^t == 1.5, t, Reals]

During evaluation of In[3]:= Reduce::ratnz: Reduce was unable to solve the system with inexact 
coefficients. The answer was obtained by solving a corresponding exact system and numericizing the result. >>

Out[3]= t == 6.85862

Again, I get the solution I expect. However, Mathematica still feels that it is necessary to give me a warning message. Time was ticking on so I posed the question on Mathematica Stack Exchange and we moved on.

http://mathematica.stackexchange.com/questions/19235/why-does-mathematica-struggle-with-solving-this-equation

At the time of writing, the lead answer says that ‘Mathematica is not “struggling” with your equation. The message is simply FYI — to tell you that, for this equation, it prefers to work with exact quantities rather than inexact quantities (reals)’

I’d accept this as an explanation except for one thing; the message says that it is UNABLE to solve the equation as I originally posed it and so needs to proceed by solving a corresponding exact system. That implies that it has struggled a great deal, given up and tried something else.

This would come as a surprise to anyone with a calculator who would simply do the following manipulations

1.0609^t == 1.5

Log[1.0609^t] == Log[1.5]

t*Log[1.0609] == Log[1.5]

t = Log[1.5]/Log[1.0609]

Mathematica evaluates this as

t = 6.8586187084788275

This appears to solve the equation exactly.  Plug it back into Mathematica (or my calculator) to get

In[4]:= 1.0609^(6.8586187084788275)

Out[4]= 1.5

I had no trouble dealing with inexact quantities, and I didn’t need to ‘solve a corresponding exact system and numericize the result’.  This appears to be a simple problem. So, why does Mathematica bang on about it so much?

Over to MATLAB for a while

Mathematica is complaining that we have asked it to work with inexact quantities.  How could this be? 1.0609, 6.8586187084788275 and 1.5 look pretty exact to me! However, it turns out that as far as the computer is concerned some of these numbers are far from exact.

When you input a number such as 1.0609 into Mathematica, it considers it to be a double precision number and 1.0609 cannot be exactly represented as such.  The closest Mathematica, or indeed any numerical system that uses 64-bit IEEE arithmetic, can get is 4777868844677359/4503599627370496 which evaluates to 1.0608999999999999541699935434735380113124847412109375. I wondered if this is why Mathematica was complaining about my problem.

At this point, I switch tools and use MATLAB for a while in order to investigate further.  I do this for no reason other than I know the related functions in MATLAB a little better.  MATLAB’s sym command can give us the rational number that exactly represents what is actually stored in memory (Thanks to Nick Higham for pointing this out to me).

>> sym(1.0609,'f')

ans =
4777868844677359/4503599627370496

We can then evaluate this fraction to whatever precision we like using the vpa command:

>> vpa(sym(1.0609,'f'),100)
ans =
1.0608999999999999541699935434735380113124847412109375

>> vpa(sym(1.5,'f'),100)
ans =
1.5

>> vpa(sym( 6.8586187084788275,'f'),100)
ans =
6.8586187084788274859192824806086719036102294921875

So, 1.5 can be represented exactly in 64-bit double precision arithmetic but 1.0609 and 6.8586187084788275 cannot.  Mathematica is unhappy with this state of affairs and so chooses to solve an exact problem instead.  I guess if I am working in a system that cannot even represent the numbers in my problem (e.g. 1.0609) how can I expect to solve it?
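(A quick aside on why 1.5 is exact when 1.0609 is not: a decimal number is exactly representable in binary floating point only if it is an integer multiple of a power of two. We have

 1.5 = 3 \times 2^{-1}

whereas 1.0609 = 10609/10000 has a factor of 5^4 in its reduced denominator, so it cannot be written as m \times 2^{-n} for any integers m and n.)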

Which Exact Problem?

So, which exact equation might Reduce be choosing to solve?  It could solve the equation that I mean:

(10609/10000)^t == 1500/1000

which does have an exact solution and so Reduce can find it.

(Log[2] - Log[3])/(2*(2*Log[2] + 2*Log[5] - Log[103]))

Evaluating this gives 6.858618708478698:

(Log[2] - Log[3])/(2*(2*Log[2] + 2*Log[5] - Log[103])) // N // FullForm

6.858618708478698`

Alternatively, Mathematica could convert the double precision number 1.0609 to the fraction that exactly represents what’s actually in memory and solve that.

(4777868844677359/4503599627370496)^t == 1500/1000

This also has an exact solution:

(Log[2] - Log[3])/(52*Log[2] - Log[643] - Log[2833] - Log[18251] - Log[143711])

which evaluates to 6.858618708478904:

(Log[2] - Log[3])/(52*Log[2] - Log[643] - Log[2833] - Log[18251] - Log[143711]) // N // FullForm

6.858618708478904`

Let’s take a look at the exact number Reduce is giving:

Quiet@Reduce[1.0609^t == 1.5, t, Reals] // N // FullForm

Equal[t, 6.858618708478698`]

So, it looks like Mathematica is solving the equation I meant to solve and evaluating this solution it at the end.

Summary of solutions

Here I summarize the solutions I’ve found for this problem:

  • 6.8586187084788275 – Pen and paper, with Mathematica for the final evaluation.
  • 6.858618708478698 – Mathematica solving the exact problem I mean and evaluating to double precision at the end.
  • 6.858618708478904 – Mathematica solving the exact problem derived from what I really asked it and evaluating at the end.

What I find fun is that my simple minded pen and paper solution seems to satisfy the original equation better than the solutions arrived at by more complicated means.  Using MATLAB’s vpa again I plug in the three solutions above, evaluate to 50 digits and see which one is closest to 1.5:

>> vpa('1.5 - 1.0609^6.8586187084788275',50)
ans =
-0.00000000000000045535780896732093552784442911868195148156712149736
>> vpa('1.5 - 1.0609^6.858618708478698',50)
ans =
0.000000000000011028236861872639054542278886208515813177349441555
>> vpa('1.5 - 1.0609^6.858618708478904',50)
ans =
-0.0000000000000072391029234017787617507985059241243968693347431197

So, ‘my’ solution is better. Of course, this advantage goes away if I evaluate (Log[2] - Log[3])/(2*(2*Log[2] + 2*Log[5] - Log[103])) to 50 decimal places and plug that in.

>> sol=vpa('(log(2) - log(3))/(2*(2*log(2) + 2*log(5) - log(103)))',50)
sol =
6.858618708478822364949699852597847078441119153527
>> vpa(1.5 - 1.0609^sol,50)
ans =
0.0000000000000000000000000000000000000014693679385278593849609206715278070972733319459651
February 4th, 2013

Welcome to the first Month of Math Software for 2013.  January was a rather lean month in the world of mathematical software I’m afraid but there are a few things worthy of attention.  If you have some news for me for next month’s edition, contact me via the usual channels.

Commercial computer algebra

  • PTC have released a cut-down version of Mathcad Prime called Mathcad Express.  It was actually launched back in October 2012 but I only learned about it this month.  Regular readers of WalkingRandomly will also know about SMath Studio, a freeware application that looks a bit like a clone of Mathcad and runs on many operating systems.
  • Wolfram Research have released Mathematica 9.0.1, a minor upgrade from version 9.  To see what’s new take a look at Wolfram’s quick revision history http://www.wolfram.com/mathematica/quick-revision-history.html (Thanks to people in the comments section for this link)

Python

  • Pandas, a data analysis library for Python, saw a minor update to version 0.10.1.  See the pandas ‘What’s new?’ page for more details.
  • Version 5.6 of the Python-based computer algebra system, SAGE, has been released.  See the changelog for details of the new stuff.

C++

  • Blaze is “an open-source, high-performance C++ math library for dense and sparse arithmetic” and it has seen its second release.  Head over to Blaze’s website to grab yourself a copy of version 1.1.

Shiny

  • Shiny wasn’t released in January but this was the first month I heard about it and it looks fantastic.  Brought to my attention by long-time WalkingRandomly reader ‘MySchizoBuddy’, Shiny comes from the creators of RStudio.  In his words, ‘It’s similar to Mathematica’s CDF plugin but without the plugin. It allows you to have small R code and visualizations on the web without any plugins http://www.rstudio.com/shiny/’

All your probability distribution are belong to us

January 3rd, 2013

Welcome to the last 2012 edition of A Month of Math Software, slightly delayed thanks to the December festivities.  Thanks to everyone who’s contributed news items over the last 2 years; please feel free to continue contacting me throughout 2013 and beyond.

AccelerEyes sells the MATLAB Jacket to The Mathworks

  • AccelerEyes are the developers of GPU accelerated products such as Jacket for MATLAB and ArrayFire for C, C++, and Fortran.  In a recent blog post, they announced that they have sold Jacket to The Mathworks.  It will be interesting to see how The Mathworks integrate this technology into the Parallel Computing Toolbox (PCT) in the future.  I sincerely hope that they don’t split the PCT into two products, one for GPUs and the other for CPUs!

Free computer algebra

Numerical Libraries

  • Version 5.3 of the ACML linear algebra library for AMD-based systems was released in December.
  • Another of AMD’s libraries was updated this month.  The Accelerated Parallel Processing (APP) SDK hit version 2.8 and includes a preview of AMD’s new C++ template library, codenamed “Bolt”.  According to AMD, Bolt ‘makes it easier for developers to utilize the inherent performance and power efficiency benefits of heterogeneous computing’.  The press release for this version of the APP SDK is available at http://developer.amd.com/wordpress/media/2012/10/APP-SDK-Bolt-CodeXL-Press-Release.pdf.  Also, click here for more details concerning Bolt
  • Numeric Javascript saw two releases, v1.2.5 and v1.2.6
  • The HSL Software Library was updated this month adding three new routines to support Fredholm alternative for singular systems, efficient multiplication of the factors by a vector, and sparse forward solve.
  • amgcl is an accelerated algebraic multigrid for C++.  According to the software’s website ‘You can use amgcl to solve large sparse system of linear equations in three simple steps: first, you have to select method components (this is a compile time decision); second, the AMG hierarchy has to be constructed from a system matrix; and third, the hierarchy is used to solve the equation system for a given right-hand side’

Data Analysis and Visualisation

Maple IDE

  • DigiArea have released an Eclipse based Integrated Development Environment (IDE) for Maple called, simply, Maple IDE.  This commercial product is available for Linux, Windows and Mac OS X and seems to be a very similar concept to Wolfram’s Workbench for Mathematica.


Spreadsheets

Commercial Number Theory