Archive for the ‘math software’ Category

October 13th, 2016

I was recently invited to give a talk at the Sheffield R Users Group and decided to give a brief overview of how R relates to other technologies. Subjects included Mathematica's integration of R; Intel's compilers and Math Kernel Library and how they can make R faster; and a range of Microsoft technologies including R Tools for Visual Studio, Microsoft R Open and MRAN for reproducibility. I also touched upon the NAG Library, Maple's code generation for R, GPUs and Spark.

Did I miss anything? If you were to give a similar talk, what might you have included?

June 13th, 2016

I was in a funk!

Not long after joining the University of Sheffield, I had helped convince a raft of lecturers to switch to using the Jupyter notebook for their lecturing. It was an easy piece of salesmanship and a whole lot of fun to do. Lots of people were excited by the possibilities.

The problem was that the University's managed desktop was incapable of supporting an instance of the notebook with all of the bells and whistles included. As a cohort, we needed support for Python 2 and 3 kernels as well as R and even Julia. The R install needed dozens of packages and support for Bioconductor. We needed LaTeX support to allow export to PDF, and so on. We also needed to keep up to date because Jupyter development moves pretty fast! When all of this was fed into the managed desktop packaging machinery, it died. They could give us a limited, basic install but not one with batteries included.

I wanted those batteries!

In the early days, I resorted to strange stuff to get through the classes but it wasn’t sustainable. I needed a miracle to help me deliver some of the promises I had made.

Miracle delivered – SageMathCloud

During the kick-off meeting of the OpenDreamKit project, someone introduced SageMathCloud to the group. This thing had everything I needed and then some! During that presentation, I could see that SageMathCloud would solve all of our deployment woes as well as providing some very cool stuff that simply wasn't available elsewhere. One killer application, for example, was Google-Docs-like collaborative editing of Jupyter notebooks.

I fired off a couple of emails to the lecturers I was supporting (“Everything’s going to be fine! Trust me!”) and started to learn how to use the system to support a course. I fired off dozens of emails to SageMathCloud’s excellent support team and started working with Dr Marta Milo on getting her Bioinformatics course material ready to go.

TL;DR: The course was a great success, and a huge part of that success was the SageMathCloud platform.

Giving back – A tutorial for lecturers on using SageMathCloud

I’m currently working on a tutorial for lecturers and teachers on how to use SageMathCloud to support a course. The material is licensed CC-BY and is available at https://github.com/mikecroucher/SMC_tutorial 

If you find it useful, please let me know. Comments and Pull Requests are welcome.

December 23rd, 2015

Some numbers have something to say. Take the following, rather huge number, for example:

185325291040682644803531312384041336595151018761127807725763308064246070395230764956468856341399670487
514610052487586323067575687914642829757636555138456145938430191876551756992329818006401775522301219016
237245425891544032218544390861818271526845858747648909382915665997160517028671058273052955697138350617
856171748990490346558484883522495310587304606877332488244886849690319641412147118669050542398759303832
627672479768452329971883073420877438596419179762421854464516060347269129680634374662501202129049727949
71185874579656679344857677824

This number wants to tell you 'Happy Holidays'; it just needs a little code to help it out. In Maple, this code is:

n := 18532529104068264480353131238404133659515101876112780772576330806424607039523076495646885634139967048751461005248758632306757568791464282975763655513845614593843019187655175699232981800640177552230121901623724542589154403221854439086181827152684585874764890938291566599716051702867105827305295569713835061785617174899049034655848488352249531058730460687733248824488684969031964141214711866905054239875930383262767247976845232997188307342087743859641917976242185446451606034726912968063437466250120212904972794971185874579656679344857677824:
# modnew(x, y): a floored modulo, x - y*floor(x/y)
modnew := proc (x, y) options operator, arrow; x-y*floor(x/y) end proc:
# Pixel value: 0 where the Tupper-style inequality holds, 1 otherwise
tupper := piecewise(1/2 < floor(modnew(floor((1/17)*y)*2^(-17*floor(x)-modnew(floor(y), 17)), 2)), 0, 1):
# Evaluate the formula over the 106 x 17 grid of integer points starting at y = n
points := [seq([seq(tupper(x, y), y = n+16 .. n, -1)], x = 105 .. 0, -1)]:
plots:-listdensityplot(points, scaling = constrained, view = [0 .. 106, 0 .. 17], style = patchnogrid, size = [800, 800]);

The result is the following plot

[Plot: the phrase 'Happy Holidays' rendered as a density plot]

Thanks to Samir for this one!

The mathematics is based on a generalisation of Tupper’s self-referential formula.
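
For readers without Maple, here is a rough Python sketch of the same decoding. It is an illustration only: the helper name tupper_bit is mine, the variable n is a placeholder into which the full integer from the post must be pasted, and the printed bitmap may need flipping horizontally or vertically depending on your plotting convention.

# Illustrative Python translation of the Maple decoding above
def tupper_bit(x, y):
    # floor(mod(floor(y/17) * 2^(-17*x - (y mod 17)), 2)) for non-negative integers
    return (y // 17 >> (17 * x + y % 17)) & 1

n = int("18532529...677824")  # placeholder: paste the full integer from the post here

for r in range(16, -1, -1):   # the 17 rows of the bitmap
    y = n + r
    print("".join("#" if tupper_bit(x, y) else " " for x in range(105, -1, -1)))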

There’s more than one way to send a message with an equation, however. Here’s an image of one I discovered a few years ago — The equation that says Hi

December 13th, 2015

Way back in 2008, I wrote a few blog posts about using mathematical software to generate Christmas cards.

I've started moving the code from these posts to a GitHub repository. If you've never contributed to an open source project before and want some practice using git or GitHub, feel free to write some code for a Christmas message along similar lines and submit a Pull Request.

September 5th, 2015

The test suite of a project I'm working on is poking around at the extreme edges of the range of double precision numbers. I noticed a difference between Windows and other platforms that I can't yet fully explain. On Windows, the test suite was pumping out RuntimeWarnings that we don't see on Linux or Mac. I've distilled the issue down to a single NumPy command:

np.log1p(1.7976931348622732e+308)

On Windows 7 with Anaconda 2.3 (Python 2.7), this gives a RuntimeWarning and returns inf, whereas on Linux and Mac OS X it evaluates to roughly 709.78.

The NumPy version is 1.9.2 in all cases.

64 bit Windows 7

Python 2.7.10 |Continuum Analytics, Inc.| (default, May 28 2015, 16:44:52) [MSC
v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://binstar.org
>>> import numpy as np
>>> np.log1p(1.7976931348622732e+308)
__main__:1: RuntimeWarning: overflow encountered in log1p
inf

64 bit Linux

Python 2.7.9 (default, Apr  2 2015, 15:33:21) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.log1p(1.7976931348622732e+308)
709.78271289338397

Mac OS X

Python 2.7.10 |Anaconda 2.3.0 (x86_64)| (default, May 28 2015, 17:04:42) 
[GCC 4.2.1 (Apple Inc. build 5577)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://binstar.org
>>> import numpy as np
>>> np.log1p(1.7976931348622732e+308)
709.78271289338397

The argument to log1p is getting close to the largest double precision number:

>>> import sys
>>> sys.float_info.max
1.7976931348623157e+308
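
If you hit this in your own code, one possible workaround (a sketch, not a fix for the underlying platform difference) is to trap the overflow and fall back to log, which agrees with log1p to double precision for arguments this large. The variable names below are just for illustration:

import numpy as np

x = 1.7976931348622732e+308   # just below the largest double

# For x this big, log(1 + x) and log(x) are equal to double precision, so log(x)
# is a reasonable fallback on platforms whose log1p overflows internally.
with np.errstate(over="raise"):
    try:
        result = np.log1p(x)
    except FloatingPointError:
        result = np.log(x)

print(result)   # approximately 709.78271289338397
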
July 13th, 2015

It is possible to write quick, interactive demonstrations in a variety of languages these days. Functions such as Mathematica’s Manipulate, Sage Math’s interact and IPython’s interact allow programmers to write functional graphical user interfaces with just a few lines of code.
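
For comparison, here is roughly what this looks like in a Jupyter notebook: a minimal sketch using ipywidgets (the IPython interact mentioned above) together with matplotlib, plotting partial sums of the Taylor series of sin(x), the same example explored in Maple below. The function name and slider range are just for illustration.

# A minimal ipywidgets sketch: a slider controlling the number of Taylor terms
from math import factorial

import matplotlib.pyplot as plt
import numpy as np
from ipywidgets import interact

x = np.linspace(-2 * np.pi, 2 * np.pi, 500)

@interact(terms=(1, 10))
def plot_taylor_sin(terms=2):
    # Partial sum of the Taylor series of sin(x) about x = 0
    approx = sum((-1) ** k * x ** (2 * k + 1) / factorial(2 * k + 1) for k in range(terms))
    plt.plot(x, np.sin(x), label="sin(x)")
    plt.plot(x, approx, label="{} term(s)".format(terms))
    plt.ylim(-3, 3)
    plt.legend()
    plt.show()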

Earlier this week, I hosted a session in the Faculty of Engineering at The University of Sheffield where Maplesoft showed us, among other things, their version of this technology. This blog post is an extension of my notes from this part of the session.

The series command expands a function as a power series around a point. For example, let’s expand sin(x) as a power series around the point x=0.

series(sin(x), x = 0, 10)

[Maple output: the series expansion of sin(x) about x = 0 to order 10]

If we try to plot this, we get a warning:

plot(series(sin(x), x = 0, 10), x = -2*Pi .. 2*Pi, y = -3 .. 3)

Warning, unable to evaluate the function to numeric values in the region; see the plotting command's help page to ensure the calling sequence is correct

This is because the output of the series command is a series data structure, which the plot function cannot handle. We can, however, convert it to a polynomial, which the plot function can handle:

convert(series(sin(x), x = 0, 10), polynom)

[Maple output: the Taylor polynomial obtained from the series]

Wrapping the above with plot gives:

plot(convert(series(sin(x), x = 0, 10), polynom), x = -2*Pi .. 2*Pi, y = -3 .. 3);

[Plot of the Taylor polynomial of sin(x)]

Let’s see how close this is to the sin(x) curve by plotting them both together

plot([sin(x), convert(series(sin(x), x = 0, 10), polynom)], x = -2*Pi .. 2*Pi, y = -3 .. 3);

[Plot of sin(x) together with its Taylor polynomial]

It would be nice if we could see how the approximation varies as we vary the number of terms in the expansion. Change the value 10 to a parameter a, pass the whole thing to the Explore function and we get an interactive widget.

Explore(plot([sin(x), convert(series(sin(x), x = 0, a), polynom)], x = -2*Pi .. 2*Pi, y = -3 .. 3), parameters = [a = 2 .. 20]);

Here’s a screenshot of it:

[Screenshot of the Explore widget with a slider for the number of terms]

Adding extra parameters

It would also be nice to vary the point we expand around. Change the value 0 to b and add an extra parameter to Explore to get two sliders instead of one:

Explore(plot([sin(x), convert(series(sin(x), x = b, a), polynom)], x = -2*Pi .. 2*Pi, y = -3 .. 3), parameters = [a = 2 .. 20, b = -2*Pi .. 2*Pi]);

To see what this looks like, open the companion worksheet in Maple.

Adding labels to the sliders

We can change the labels on the sliders as follows:

Explore(plot([sin(x), convert(series(sin(x), x = b, a), polynom)], x = -2*Pi .. 2*Pi, y = -3 .. 3), parameters = [[a = 2 .. 20, label = `Number Of Terms`], [b = -2*Pi .. 2*Pi, label = `Expansion location`]]);

To see what this looks like, open the companion worksheet in Maple.

Adding initial values

Finally, let's set some starting values for each slider:

Explore(plot([sin(x), convert(series(sin(x), x = b, a), polynom)], x = -2*Pi .. 2*Pi, y = -3 .. 3), parameters = [[a = 2 .. 20, label = `Number Of Terms`], [b = -2*Pi .. 2*Pi, label = `Expansion location`]], initialvalues = [a = 2, b = 1]);

The resulting interactive widget looks like this:

[Screenshot of the Explore widget with two labelled sliders at their initial values]

Not bad for one line of code!

Upload to the Maple Cloud

At The University of Sheffield, we are lucky because all of our staff and students have access to Maple on both university-owned and personally-owned equipment. If your audience isn’t as fortunate, they can access the resulting worksheet on the Maple Cloud.

May 22nd, 2015

Update (2nd July 2015): The code on GitHub has moved on a little since this post was written, so I have changed the link in the text below to the exact commit that produced the results discussed here.

Imagine that you are a very new MATLAB programmer and you have to create an N x N matrix called A where A(i,j) = i+j.
Your first attempt at a solution might look like this:

N=2000
% Generate an N-by-N matrix where A(i,j) = i + j
for ii = 1:N
    for jj = 1:N
        A(ii,jj) = ii + jj;
    end
end

On my current machine (a MacBook Pro bought in early 2015), the above loop takes 2.03 seconds. You might think that this is a long time for something so simple and complain that MATLAB is slow. The person you complain to points out that you should preallocate your array before assigning to it.

N=2000
% Generate an N-by-N matrix where A(i,j) = i + j
A=zeros(N,N);
for ii = 1:N
    for jj = 1:N
        A(ii,jj) = ii + jj;
    end
end

This now takes 0.049 seconds on my machine – a speed-up of over 41 times! MATLAB suddenly doesn't seem so slow after all.

Word gets around about your problem, however, and seasoned MATLAB-ers see that nested loop, make a funny face, twitch and start muttering 'vectorise, vectorise, vectorise'. Emails soon pile in with vectorised solutions:

% Method 1: MESHGRID.
[X, Y] = meshgrid(1:N, 1:N);
A = X + Y;

This takes 0.025 seconds on my machine – a healthy speed-up on the loop-with-preallocation solution. You have to understand the meshgrid command, however, in order to understand what's going on here. It's still clear (to me at least) what it's doing, but not as clear as the nice, obvious double loop. Call me old-fashioned but I like loops…I understand them.

% Method 2: Matrix multiplication.
A = (1:N).' * ones(1, N) + ones(N, 1) * (1:N);

This one is MUCH harder to read but you don’t worry about it too much because at 0.032 seconds it’s slower than meshgrid.

% Method 3: REPMAT.
A = repmat(1:N, N, 1) + repmat((1:N).', 1, N);

This one appears to be interesting! At 0.009 seconds, it’s the fastest so far – by a healthy amount!

% Method 4: CUMSUM.
A = cumsum(ones(N)) + cumsum(ones(N), 2);

Coming in at 0.052 seconds, this cumsum solution is slower than the preallocated loop.

% Method 5: BSXFUN.
A = bsxfun(@plus, 1:N, (1:N).');

Ahhh, bsxfun or ‘The Widow-maker function’ as I sometimes refer to it. Responsible for some of the fastest and most unreadable vectorised MATLAB code I’ve ever written. In this case, it brings execution time down to 0.0045 seconds.

Whenever I see something that can be vectorised with a repmat, I try to figure out if I can rewrite it as a bsxfun. The result is usually horrible to look at so I tend to keep the original loop commented out above it as an explanation! This particular example isn’t too bad but bsxfun can quickly get hairy.

Conclusion

Loops in MATLAB aren't anywhere near as bad as they used to be, thanks to advances in JIT compilation, but it can often pay, speed-wise, to switch to vectorisation. The price you often pay for this speed-up is that vectorised code can become very difficult to read.

If you'd like the code I ran to get the timings above, it's on GitHub (the link refers to the exact commit used for this post). Here's the output from the run I referred to in this post:

Original loop time is 2.025441
Preallocate and loop is 0.048643
Meshgrid time is 0.025277
Matmul time is 0.032069
Repmat time is 0.009030
Cumsum time is 0.051966
bsxfun time is 0.004527

  • MATLAB Version: R2015a
  • Early 2015 MacBook Pro with 16GB RAM
  • CPU: 2.8GHz quad-core i7 (Haswell)

This post is based on a demonstration given by MathWorks' Ken Deeley during a recent session at The University of Sheffield.

February 19th, 2015

I've been working at The University of Manchester for almost a decade and will be leaving at the end of this week! A huge part of my job was to support a major subset of Manchester's site-licensed application software portfolio, so naturally I've made use of a lot of it over the years. As of February 20th, I will no longer be entitled to use any of it!

This article is the second in a series where I’ll look at some of the software that’s become important to me and what my options are on leaving Manchester.  Here, I consider MATLAB – a technical computing environment that has come to dominate my career at Manchester. For the last 10 years, I’ve used MATLAB at least every week, if not most days.

I had a standalone license for MATLAB and several toolboxes – Simulink, Image Processing, Parallel Computing, Statistics and Optimization. Now, I’ve got nothing! Unfortunately for me, I’ve also got hundreds of scripts, mex files and a few Simulink models that I can no longer run! These are my options:

Go somewhere else that has a MATLAB site license

  • I'll soon be joining the University of Sheffield, which has a MATLAB site license. A great option if you can do it.

Use something else

  • Octave – Octave is a pretty good free and open source clone of MATLAB and quite a few of my programs would work without modification. Others would require some rewriting and, in some cases, that rewriting could be extensive! There is no Simulink support.
  • Scilab – It’s free and it’s MATLAB-like-ish but I’d have to rewrite my code most of the time. I could also port some of my Simulink models to Scilab as was done in this link.
  • Rewrite all my code to use something completely different. What I’d choose would depend on what I’m trying to achieve but options include Python, Julia and R among others.

Compile!

  • If all I needed was the ability to run a few MATLAB applications I'd written, I could compile them using the MATLAB Compiler and keep the result. The whole point of the MATLAB Compiler is to distribute MATLAB applications to those who don't have a MATLAB license. Of course, once I've lost access to MATLAB itself, debugging and adding features will be, um…tricky!

Get a hobbyist license for MATLAB

  • MATLAB Home – This is the full version of MATLAB for hobbyists. Writing a non-profit blog such as WalkingRandomly counts as a suitable 'hobby' activity, so I could buy this license. MATLAB itself costs 85 pounds, with most of the toolboxes coming in at an extra 25 pounds each. Not bad at all! The extra cost of the toolboxes would still lead me to obsess over how to do things without toolboxes but, to be honest, I think that's an obsession I'd miss if it weren't there! Buying all of the same toolboxes as I had before would end up costing me a total of £210 + VAT.
  • Find a MOOC that comes with free MATLAB – MathWorks make MATLAB available for free to students of some online courses, such as the one linked to here. Bear in mind, however, that the license only lasts for the duration of the course.

Academic Use

If I were to stay in academia but go to an institution with no MATLAB license, I could buy myself an academic standalone license for MATLAB and the various toolboxes I’m interested in. The price lists are available at http://uk.mathworks.com/pricing-licensing/

For reference, current UK academic prices are

  • MATLAB £375 + VAT
  • Simulink £375 + VAT
  • Standard Toolboxes (statistics, optimisation, image processing etc.) £150 + VAT each
  • Premium Toolboxes (MATLAB Compiler, MATLAB Coder etc.) – pricing currently not available

My personal mix of MATLAB, Simulink and 4 toolboxes would set me back £1350 + VAT.

Commercial Use

If I were to use MATLAB professionally and outside of academia, I'd need to get a commercial license. Prices are available from the link above; at the time of writing, they are:

  • MATLAB £1600 + VAT
  • Simulink £2400 + VAT
  • Standard Toolboxes £800 + VAT each
  • Premium Toolboxes – pricing currently not available

My personal mix of MATLAB, Simulink and 4 toolboxes would set me back £7200 + VAT.

Contact MathWorks

If anyone does find themselves in a situation where they have MATLAB code and no means to run it, then they can always try contacting MathWorks and ask for help in finding a solution.

February 2nd, 2015

I've been working at The University of Manchester for almost a decade and will be leaving in just less than three weeks' time! A huge part of my job was to support a major subset of Manchester's site-licensed application software portfolio, so naturally I've made use of a lot of it over the years. As of February 20th, I will no longer be entitled to use any of it!

This article is the first in a series where I’ll look at some of the software that’s become important to me and what my options are on leaving Manchester.

Here, I consider Mathematica – a computer algebra system and technical computing environment that I’m very fond of. I’ve been a Mathematica user for over 15 years and yet, suddenly, I find myself license-less! So much code, so much time invested! What to do?

Options for all use cases

  • Before leaving University, contact the administrator of your site license. It could be that you are entitled to a discount on buying one of the various licenses on offer.
  • Use the CDF Player – With this free tool, you'll be able to look at and interact (at least partially) with Mathematica notebooks.
  • Rewrite all code to use something else. Which language to use is open to massive debate, but the closest open source systems to Mathematica's notebook-like interface are Jupyter (previously IPython) and Sage. The languages are, of course, rather different though!

Hobbyist use

General mucking around!

  • Buy the home edition – The home edition of Mathematica can be used for non-professional and non-academic purposes and, at the time of writing, costs £195 as a one-off purchase or £95 per year.
  • Use Mathematica online: Home – Same rules as the home edition above but it’s a cloud-based, online version. Currently costs £95 per year.
  • Buy a Raspberry Pi – The Raspberry Pi comes with a free version of Mathematica! This means that you can buy an entire computer AND a copy of Mathematica for less than the standard home-use license. I had a play with Mathematica on the Raspberry Pi just over a year ago and it was very nice. Now that the faster, more powerful Raspberry Pi 2 has been released, this option is even more compelling!

Academic use

If you want to use Mathematica in an academic environment that doesn’t have a site license, you’ll need to purchase an individual academic license. At the time of writing, that will cost £860 + VAT.

Professional use

There are various grades of professional license and the cost varies according to how many compute kernels you need or how many Wolfram Alpha API calls you want to make. Current prices start at £2,035 + VAT.

November 11th, 2014

RLink is Mathematica's interface to the R language – a feature that has been extremely popular since its debut in Mathematica version 9. It's a great package but has one or two issues. For example, RLink makes use of a built-in version of R which is currently stuck at the rather old version 2.14. Official support for using external versions of R and for adding third-party libraries varies by operating system and version of Mathematica. Windows support is great; OS X support, not so much.

Expert Mathematica user Szabolcs Horvát has written the definitive guide on how to get RLink up and running with the latest version of R on all three major operating systems, building on earlier work by Leonid Shifrin and members of the Mathematica Stack Exchange community. Thanks to this work, we can now enjoy any version of R we like with Mathematica!