November 23rd, 2015 | Categories: just for fun, natural language, programming, R, The internet | Tags:

A recent trend on Facebook is to create a wordcloud of all of your posts using an external service. I chose not to use it because I tend to use Facebook for personal interactions among close friends and I didn’t want to send all of my data to another external company.

Twitter is a different matter, however! All of the data is open and it’s very easy to write a computer program to generate Twitter word clouds without the need for an external service.

I wrote a simple script in R that generates a wordcloud from the most recent 3200 tweets and outputs the top 200 words (get the code on github). The script removes many of the uninteresting words such as the, of, and that would otherwise dominate the cloud. These stopwords come from the Top100Words list of the R package qdap but I also added a few more such as ‘just’ and ‘me’ that I seem to use a lot.
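
The real script is written in R and lives on github, so what follows is only a rough Python sketch of the same idea: count words across a set of tweets, drop the stopwords and keep the most frequent ones. The stopword set, the sample tweets and the function name are all made up for illustration; they are not taken from qdap or from my script.

```python
import re
from collections import Counter

# Hypothetical stopword list, standing in for qdap's Top100Words plus my extras
STOPWORDS = {"the", "of", "and", "a", "to", "in", "is", "just", "me"}


def top_words(tweets, stopwords=STOPWORDS, n=200):
    """Return the n most frequent non-stopword words as (word, count) pairs."""
    counts = Counter()
    for tweet in tweets:
        # Lower-case the text and keep runs of letters and apostrophes
        words = re.findall(r"[a-z']+", tweet.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts.most_common(n)


# Invented sample data; a real run would use up to 3200 fetched tweets
sample_tweets = [
    "Just wrote some Python to analyse the data",
    "Research software and Python are the future",
    "New data, new Python",
]

print(top_words(sample_tweets, n=3))
```

The word counts are what a wordcloud library would then turn into font sizes. A real version would also need to deal with URLs, mentions and hashtags, which this sketch ignores.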

This is the current wordcloud for my twitter account, walkingrandomly. Click on the image to see a bigger version. My main interests are very clear – Python programming, research software, data and anything that’s new!


Once I had seen my wordcloud, I wondered how things would look for other twitter users who I pay a lot of attention to. This is how it looks for Manchester University’s Nick Higham. Clearly he’s big on SIAM, Manchester, and Matrix Analysis!



I then looked at my manager at Sheffield University, Neil Lawrence. Neil finds data and the city of Sheffield very important and also writes about workshops, science, blog posts and machine learning a lot.



The R code that generated these wordclouds is available on github but it won’t work out of the box. You’ll need to register with twitter for app development (It’s free and fairly straightforward) and get various access keys before you can use the code.

November 19th, 2015 | Categories: Carnival of Math | Tags:

Welcome to the 128th Carnival of Mathematics, the latest in a mathematical blogging tradition that’s been ongoing for over 8 years now!

Facts about 128

It’s said that every number is interesting and 128 is no exception. 128 is the largest number which is not the sum of distinct squares whereas it is the smallest number n such that dropping the first and the last digit of n leaves its largest prime factor (thanks, Number Gossip).

Wikipedia tells us that it is divisible by the total number of its divisors, making it a refactorable number. Additionally, 128 can be expressed by a combination of its digits with mathematical operators thus 128 = 2^(8 – 1), making it a Friedman number in base 10.
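
These facts are easy to check by brute force. The Python below is a minimal sketch with helper names of my own choosing. It confirms the properties for 128 itself and that no number up to an arbitrary limit of 1000 beats it as a non-sum of distinct squares; it does not prove the 'largest' or 'smallest' parts of the claims in general.

```python
def divisors(n):
    """All positive divisors of n (brute force is fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def largest_prime_factor(n):
    """Largest prime factor of n >= 2, found by trial division."""
    largest, p = None, 2
    while n > 1:
        while n % p == 0:
            largest = p
            n //= p
        p += 1
    return largest


def sums_of_distinct_squares(limit):
    """Set of totals <= limit reachable by adding distinct positive squares."""
    reachable = {0}
    k = 1
    while k * k <= limit:
        square = k * k
        # The right-hand side is built from the old set, so each square
        # is used at most once in any total
        reachable |= {r + square for r in reachable if r + square <= limit}
        k += 1
    reachable.discard(0)  # the empty sum does not count
    return reachable


n = 128
divisor_count = len(divisors(n))          # divisors are 1, 2, 4, ..., 128
is_refactorable = n % divisor_count == 0  # divisible by its divisor count
is_friedman = 2 ** (8 - 1) == n           # built from the digits 1, 2 and 8
inner_digits = int(str(n)[1:-1])          # drop the first and last digit

limit = 1000  # arbitrary search limit for this sketch
reachable = sums_of_distinct_squares(limit)
not_sums = [k for k in range(1, limit + 1) if k not in reachable]

print(divisor_count, is_refactorable, is_friedman)
print(inner_digits == largest_prime_factor(n))
print(max(not_sums))
```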

128 was also the number of kilobytes of memory available in the magnificent computer shown below.

© Bill Bertram 2006, CC-BY-2.5 — Attribution


The Princeton Companion to Applied Mathematics
I recently received a copy of The Princeton Companion to Applied Mathematics and it’s just beautiful, definitely recommended as a Christmas gift for the maths geek in your life. The companion’s editor, Nick Higham, has written a few blog posts about it – Companion authors speaking about their work, Famous Mathematicians and The Princeton Companion and How to Use The Princeton Companion to Applied Mathematics.

We have a lot of problems, and that’s a good thing
‘Diane G’ submitted this advanced knowledge problem — great practice for advanced mathematics. This blog is amazing and posts practice problems every Monday and advanced problems every Wednesday.

Linear Programming
Laura Albert McLay of Punk Rock Operations Research (great blog title!) submitted two great posts: Should a football team run or pass? A game theory and linear programming approach and dividing up a large class into discussion sections using integer programming

Francisco Yuraszeck submitted 10 Things You need to know about Simplex Method saying ‘This article is about the basics concepts of Linear Programming and Simplex Method for beginers in Operations Research.’

Stuart Mumford demonstrates various ways of computing the first 10,000 numbers in the Fibonacci Sequence using Python — and some are much faster than others. Laurent Gatto followed up with a version in R.
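
To give a flavour of why the approach matters, here is a minimal sketch of my own (not Stuart’s or Laurent’s code) comparing a simple iterative build of the list with the naive recursive definition. Starting the sequence at 0, 1 is my assumption; some people start at 1, 1.

```python
def fib_list(count):
    """First `count` Fibonacci numbers, starting 0, 1, built iteratively."""
    numbers = []
    a, b = 0, 1
    for _ in range(count):
        numbers.append(a)
        a, b = b, a + b
    return numbers


def fib_recursive(n):
    """n-th Fibonacci number by naive recursion: correct, but exponential time."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)


first = fib_list(10000)  # linear number of big-integer additions
print(len(first), first[:10])

# The recursive version agrees, but is only practical for small n
print(fib_recursive(20) == first[20])
```

The iterative version does one addition per number, whereas the naive recursion recomputes the same values over and over, which is why it would never get anywhere near 10,000.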

Cleve Moler, the original developer of MATLAB, looks at three algorithms for finding a zero of a function of a real variable:

Michael Trott of Wolfram Research looks at Aspect Ratios in Art: What Is Better Than Being Golden? Being Plastic, Rooted, or Just Rational? Investigating Aspect Ratios of Old vs. Modern Paintings

Andrew Collier explores Fourier Techniques in the Julia programming language.


The Numerical Algorithms Group’s John Muddle looks at solving The Travelling Rugby Fan Problem.

Robert Fourer gives us two articles on Quadratic Optimization Mysteries: Part 1 and Part 2. These are posts concerned with computational aspects of mathematical optimization, and specifically with the unexpected behavior of large-scale optimization algorithms when presented with several related quadratic problems.

Why Was 5 x 3 = 5 + 5 + 5 Marked Wrong
This image went viral recently

It generated a LOT of discussion. Brett Berry takes a closer look in Why Was 5 x 3 = 5 + 5 + 5 Marked Wrong.


Katie Steckles submitted an article that analyses the different visual themes explored by M.C. Escher in his artwork

Shecky R writes about our curious fascination with eccentric and top-notch mathematicians in Pursuing Alexander.

Brian Hayes has been Pumping the Primes and asks “Should we be surprised that a simple arithmetic procedure–two additions, a gcd, and an equality test–can pump out an endless stream of pure primality?”
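
One well-known recurrence that fits this description is Rowland’s: start with a(1) = 7 and set a(n) = a(n-1) + gcd(n, a(n-1)). Rowland proved that every step increases a by either 1 or a prime. I’m assuming, rather than asserting, that this is the procedure the post has in mind; the Python below is my own sketch with my own function names.

```python
from math import gcd


def rowland_primes(steps, start=7):
    """Yield the differences greater than 1 from Rowland's recurrence.

    a(1) = start and a(n) = a(n-1) + gcd(n, a(n-1)); `steps` is the number
    of recurrence steps to run, beginning at n = 2.
    """
    a = start
    for n in range(2, steps + 2):  # one addition to advance n
        g = gcd(n, a)              # the gcd
        if g != 1:                 # the equality test
            yield g
        a += g                     # the other addition


def is_prime(k):
    """Trial-division primality check, good enough for a sanity test."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


found = list(rowland_primes(10000))
print(found[:6])
print(all(is_prime(p) for p in found))
```

The pump is not an efficient one: most steps produce a 1, and the same small primes turn up again and again.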

Next time

Carnival of Maths #129 will be delivered by the team at Ganit Charcha. Head over to the main carnival website for more details.

October 16th, 2015 | Categories: Carnival of Math | Tags:

The Carnival of Mathematics has become something of a tradition in the mathematics blogging community. It’s been going since 2007 which makes it an old-timer in internet terms. The carnival is hosted by a different blogger every month and next month it’s my turn again! If you’d like to see what one of my Carnival of Maths blog posts looks like, take a look at how I did it in the past.

I’ll be hosting carnival number 128 so if you’d like your mathematics blog article featured please send me a submission using this form.

October 15th, 2015 | Categories: walking randomly | Tags:

“It’s OK for you, you never really fail at anything,” my friend accused me as I was trying to convince her to apply for a hotly-contested promotion.

This left me a little stunned because I fail all the time. Not just occasionally, not just with small-time stuff but ALL. THE. TIME! Sometimes I fail so hard that the sense of loss, of failure that I feel is almost physical.

I wanted to set the record straight…give an idea of how I fail in the large and the small. So I told her of some of the huge bucket of fail that is me. Bear in mind that this is by no means a complete list…this is just some of the stuff that I feel comfortable talking about!


Before I started my PhD, I attempted to complete a PGCE (Qualification required to become a school teacher in the UK). I crashed and burned halfway through the course and dropped out. I don’t remember a time when I was more miserable in my professional life! I still find it difficult to think about that year and even more difficult to talk about it.

After my PhD, I applied for dozens of jobs, both academic and commercial, before I landed the job that changed my life at The University of Manchester. During my time at Manchester I failed to be promoted several times before I finally achieved it.

Now I’m at The University of Sheffield and things are better than they’ve ever been! I have a truly wonderful job! I tried to get here several years ago, however, and you guessed it…I failed (Worst. Interview. Ever!)


To program, to sysadmin is to fail…often. I try something, it fails. I try something else, it fails. On and on it goes until success is mine. Sometimes I never succeed. Researchers often come to me asking if I can speed up their code and sometimes, after several days of intense effort, I report back with a resigned ‘No. Sorry! – Here’s a list of things that don’t work’.

When I write code, I consider failure to be so likely that I assume that I definitely will fail (See Croucher’s law in this talk). I engineer my working practices to get past the inevitable failures as quickly as possible.

Physical failures

At the beginning of this article I said the feeling of failure can feel almost physical. Sometimes it IS physical such as the time I fought in the TAGB Semi-Contact Taekwondo World Championships (long time ago!) and had my ass neatly handed to me by the Canadian national champion! I subsequently failed to eat properly for 5 days straight thanks to the hardest left hook I’ve ever experienced in my life (semi-contact kinda goes out of the window at that level).

I gave everything I had in that fight and lost! When I tell people the story, however, I don’t dwell on the fact that I lost. I concentrate on the fact that I had the chops to be there. I remember the bit where I kicked him off his feet, I remember being knocked to the ground and feeling afraid of getting back up, of going back in but doing it anyway. I remember that one of the 4 judges awarded the fight to me so I couldn’t have done THAT badly. I remember my opponent hugging me after the fight and telling me that it had ‘been awesome’. I’m proud of that fight!

My Taekwondo days are far behind me now and, these days, I get my physical kicks by lifting weights. I concentrate on the major lifts such as Squat, Deadlift, Press and Bench Press. The aim is to always lift more and I still consider myself a novice. Most of the time, when I increase the weight, I fail. Failure is much more likely than success since I am working at the limit of my strength. The successes feel amazing though…to be able to say I’m stronger than I’ve ever been and to have numerical evidence…good times!

Future failures

I currently have several projects on the go – some personal, some professional — some large and some small. I fully expect to succeed with some of them and fail miserably in others. Some of the failures are going to hurt! A lot!

I’m waiting on the outcome of a major endeavour for me (EPSRC Research Software Engineering Fellowship) — something for which failure is highly likely since the competition is so fierce. I’ll not deny that I’m afraid but I’m also excited.

What has failure taught me?

When I reflect on my many failures, only a tiny subset of which are given above, I note that there are some common threads.

Success (and failure) is often half-chance. I always suspected that this was the case and now that Neil Lawrence has done his NIPS experiment, we have data to back it up.

“whatever you do, don’t congratulate yourself too much or berate yourself either – your choices are half chance, so are everybody else’s.” Don’t forget the sunscreen

I fail a lot! Over time this has taught me that, although it hurts, the pain is rarely permanent. Over time, you get desensitised to it. Not so much that you become careless — you still prefer to avoid it — but enough to stop being afraid all the time. I’ve failed before, it’s not so bad!

My sense of identity is dominated more by my successes than my failures. I don’t dwell on my failures too much. I might allow myself to wallow in self-pity for a short time following something big but at some point I’ll channel my inner Pratchett (“If failure had no penalty success would not be a prize”), see what I can learn from the failure and move on. My successes, on the other hand, stay with me for a long time! I don’t regret my failures but I relish my successes. This is probably why some people, such as my friend, think I don’t fail all that much. I rarely tell the stories unless I can spin them to make my failures look like a success.

People respect you more when you own up to failure. In my career, I’ve worked with a huge number of risk-averse people who are more interested in ensuring that no one thinks a failure is their fault than they are in fixing the failure. I try to put my hand up and say ‘My bad! Really sorry! I’m working on fixing it. Feel free to deliver me a kicking if you think I deserve it.’

I find that, rather than delivering said-kicking, most people choose to appear alongside me, shovel in hand and we dig ourselves out of the hole together.

I succeed a lot! I put myself ‘out there’ a lot. On this blog, in my job, with my friends. I try things that other people might shy away from, I take risks. I fail.

I also succeed! The net result has been a wonderful wife, a great group of friends, fantastic job, good fitness…great life.


A CV of failure – The Nature article that inspired me to write this blog post

October 13th, 2015 | Categories: Open Data Science, RSE, Scientific Software | Tags:

The Sheffield Open Data Science Initiative

The University of Sheffield Open Data Science Initiative (ODSI) is really starting to take off. So what is it?

From the website, the aims of the ODSI are:

  1. Make new analysis methodologies available as widely and rapidly as possible with as few conditions on their use as possible (see the ML@SITraN group software pages and the local software page).
  2. Educate our commercial, scientific and medical partners in the use of these latest methodologies (see
  3. Act to achieve a balance between data sharing for societal benefit and the right of an individual to own their data. (see our summary of our efforts in public understanding and debate)

My role within this initiative is to work on various aspects of research software throughout the University of Sheffield (and beyond!). I am a fellow of the Software Sustainability Institute and you could sum up everything I try to do with their motto Better Software, Better Research.

Join us on October 20th 2015

We have just started a programme of events which aims to bring together a wide variety of people interested in data, machine learning and research software (my favourite part!). The first such event is at The Data Hide on October 20th 2015 at the University of Sheffield.

There will be a talk on Research Data Management for Computational Science by @ctjacobs_uk as well as lightning talks: What Kind of AI are we Creating? by @lawrennd, Machine Learning for Chemical Simulations by Chris Handley and a demonstration of how great Reveal.js is by me.

This will be followed by food, beer and an opportunity to chat and geek out.

We would be honoured if you would join us.

Would you like to present at a future event?

Contact me to see what we can do together.

October 7th, 2015 | Categories: Apple, Guest posts | Tags:

This is a guest article written by friend, ex-colleague and keen user of Apple and Dropbox products, Ian Cottam of Manchester University’s IT Services.

Twice recently I have been bitten by being an early adopter of new software releases. Of course, I partly do this so colleagues at The University of Manchester don’t have to. Further, I have three Apple Macs and only update my least used one initially.

The two updates that bit me are: Mac OS X 10.11.0 El Capitan and the new Teams feature of Dropbox. My advice is not to use either of these updates (yet), unless you really need to. Interestingly, I persevered and have stayed with the Teams feature of Dropbox; but have uninstalled OS X El Capitan by reverting to 10.10 Yosemite from a Time Machine backup.

Let’s do OS X El Capitan first.

This operating system update experience is the worst I can remember, and I have a long memory. What the legions of beta testers were doing I cannot imagine. To be fair, I expect many of them reported issues to Apple and just assumed Apple would not release to the public at large before they were fixed. I know I thought Apple would not release an OS that would kernel panic for many users on boot up. How wrong we all were.

A major, low-level change that Apple made for El Capitan was in the area of security; that is: kernel extensions, as used by some third party applications, have to be digitally signed to be acceptable. So far, so sensible. Some examples that I use include: VirtualBox, VMware, ncryptedCloud, github.osxfuse and Avatron. To see if you use any third party ones too, you can type the following into Terminal:

kextstat | grep -v

I expect over time all of the above will be updated to be digitally signed. However, at the time of writing some of them are not. Now quite why El Capitan does not log such unsigned extensions and ignore them on boot up I do not know. Instead you get a kernel panic and your Mac has become a brick. Well, not quite a brick as you can boot into safe mode (where extensions are not loaded) and try and fix things. But which ones are causing the problem? Not to mention that I would like to continue working with most, if not all, of them.

Googling shows that VirtualBox before version 5 and possibly nCrypted Cloud can ‘cause’ the kernel panic; and I have several colleagues who persevered with updating to El Capitan by removing VirtualBox 4. I went back to Yosemite. (The fact that my Time Machine backup wasn’t as up to date as I would have liked is purely my fault. My other two Macs back up to Time Machine drives automatically, but my new Macbook waits for an external USB drive to be plugged in. I did try a restore from one of the other Time Machine drives that was 100% up to date, but sadly the results were poor – e.g. the display driver – and I re-started the process from the specific but slightly out of date back-up.)

If you think that you can get around or live with this issue, I would further caution you to Google for Microsoft Office Problems El Capitan, to be further shocked. Ditto: Microsoft Outlook 2011 Problems El Capitan, if you are, like me, an Outlook/Exchange user.

I should repeat that some of my colleagues have updated to El Capitan and have not hit my problems or have worked around them.

Now on to Dropbox and its new Teams feature.

Teams is Dropbox’s way of bringing some of the advantages of Dropbox for Business to Dropbox Basic (the free version with quite limited storage) and Dropbox Pro (paid for, giving either 1TB or 2TB storage limits). Dropbox’s reason for doing this is so more people will update to Dropbox for Business (unlimited storage and greater admin control, etc.).
My first reason for being wary of Teams is that any member of a Team – not just the lead – can press the Update to Dropbox for Business button at any time, committing and converting all in the Team. Now no doubt you can back out after a short trial period, but I expect it is a messy business (no pun intended).

What does the Teams feature offer? The first is to create a team or group list, such that when sharing appropriate folders you don’t need to list everyone in the team every time. Similarly, when a new team member starts, and once their email address is added to the group list, they will get copies of all the existing shared folders. Good feature. I created a Team with just me in it for initial testing.

The other feature of Teams – the one I was most interested in – is the ability to have separate Work and Personal dropboxes under a single account on your Mac, PC or Linux box. If you are coming to Dropbox fresh, I would say go ahead and set up this feature, as it’s extremely handy to keep work and personal stuff completely separate. However, I had a mixture of work and private folders totalling some 120GB: that’s quite a bit to separate out. Now you face the decision as to whether the dropbox you currently have becomes your business one or your personal one. As I pay for Dropbox Pro myself I thought this fairly arbitrary and chose to keep my existing account as the business one. Your mileage may vary but I discovered I had more personal stuff in my dropbox than business, so the other way around might have saved me quite some time.

Your actual folder is renamed from “Dropbox” to “Dropbox (Your Choice of Business Name)”. As many folk have moaned about that, they also create a link to it with the old name of “Dropbox”. The next step is to set up, in my case, a new personal account (using a different email address) and link it to my main one. This is fairly straightforward. It’s also well done how easily you switch between your business and personal boxes. If like me, you have a 1TB account, that amount is shared between the two, although it is not obvious at first and you may get the odd message about the small size of your new dropbox, which you can safely ignore.

The thing that took a long time was copying all the personal stuff from what is now my business account to the personal one. You have to use the Dropbox desktop client for this. If a given sub-folder is not a share, you can try just dragging it over. I realised though that many of mine were shared. The only safe, if tedious, route I could find was to add my new personal identity to these shares; then transfer ownership – assuming I owned the share – and finally after the sync finished, remove my original identity from the share, unticking the box that says Keep a Copy. There might be an easier way: I hope so. I had a lot of shares to work through, and of course you are doing this on just one of the machines you own.

When done, I set up my second Mac, which was fairly straightforward and eventually everything synchronised to the right place. The big issue I had was with a third Mac that was further out of step. You guessed it: it was the one I had restored to Yosemite as mentioned in the first half of this blog. In such circumstances, Dropbox thinks you want to put lots of the folders back in to the original and now business identity. You can imagine how long that took, and how long it took me to unwind it back to how I had it with the two other Macs.

Perhaps I was both unlucky and made a bad choice or two, but be warned: for existing users with a complex dropbox setup this is rather painful to go through. I’m glad I did, but don’t find it easy to recommend to colleagues (yet).

Dropbox describe Teams here

Be careful out there, and don’t be an early adopter unless you are prepared for the pain.

October 5th, 2015 | Categories: RSE | Tags:

I was recently invited to give a talk at a Machine Learning in Personalised Medicine summer school in Manchester and decided to expand my blog post Is Your Research Software Correct? into a full 1 hour presentation.

I’m happy to say that it was extremely well received – both by many people at the event and by a few people on twitter.

Here are the slides

September 5th, 2015 | Categories: math software, Numerics, python | Tags:

The test suite of a project I’m working on is poking around at the extreme edges of the range of double precision numbers. I noticed a difference between Windows and other platforms that I can’t yet fully explain. On Windows, the test suite was pumping out RuntimeWarnings that we don’t see on Linux or Mac. I’ve distilled the issue down to a single numpy command:

np.log1p(1.7976931348622732e+308)

On Windows 7 with Anaconda 2.3 (Python 2.7.10), this gives a RuntimeWarning and returns inf whereas on Linux and Mac OS X it evaluates to 709.78-ish.

Numpy version is 1.9.2 in all cases.
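
As a sanity check on what the answer ought to be, here is a minimal sketch that uses only Python’s standard library rather than numpy. That is a deliberate substitution on my part: math.log1p is not the same code path as np.log1p, so this may well not reproduce the Windows warning. It only shows that the argument is still a finite double and that, for an x this large, log1p(x) should agree with log(x).

```python
import math
import sys

x = 1.7976931348622732e+308    # the argument from the test suite
largest = sys.float_info.max   # largest finite double on this platform

# Relative gap between x and the largest double: tiny, but not zero
headroom = (largest - x) / largest

# For a huge x, adding 1 is lost to rounding, so log1p(x) should match log(x)
via_log1p = math.log1p(x)
via_log = math.log(x)

print(x < largest, headroom)
print(via_log1p, via_log)
```

Since x is representable and the mathematical result is only around 709.78, nothing here needs to overflow, which is what makes the Windows behaviour surprising.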

64 bit Windows 7

Python 2.7.10 |Continuum Analytics, Inc.| (default, May 28 2015, 16:44:52) [MSC
v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: and
>>> import numpy as np
>>> np.log1p(1.7976931348622732e+308)
__main__:1: RuntimeWarning: overflow encountered in log1p

64 bit Linux

Python 2.7.9 (default, Apr  2 2015, 15:33:21) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.log1p(1.7976931348622732e+308)

Mac OS X

Python 2.7.10 |Anaconda 2.3.0 (x86_64)| (default, May 28 2015, 17:04:42) 
[GCC 4.2.1 (Apple Inc. build 5577)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: and
>>> import numpy as np
>>> np.log1p(1.7976931348622732e+308)

The argument to log1p is getting close to the largest double precision number:

>>> import sys
>>> sys.float_info.max

August 21st, 2015 | Categories: walking randomly | Tags:

I’m a geek and love hanging out with other geeks. There are many different flavours of geek: gym geeks, food geeks, beer geeks, math geeks, science geeks, greyhound geeks…you get the idea! There’s a quote attributed to Simon Pegg that sums up geekdom perfectly for me:

“Being a geek is all about being honest about what you enjoy and not being afraid to demonstrate that affection. It means never having to play it cool about how much you like something. It’s basically a license to proudly emote on a somewhat childish level rather than behave like a supposed adult. Being a geek is extremely liberating.” Simon Pegg

Pure, unbridled enthusiasm is extremely infectious. Since I work at a University, I am extremely fortunate in that I often find myself on the receiving end of a combination of outrageous optimism and advanced-geek enthusiasm – a powerful combination that results in people changing the world.

Long ago, I discovered that you could learn a lot about the world simply by sharing a coffee or beer with a geek or two and finding out what they think is cool. Keep a note of any google-able phrases they come up with and follow up later. Much more fun, and probably more efficient, than sitting through dozens of conference talks.

Here are a few computer-based things I learned from such conversations this week:


July 30th, 2015 | Categories: Windows | Tags:

Due to the demands of my job and the fact that I like shiny new technology, I’m pretty much operating system agnostic these days. I find myself flitting between Windows, Linux, Mac OS X, Android and iOS on a regular basis and find them all delightful and head-smackingly frustrating in equal measure.

One of my geeky guilty pleasures is taking some time out to kick the tyres of a new operating system and so I’m having a lot of fun with Windows 10 right now. I tend not to play with preview builds so all of this is new to me.

The Windows command prompt hasn’t seen much love in decades and yet it’s so important to the work I do. In Windows 10, it’s received a much needed update. Right at the top of the list are improvements to copy and paste. In older versions of Windows, this is my workflow when I go to a new machine:

  • Using CTRL-V, try to copy and paste into the command line. It doesn’t work. ^V appears instead.
  • Sigh and mutter to yourself. Right click on the top of the window. Choose properties. Enable Quick Edit mode.
  • Press CTRL-V again. ^V appears. Press it a few more times ^V^V^V^V
  • Remember that in cmd.exe, unlike all other applications in Windows, the way to paste is to right click. Mutter again. Get on with life.

In Windows 10, quick edit mode is enabled by default and CTRL-V just works. Happy days!

There’s a whole host of other improvements including word-wrap, transparencies, the ability to resize the window and more. I feel like the Windows command prompt has taken its first step into a larger world.

Microsoft have also set up a discussion forum for the future of the Command Prompt.