## EPSRC Research Software Engineering Fellow: Mike Croucher

April 18th, 2016

In this article, I find myself in the rather odd position of interviewing myself as part of my series of interviews with the new cohort of EPSRC Research Software Engineering Fellows.

Could you tell us a little about yourself and how you became a Research Software Engineer?

I completed a PhD in theoretical physics in 2005 at The University of Sheffield where my area of research was photonic crystals. The most important thing I learned during my PhD is that I was a lot better at solving computational problems than I was at Physics. In particular, I seemed to be a lot better at solving other people’s problems rather than inventing and solving my own.

This led me to consider a job at The University of Manchester in the centralised Applications Support Team with IT Services. This team looked fantastic! Its role was to support Manchester’s extensive site-licensed application portfolio – MATLAB, Mathematica, Intel Compilers, SPSS, Abaqus… that sort of thing. It included aspects of licensing, sysadmin, installation support, high performance computing, consultancy, teaching – you name it! Sadly, 6 months after I started at Manchester, there was the first of many IT department restructures and the team was disbanded.

The centralised service was devolved into the faculties (it’s centralised again now!) and I ended up in the faculty of Engineering and Physical Sciences. I took the responsibility of supporting a portfolio of applications with me. Broadly speaking, I became ‘the MATLAB guy’ at Manchester but also started extending support into the open source world – Python, R, Octave and so on. This was done organically: I went where the work took me. If lots of academics came with problems in foo, I learned about foo, solved problems in foo and started teaching how to do foo better. If ‘foo’ happened to be distasteful to me – such as VBA – well, tough! To be successful in support, you must do the job that’s put in front of you. It’s a little like an Accident and Emergency department.

Very early on, I learned that the path to a researcher’s heart was speed. They wanted results and they wanted them fast! There was a team at Manchester doing hardcore Fortran/MPI stuff and they had that market sewn up — there was no space for the likes of me there. So, I took advantage of being ‘The MATLAB guy’ and started offering programming services to whoever wanted them for free. I could often achieve 100x speed-ups with less than a day’s work and this made me pretty popular!

I got to see a lot of code and that’s where the problems started. I’d get code with names like phdresults_dec2006_ver12_broken_fixed_FORMIKE.m that wouldn’t run on my machine. I’d learn that my machine was the second machine it had ever been run on and I was the second person to ever see the code. There would be 1000s of lines of code with no version control and no tests which made refactoring scary!

I learned that a huge amount of computational research was being done inefficiently. I feel that I need to be very clear: when I say ‘inefficient’, I’m not referring to Fortran code, say, that’s not making optimal use of the cache, SIMD instructions or has poor scaling over 128 cores. When I was young, this is where I thought the work would be. Sure, there’s some, but that’s not where you can help the most people.

A much more common situation, I find, is a researcher whose workflow includes a huge amount of manual work: PhD students (and in one notable case, a very senior professor!) who manually edit 100s of spreadsheets, for example. A morning’s work spent on some simple automation can completely change their lives!
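To give a flavour of what ‘simple automation’ can mean, here is a minimal Python sketch. It is purely illustrative and not taken from any real project: the file names, the `reading` column and the helper functions are all made up. It replaces ‘open each spreadsheet and work out the average by hand’ with a loop over every CSV file in a folder.

```python
import csv
import tempfile
from pathlib import Path


def column_mean(csv_path, column):
    """Return the mean of one numeric column in a CSV file."""
    with open(csv_path, newline="") as handle:
        values = [float(row[column]) for row in csv.DictReader(handle)]
    return sum(values) / len(values)


def summarise_folder(folder, column):
    """Apply column_mean to every .csv file in a folder, in name order."""
    return {path.name: column_mean(path, column)
            for path in sorted(Path(folder).glob("*.csv"))}


def write_demo_files(folder):
    """Create two tiny made-up spreadsheets so the sketch runs anywhere."""
    demo = {"run1.csv": [1.0, 2.0, 3.0], "run2.csv": [10.0, 20.0]}
    for name, readings in demo.items():
        with open(Path(folder) / name, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["reading"])  # header row
            writer.writerows([value] for value in readings)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        write_demo_files(folder)
        for name, mean in summarise_folder(folder, "reading").items():
            print(f"{name}: mean reading = {mean}")
```

The same pattern scales from two files to hundreds without any extra effort. A real version would need to decide what to do with empty or malformed files, which this sketch does not handle.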

These experiences got me interested in how to improve the general level of software engineering practice in research. I became a Software Sustainability Institute fellow in 2013, discovered this huge, amazing community and the rest is history.

You recently left Manchester University to move to Sheffield? What was behind that?

Prof. Neil Lawrence met me in The Sheffield Tap one evening and said ‘How would you like to ditch your commute and change the world?’  He was interested in bringing some of the research software initiatives I’d worked on in Manchester over to Sheffield as part of his Open Data Science initiative.

Changing the world sounded interesting but he had me at ‘ditch the commute’ if I’m being honest. I’ve lived in Sheffield for 20 years and commuted to Manchester by train for 10 of those. On some days, my Twitter feed @walkingrandomly degenerated into little more than rants against various train companies! I needed to stop.

Working with Neil and the University of Sheffield has been an amazing experience. There’s a vibrancy here that’s infectious and a strong desire to do better in Research Software. That Sheffield won 2 of the 7 Research Software Engineering fellowships on offer was like a dream come true. The other Sheffield RSE fellow is Paul Richmond and we’ve joined forces to provide the strongest research software service we can to The University of Sheffield.

What do you think is the role of a Research Software Engineer?

I’m going to lift the answer to this straight out of my fellowship application.

Technological development in software is more like a cliff-face than a ladder – there are many routes to the top, to a solution. Further, the cliff face is dynamic – constantly and quickly changing as new technologies emerge and decline. Determining which technologies to deploy and how best to deploy them is in itself a specialist domain, with many features of traditional research.

Researchers need empowerment and training to give them confidence with the available equipment and the challenges they face. This role, akin to that of an Alpine guide, involves support, guidance, and load carrying. When optimally performed it results in a researcher who knows what challenges they can attack alone, and where they need appropriate support. Guides can help decide whether to exploit well-trodden paths or explore new possibilities as they navigate through this dynamic environment.

These guides are highly trained, technology-centric, research-aware individuals who have a curiosity driven nature dedicated to supporting researchers by forging a research software support career. Such Research Software Engineers (RSEs) guide researchers through the technological landscape and form a human interface between scientist and computer. A well-functioning RSE group will not just add to an organisation’s effectiveness, it will have a multiplicative effect since it will make every individual researcher more effective. It has the potential to improve the quality of research done across all University departments and faculties.

Are there any downsides to being a Research Software Engineer?

Something I’ve learned from conducting these interviews is that there are several different types of ‘Research Software Engineer’. We are not a ‘one size fits all’ community! I think that one thing we all have in common is that we don’t fit the normal ‘money-in, papers-out’ model of many academics. This was brought up in Louise Brown’s interview and it strongly resonates with me. This situation makes it difficult for us to follow an academic-like career path.

It is extremely difficult, for example, to get promoted as a research programmer without attempting to become something you are not. Worse, it’s difficult to simply get a permanent job! Many RSEs are on short term contracts with low salaries. In short, you get much of the grief of working in academia without any of the benefits. Little wonder, then, that many of the best in the community choose to work in industry.

An alternative path for RSEs is to work in the University IT department. It’s the path that I took for example. This solves the short term contract issue but brings with it a whole new set of problems. Many IT managers simply don’t understand the value that an RSE can bring to a University. You can sum the issue up with the observation ‘Academics rarely complain to the head of IT that there’s no one around who can optimise their MATLAB code but they complain very quickly when MATLAB doesn’t work on the University managed desktop’. So, guess what I’d get assigned to?

We’ve established that RSEs aren’t ‘normal academics’ and they aren’t ‘normal IT support’ either so where do we fit? I’m trying to help figure that out and help provide an environment where RSEs can not only exist but thrive.

You’ve recently won an EPSRC RSE Fellowship! Can you give a brief overview of your project?

I aim to improve the research software of the ‘long tail scientist’. This term, attributed to Jim Downing of the Unilever Centre for Molecular Informatics, refers to the large number of small research units who perform a huge amount of research. Often, these small research units only have one or two people in them. They aren’t “big science” but there are LOTS of them!

Much of this research involves the generation of code by relatively untrained and inexperienced programmers. This code can benefit greatly from input by RSEs. An experienced RSE can, with relatively little effort, significantly enhance the quality and efficiency of such code whilst simultaneously providing training for the researcher who wrote it. For examples of what I mean, see my Testimonials page.

I will improve scientific software efficiency, sustainability and reproducibility at the University of Sheffield, by working alongside researchers on their research code in a consultative manner. Rather than working prescriptively, my approach is based on offering and implementing a series of nudges. Nudges are interventions that alter people’s behaviour in a predictable way without forbidding any options. In the context of research software, example nudges might include ‘Learn to write idiomatically in your language of choice – it can lead to faster execution’, ‘See how unit tests allow us to make changes with confidence’ or ‘Using version control, we always know which code produced what result’.
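As a small, hypothetical illustration of the first two nudges (the function names and data are invented for this sketch), here is the same calculation written twice in Python: once as an index-based loop and once idiomatically, with a pair of tests that let us swap one for the other with confidence.

```python
def sum_of_squares_loop(values):
    """Index-based loop, as a newcomer from another language might write it."""
    total = 0.0
    i = 0
    while i < len(values):
        total = total + values[i] * values[i]
        i = i + 1
    return total


def sum_of_squares_idiomatic(values):
    """The same calculation using a generator expression and built-in sum."""
    return sum(v * v for v in values)


def test_known_value():
    # A case small enough to check by hand: 1 + 4 + 9 = 14
    assert sum_of_squares_idiomatic([1, 2, 3]) == 14


def test_rewrite_matches_original():
    # The rewrite must give the same answer as the code it replaces
    data = [0.5, -2.0, 3.25, 7.0]
    assert abs(sum_of_squares_loop(data) - sum_of_squares_idiomatic(data)) < 1e-12


if __name__ == "__main__":
    test_known_value()
    test_rewrite_matches_original()
    print("all tests passed")
```

Whether the idiomatic version is actually faster depends on the language, the data and the machine, so it is worth measuring (for example with Python’s `timeit` module) rather than assuming. The tests are what make the rewrite safe to try in the first place.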

The gulf between the computing scientific “elite” and those emailing spreadsheets is growing and I aim to close that gap.  One researcher I worked with recently said ‘You provide the next step after we’ve been on a Software Carpentry course.’ and I think that describes what I’m trying to do quite well.

How long did it take you to write your Fellowship application? Any other thoughts/advice on the application process?

Writing my fellowship application was one of the most difficult writing exercises I’ve ever undertaken! It took just over a month to write and during that time I did very little else. It took up my days, my evenings, my weekends, my every waking thought. It even consumed my dreams. It was exhausting!

Something that surprised me was the number of people who I needed to help make it happen. Fellowships are often made out to be very individual things but my application involved work by over 40 people! This includes university administrators at all levels, project partners, advisors and mentors. I had to navigate areas of University life and systems that were completely alien to me. There is no way I could have done it alone.

It is essential to get institutional support for your application. At the most basic level, you need a manager who is happy for you to go AWOL for a few weeks. At a higher level, you need to be able to demonstrate to the funding body that your University is fully behind you and your project.

You also need to be emotionally resilient. I poured my heart and soul into my first draft and the feedback from one of my advisors was ‘Well, you solved the blank-page problem.’ That was the only positive thing she had to say! Everything else was a tearing apart. It was brutal! I think I might have cried a little.

Every time I did a rewrite, my mentors found more weaknesses and beat up on me a little. This feedback was essential and made the application so much stronger. As such, I think one piece of advice I’d give is ‘Find mentors you trust who are going to be crushingly hard on you’.  It’ll hurt but nowhere near as much as the comments of Reviewer 2 ;)

Who are your project partners?

My style of working is extremely collaborative. As such I have a lot of formal project partners: The Software Sustainability Institute, The University of Manchester, UCL, Microsoft Research, Dassault Systèmes, Wolfram Research, Mathworks, The N8 Research Partnership, and NAG.

Sheffield has two EPSRC RSE fellows and we’ve teamed up to form the Sheffield Research Software Engineering group. We’ve only existed for a month! At the moment it’s just us but we have funds to recruit a few more people so watch this space.

Which programming languages and technologies do you regularly use?

I don’t get to choose what languages I use — the researchers I support do that for me. As such, I’m doing a lot of MATLAB, Python and R these days. For compiled languages, I tend to use either C or C++. There’s also some Mathematica and Maple sprinkled here and there.

I help support Sheffield’s High Performance Computing Service so also do a reasonable amount of Bash scripting and parallel computing.

Are there any languages/technologies that you used to use a lot but have now moved away from? Why?

I used to use Fortran back in the day but don’t seem to need it much now — it’s been a long time since I did a project with it. A couple of groups are offering to teach ‘Modern Fortran’ for us at Sheffield so perhaps I’ll take another look?

I used to like Perl, and even taught a one-day course on it 10 years ago but I strongly prefer Python and so, it seems, do the researchers I support.

Is there anything on your ‘to-learn’ list?

• Cloud computing: I’ve started doing some small projects using Amazon EC2 and feel very much a newbie at the moment. I can figure out how to do things but am not sure if what I am doing is good practice or not.
• Docker: I understand the basics but am yet to figure out how to use the technology properly for research.
• Julia: I played with it a little a few years ago and really like it. There’s a lot of buzz around the language. No one has come to me with a Julia problem yet but I think it’s just a matter of time.
• Modern OpenMP: I learned OpenMP a long time ago. It’s time for an update.

## Nah…exams aren’t being dumbed down at all!

August 14th, 2009

I have to admit that when I first read this story I had to check that the date wasn’t April 1st but alas it is genuine.  The AQA (Assessment and Qualifications Alliance) in the UK saw fit to award a teenage boy a certificate for using one of his local buses.  The student in question didn’t even know he was being examined until he received his certificate in the post.

Yes, Bobby McHale from Bury in Greater Manchester is now the proud recipient of an AQA qualification called ‘Using Public Transport (Unit 1)’ where he demonstrated the following skills:

1. walk to the local bus stop;
2. stand or sit at the bus stop and wait for the arrival of a public bus;
3. enter the bus in a calm and safe manner;
4. be directed to a downstairs seat by a member of staff;
5. sit on the bus and observe through the windows;
6. wait until the bus has stopped, stand on request and exit the bus;

They don’t just dish this qualification out to anyone though… Bobby’s brother failed!

## 2nd call for Carnival of Maths submissions

December 27th, 2008

If you didn’t get around to submitting anything for the next carnival of maths then now is your chance! I’m delaying the publication until Monday so please post your submissions via the comments section.

## Christmas Silliness

December 18th, 2008

As we get closer and closer to the holiday season I am finding it increasingly difficult to focus on anything productive. Here are a couple of silly links to help bring your productivity levels down to mine.

What do you plan on sending your true love this Christmas? Whether it’s a partridge in a pear tree or five gold rings, the credit crunch affects them all – see how in  True loves to face most expensive Christmas ever.

If programming languages were religions then what would they be?  Apparently Perl would be voodoo and having used Perl for a number of years I would have to agree :)  Comparing Visual Basic to Satanism might be a little harsh though.  I wonder what Mathematica or MATLAB would be?

## New version of the NAG Fortran compiler released

December 4th, 2008

Almost 10 years ago now, I was a teaching assistant for an Introduction to Fortran course at the University of Sheffield. I remember being told by one of the other PhD students that ‘Fortran is a dying language and so we are wasting our time teaching this stuff. No one will be using Fortran in 10 years’ time.’

Fast forward 10 years and Fortran is still going strong in the research and numerical communities.  If you are doing numerics and you want fast code then Fortran is an option that you simply can’t ignore. It is also required for doing things like writing user defined materials in the finite element analysis package, Abaqus.   One of the best Fortran compilers on the market (in my opinion at least) is the NAG Fortran Compiler and their Linux version has recently been updated to version 5.2.

It now includes support for almost all of the features in the Fortran 2003 standard and they have also added quadruple precision support – something that the people I support have wanted for a long time.  A full list of changes can be found on NAG’s website.

Finally a note to self – they have changed the name of the executable from f95 to nagfor – this will generate support queries…you know it will!

## Santa Claus: An Engineer speaks!

December 3rd, 2008

I have no idea where the following joke originated but you can find it all over the web. At least one person disagrees with the calculations ;)

1. No known species of reindeer can fly, but there are 300,000 species of living organisms yet to be classified, and while most of these are insects and germs, this does not completely rule out flying reindeer which only Santa has seen.

2. There are 2 billion children in the world (persons under 18). But since Santa doesn’t (appear to) handle Muslim, Hindu, Jewish, or Buddhist children, that reduces the workload by 85% of the total, leaving 378 million according to the Population Reference Bureau. At an average (census) rate of 3.5 children per household, that’s 91.8 million homes. One presumes there is at least one good child per house.

3. Santa has 31 hours of Christmas to work with, thanks to the different time zones and the rotation of the earth, assuming he travels east to west (which seems logical). This works out to 822.6 visits per second. This is to say that for each Christian household with good children, Santa has 1/1000th of a second to park, hop out of the sleigh, jump down the chimney, fill the stocking, distribute the remaining presents under the tree, eat whatever snacks have been left, get back up the chimney, get back into the sleigh and move on to the next house. Assuming that each of these 91.8 million stops is evenly distributed around the earth, which, of course, we know to be false, but for the purposes of our calculations we will accept, we are now talking about 0.78 miles per household, for a total trip of 75.5 million miles, not counting stops to do what most of us do at least once every 31 hours, plus feeding, etc. That means that Santa’s sleigh is moving at 650 miles per second, 3000 times the speed of sound. For purposes of comparison, the fastest man-made vehicle on earth, the Ulysses space probe, moves at a poky 27.4 miles per second – a conventional reindeer can run, at tops, 15 miles per hour.

4. The payload on the sleigh adds another interesting element. Assuming each child gets nothing more than a medium sized lego set (2 pounds), the sleigh is carrying 321,300 tons, not counting Santa, who is invariably described as overweight. On land, conventional reindeer can pull no more than 300 pounds. Even granting that the “flying reindeer” can pull TEN TIMES the normal amount, we cannot do the job with eight, or even nine. We need 214,200 reindeer. This increases the payload – not even counting the weight of the sleigh – to 353,430 tons. Again, for comparison – this is four times the weight of the Queen Elizabeth.

5. 353,000 tons travelling at 650 miles per second creates enormous air resistance. This will heat the reindeer up in the same fashion as spacecraft re-entering the earth’s atmosphere. The lead pair will absorb 14.3 quintillion joules of energy per second each. In short, they will burst into flames almost instantaneously, exposing the reindeer behind them, and creating a deafening sonic boom in their wake. The entire reindeer team will be vaporized within 4.26 thousandths of a second. Santa, meanwhile, will be subject to centrifugal forces 17,500.06 times greater than gravity. A 250 pound Santa (which seems ludicrously slim) would be pinned to the back of the sleigh by a 4,315,015 pound force.
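For anyone who wants to check the arithmetic, here is a rough Python sketch that recomputes some of the joke’s figures from its own inputs. Two values are assumptions rather than anything the joke states: a ton is taken to be 2,000 pounds, and each reindeer is taken to weigh 300 pounds (a value inferred from the 353,430 ton total).

```python
# Figures quoted in the joke
HOMES = 91_800_000             # households to visit
CHILDREN_PER_HOME = 3.5
HOURS_AVAILABLE = 31
TRIP_MILES = 75_500_000
GIFT_LB = 2                    # one medium sized lego set per child
REINDEER_PULL_LB = 10 * 300    # "ten times the normal amount"
SANTA_LB = 250
G_FORCE = 17_500.06

# Assumptions that are NOT stated in the joke
LB_PER_TON = 2000              # short tons
REINDEER_WEIGHT_LB = 300       # inferred from the 353,430 ton total


def santa_numbers():
    """Recompute the headline figures from the inputs above."""
    seconds = HOURS_AVAILABLE * 3600
    gifts_lb = HOMES * CHILDREN_PER_HOME * GIFT_LB
    reindeer = gifts_lb / REINDEER_PULL_LB
    return {
        "visits_per_second": HOMES / seconds,
        "speed_miles_per_second": TRIP_MILES / seconds,
        "gift_payload_tons": gifts_lb / LB_PER_TON,
        "reindeer_needed": reindeer,
        "total_payload_tons":
            (gifts_lb + reindeer * REINDEER_WEIGHT_LB) / LB_PER_TON,
        "force_on_santa_lb": SANTA_LB * G_FORCE,
    }


if __name__ == "__main__":
    for name, value in santa_numbers().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the visits per second, gift payload, reindeer count and total payload come out as quoted (822.6, 321,300, 214,200 and 353,430). The speed works out at about 676.5 miles per second rather than 650, and the force on Santa at about 4,375,015 pounds rather than 4,315,015, so perhaps the dissenters have a point.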

## Proof I have a brain!

November 27th, 2008

Many people have referred to me as brainless over the years so when offered the opportunity to prove otherwise at the International Mathematica Symposium earlier this year I jumped at the chance.

Apparently the reason for the fuzziness in the image was because I moved around too much during the scan. This could have been fixed but it would have involved bolting my skull into position and, although the crowd were interested in a demonstration, I was not really that up for it.

If I remember the details correctly, this is a low resolution scanner that can be used by surgeons while brain surgery is actually taking place. Although a surgeon would have access to very high resolution scans, taken before surgery begins, these would steadily become inaccurate due to changes made to the brain during surgery.

Thanks to Barrie Stokes for the pictures.

## Nominations for the 2008 Eddies

November 26th, 2008

A friend of mine pointed out that I haven’t yet made any nominations for this year’s Edublog awards so here they are.

My nomination for the Best Resource Sharing Blog 2008 is The Teaching College Math Technology Blog which is written by Maria Andersen.  The amount of technology that can be applied to teaching in mathematics is truly staggering from Mathematica through to technologies such as GraphPad, Windows 7, Interactive Tables and Jing.  Maria not only informs us of these technologies but she actually uses them in her teaching and reports on the results – what works, what doesn’t, what is useful and what could be made better.  With so much cool stuff to play with, it almost makes me wish I was a Maths teacher.

My other nomination is for the Best Group Blog 2008 and the nominee is 360. 360 is an unofficial blog of the Nazareth College Math Department in Rochester, New York and it offers a wide range of interesting mathematical tidbits. I find myself returning to 360 again and again since they manage to simultaneously entertain and inform on the subject of mathematics. For example – in the last month alone they have written articles on (among other things) Copernicus, geometry puzzles, KenKen, Pythagorean Triples and how Fourier transforms were used to answer questions about a song by the Beatles. Quite simply – one of my favourite maths blogs.

## Crayon Physics Deluxe Available for pre-order

November 14th, 2008

Earlier this year I wrote about a wonderful freeware game called Crayon Physics which was a demo version of an in-development game called Crayon Physics Deluxe.  The game’s developer, Petri Purho, wrote the demo in just one week and it caused a small sensation on the internet (and in my office for that matter).

Since then, Petri has been hard at work on the full version of Crayon Physics Deluxe and has been constantly bothered by the game’s fans as to when it will be ready. Petri’s answer is one I approve of – ‘It will be released when it’s done.’ The good news is that Petri feels that it will be done sooner rather than later and has recently set up a site to take pre-orders for the software at the discounted price of $14.95 during November 2008 (the full price will be $20).

Petri – my pre order will be with you at some point over the next few days.  Physics has never looked so fun!