EPSRC Research Software Engineering Fellow: Mike Croucher
In this article, I find myself in the rather odd position of interviewing myself as part of my series of interviews with the new cohort of EPSRC Research Software Engineering Fellows.
Could you tell us a little about yourself and how you became a Research Software Engineer?
I completed a PhD in theoretical physics in 2005 at The University of Sheffield where my area of research was photonic crystals. The most important thing I learned during my PhD is that I was a lot better at solving computational problems than I was at Physics. In particular, I seemed to be a lot better at solving other people’s problems rather than inventing and solving my own.
This led me to consider a job at The University of Manchester in the centralised Applications Support Team with IT Services. This team looked fantastic! Its role was to support Manchester’s extensive site-licensed application portfolio – MATLAB, Mathematica, Intel Compilers, SPSS, Abaqus…that sort of thing. It included aspects of licensing, sysadmin, installation support, high performance computing, consultancy, teaching – you name it! Sadly, 6 months after I started at Manchester, there was the first of many IT department restructures and the team was disbanded.
The centralised service was devolved into the faculties (It’s centralised again now!) and I ended up in the faculty of Engineering and Physical Sciences. I took the responsibility of supporting a portfolio of applications with me. Broadly speaking, I became ‘the MATLAB guy’ at Manchester but also started extending support into the open source world – Python, R, Octave and so on. This was done organically: I went where the work took me. If lots of academics came with problems in foo, I learned about foo, solved problems in foo and started teaching how to do foo better. If ‘foo’ happened to be distasteful to me – such as VBA – well, tough! To be successful in support, you must do the job that’s put in front of you. It’s a little like an Accident and Emergency department.
Very early on, I learned that the path to a researcher’s heart was speed. They wanted results and they wanted them fast! There was a team at Manchester doing hardcore Fortran/MPI stuff and they had that market sewn up — there was no space for the likes of me there. So, I took advantage of being ‘The MATLAB guy’ and started offering programming services to whoever wanted them for free. I could often achieve 100x speed-ups with less than a day’s work and this made me pretty popular!
I got to see a lot of code and that’s where the problems started. I’d get code with names like phdresults_dec2006_ver12_broken_fixed_FORMIKE.m that wouldn’t run on my machine. I’d learn that my machine was the second machine it had ever been run on and I was the second person to ever see the code. There would be 1000s of lines of code with no version control and no tests which made refactoring scary!
I learned that a huge amount of computational research was being done inefficiently. I feel that I need to be very clear: when I say ‘inefficient’, I’m not referring to Fortran code, say, that’s not making optimal use of the cache, SIMD instructions or has poor scaling over 128 cores. When I was young, this is where I thought the work would be. Sure, there’s some, but that’s not where you can help the most people.
A much more common situation, I find, is a researcher whose workflow includes a huge amount of manual work. PhD students (and in one notable case, a very senior professor!) who manually edit 100s of spreadsheets, for example. A morning’s work spent on some simple automation can completely change their lives!
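As an illustration only, here is a minimal Python sketch of what such automation might look like. It is not code from any of the projects described. It assumes the spreadsheets have been exported as CSV files that all contain a column named `temp_f`, and that the repeated manual edit is adding a Celsius column. The column names, function names and folder layout are all hypothetical.

```python
import csv
from pathlib import Path


def add_celsius_column(csv_path, out_dir):
    """Copy one CSV file into out_dir, adding a 'temp_c' column computed from 'temp_f'.

    'temp_f' and 'temp_c' are hypothetical column names used for illustration.
    out_dir should differ from the input folder so originals are never overwritten.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / Path(csv_path).name
    with open(csv_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=list(reader.fieldnames) + ["temp_c"])
        writer.writeheader()
        for row in reader:
            # Fahrenheit to Celsius, rounded to two decimal places
            row["temp_c"] = round((float(row["temp_f"]) - 32) * 5 / 9, 2)
            writer.writerow(row)
    return out_path


def process_folder(in_dir, out_dir):
    """Apply the same edit to every CSV file in in_dir and return the files written."""
    return [add_celsius_column(p, out_dir) for p in sorted(Path(in_dir).glob("*.csv"))]
```

Calling `process_folder("raw", "clean")` would then apply the same edit to every file in one go, leaving the originals untouched in `raw`.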
These experiences got me interested in how to improve the general level of software engineering practice in research. I became a Software Sustainability Institute fellow in 2013, discovered this huge, amazing community and the rest is history.
You recently left Manchester University to move to Sheffield? What was behind that?
Prof. Neil Lawrence met me in The Sheffield Tap one evening and said ‘How would you like to ditch your commute and change the world?’ He was interested in bringing some of the research software initiatives I’d worked on in Manchester over to Sheffield as part of his Open Data Science initiative.
Changing the world sounded interesting but he had me at ‘ditch the commute’ if I’m being honest. I’ve lived in Sheffield for 20 years and commuted to Manchester by train for 10 of those. On some days, my Twitter feed @walkingrandomly degenerated into little more than rants against various train companies! I needed to stop.
Working with Neil and the University of Sheffield has been an amazing experience. There’s a vibrancy here that’s infectious and a strong desire to do better in Research Software. That Sheffield won 2 of the 7 Research Software Engineering fellowships on offer was like a dream come true. The other Sheffield RSE fellow is Paul Richmond and we’ve joined forces to provide the strongest research software service we can to The University of Sheffield.
What do you think is the role of a Research Software Engineer?
I’m going to lift the answer to this straight out of my fellowship application.
Technological development in software is more like a cliff-face than a ladder – there are many routes to the top, to a solution. Further, the cliff face is dynamic – constantly and quickly changing as new technologies emerge and decline. Determining which technologies to deploy and how best to deploy them is in itself a specialist domain, with many features of traditional research.
Researchers need empowerment and training to give them confidence with the available equipment and the challenges they face. This role, akin to that of an Alpine guide, involves support, guidance, and load carrying. When optimally performed it results in a researcher who knows what challenges they can attack alone, and where they need appropriate support. Guides can help decide whether to exploit well-trodden paths or explore new possibilities as they navigate through this dynamic environment.
These guides are highly trained, technology-centric, research-aware individuals who have a curiosity driven nature dedicated to supporting researchers by forging a research software support career. Such Research Software Engineers (RSEs) guide researchers through the technological landscape and form a human interface between scientist and computer. A well-functioning RSE group will not just add to an organisation’s effectiveness, it will have a multiplicative effect since it will make every individual researcher more effective. It has the potential to improve the quality of research done across all University departments and faculties.
Are there any downsides to being a Research Software Engineer?
Something I’ve learned from conducting these interviews is that there are several different types of ‘Research Software Engineer’. We are not a ‘one size fits all’ community! I think that one thing we all have in common is that we don’t fit the normal ‘money-in, papers-out’ model of many academics. This was brought up in Louise Brown’s interview and it strongly resonates with me. This situation makes it difficult for us to follow an academic-like career path.
It is extremely difficult, for example, to get promoted as a research programmer without attempting to become something you are not. Worse, it’s difficult to simply get a permanent job! Many RSEs are on short-term contracts with low salaries. In short, you get much of the grief of working in academia without any of the benefits. Little wonder, then, that many of the best in the community choose to work in industry.
An alternative path for RSEs is to work in the University IT department. It’s the path that I took, for example. This solves the short-term contract issue but brings with it a whole new set of problems. Many IT managers simply don’t understand the value that an RSE can bring to a University. You can sum the issue up with the observation ‘Academics rarely complain to the head of IT that there’s no one around who can optimise their MATLAB code but they complain very quickly when MATLAB doesn’t work on the University managed desktop’. So, guess what I’d get assigned to?
We’ve established that RSEs aren’t ‘normal academics’ and they aren’t ‘normal IT support’ either so where do we fit? I’m trying to help figure that out and help provide an environment where RSEs can not only exist but thrive.
You’ve recently won an EPSRC RSE Fellowship! Can you give a brief overview of your project?
I aim to improve the research software of the ‘long tail scientist’. This term, attributed to Jim Downing of the Unilever Centre for Molecular Informatics, refers to the large number of small research units that perform a huge amount of research. Often, these small research units only have one or two people in them. They aren’t “big science” but there are LOTS of them!
Much of this research involves the generation of code by relatively untrained and inexperienced programmers. This code can benefit greatly from input by RSEs. An experienced RSE can, with relatively little effort, significantly enhance the quality and efficiency of such code whilst simultaneously providing training for the researcher who wrote it. For examples of what I mean, see my Testimonials page.
I will improve scientific software efficiency, sustainability and reproducibility at the University of Sheffield, by working alongside researchers on their research code in a consultative manner. Rather than working prescriptively, my approach is based on offering and implementing a series of nudges. Nudges are interventions that alter people’s behaviour in a predictable way without forbidding any options. In the context of research software, example nudges might include ‘Learn to write idiomatically in your language of choice – it can lead to faster execution’, ‘See how unit tests allow us to make changes with confidence’ or ‘Using version control, we always know which code produced what result’.
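To make two of these nudges concrete, here is a small hedged Python sketch. It is purely illustrative and not taken from the fellowship application. The function names are hypothetical, and the root-mean-square calculation simply stands in for any piece of research code. The first function is written with an explicit index loop, the second is a more idiomatic rewrite, and the unit tests check that the rewrite still gives the same answer.

```python
import math
import unittest


def rms_loop(values):
    """Root mean square written with an explicit index loop (the 'before' version)."""
    total = 0.0
    for i in range(len(values)):
        total = total + values[i] * values[i]
    return math.sqrt(total / len(values))


def rms_idiomatic(values):
    """The same calculation using a generator expression (the 'after' version)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


class TestRms(unittest.TestCase):
    def test_known_value(self):
        # RMS of [3, 4] is sqrt((9 + 16) / 2) = sqrt(12.5)
        self.assertAlmostEqual(rms_idiomatic([3, 4]), math.sqrt(12.5))

    def test_refactor_matches_original(self):
        # The rewrite must agree with the original on the same data
        data = [0.5, -1.25, 2.0, 7.75]
        self.assertAlmostEqual(rms_idiomatic(data), rms_loop(data))
```

Saved in a file, the tests can be run with `python -m unittest`. If a later change breaks the calculation, a test fails straight away, which is what makes refactoring less scary.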
The gulf between the scientific computing “elite” and those emailing spreadsheets is growing and I aim to close that gap. One researcher I worked with recently said ‘You provide the next step after we’ve been on a Software Carpentry course.’ and I think that describes what I’m trying to do quite well.
How long did it take you to write your Fellowship application? Any other thoughts/advice on the application process?
Writing my fellowship application was one of the most difficult writing exercises I’ve ever undertaken! It took just over a month to write and during that time I did very little else. It took up my days, my evenings, my weekends, my every waking thought. It even consumed my dreams. It was exhausting!
Something that surprised me was the number of people who I needed to help make it happen. Fellowships are often made out to be very individual things but my application involved work by over 40 people! This includes university administrators at all levels, project partners, advisors and mentors. I had to navigate areas of University life and systems that were completely alien to me. There is no way I could have done it alone.
It is essential to get institutional support for your application. At the most basic level, you need a manager who is happy for you to go AWOL for a few weeks. At a higher level, you need to be able to demonstrate to the funding body that your University is fully behind you and your project.
You also need to be emotionally resilient. I poured my heart and soul into my first draft and the feedback from one of my advisors was ‘Well, you solved the blank-page problem.’ That was the only positive thing she had to say! Everything else was a tearing apart. It was brutal! I think I might have cried a little.
Every time I did a rewrite, my mentors found more weaknesses and beat up on me a little. This feedback was essential and made the application so much stronger. As such, I think one piece of advice I’d give is ‘Find mentors you trust who are going to be crushingly hard on you’. It’ll hurt but nowhere near as much as the comments of Reviewer 2 ;)
Who are your project partners?
My style of working is extremely collaborative. As such I have a lot of formal project partners: The Software Sustainability Institute, The University of Manchester, UCL, Microsoft Research, Dassault Systèmes, Wolfram Research, Mathworks, The N8 Research Partnership, Maplesoft and NAG.
Tell me about your RSE group.
Sheffield has two EPSRC RSE fellows and we’ve teamed up to form the Sheffield Research Software Engineering group. We’ve only existed for a month! At the moment it’s just us but we have funds to recruit a few more people so watch this space.
Which programming languages and technologies do you regularly use?
I don’t get to choose what languages I use — the researchers I support do that for me. As such, I’m doing a lot of MATLAB, Python and R these days. For compiled languages, I tend to use either C or C++. There’s also some Mathematica and Maple sprinkled here and there.
I help support Sheffield’s High Performance Computing Service so also do a reasonable amount of Bash scripting and parallel computing.
Are there any languages/technologies that you used to use a lot but have now moved away from? Why?
I used to use Fortran back in the day but don’t seem to need it much now — it’s been a long time since I did a project with it. A couple of groups are offering to teach ‘Modern Fortran’ for us at Sheffield so perhaps I’ll take another look?
I used to like Perl, and even taught a one-day course on it 10 years ago but I strongly prefer Python and so, it seems, do the researchers I support.
Is there anything on your ‘to-learn’ list?
- Cloud computing: I’ve started doing some small projects using Amazon EC2 and feel very much a newbie at the moment. I can figure out how to do things but am not sure if what I am doing is good practice or not.
- Docker: I understand the basics but am yet to figure out how to use the technology properly for research.
- Julia: I played with it a little a few years ago and really like it. There’s a lot of buzz around the language. No one has come to me with a Julia problem yet but I think it’s just a matter of time.
- Modern OpenMP: I learned OpenMP a long time ago. It’s time for an update.