First ever MATLAB-based Software Carpentry boot camp
What is Software Carpentry?
Software Carpentry boot camps aim to ensure that researchers have a working knowledge of several useful technologies from the world of software development. Concepts such as version control, unit testing, task automation and modular programming are bread and butter to full-time software developers but are often completely unknown to researchers, who are nevertheless expected to develop computer code as part of their research output.
An in-depth education in all of these technologies can take a lot of time; time that many researchers simply can’t spare. Fortunately, however, it is possible to learn just enough to completely transform the quality of your workflow in a relatively short amount of time. Software Carpentry boot camps aim to lay the foundations in around two days.
Software Carpentry…but in MATLAB
Traditionally, these boot camps are taught using open-source languages such as Python or R, but many proprietary languages are also used in academia, such as MATLAB, IDL, Mathematica and Maple, to name but a few. In the most recent version of MATLAB, MathWorks introduced a unit testing framework, so now seemed like a good time to try out Software Carpentry in MATLAB.
On Tuesday 14th January I hosted the first ever MATLAB-based Software Carpentry boot camp in conjunction with The Software Sustainability Institute (of which I am a 2013 fellow) and MathWorks. Held at The University of Manchester, this two-day event gave free software development instruction to researchers from a wide variety of disciplines and, if the feedback forms are to be believed, was a resounding success.
Shoaib Sufi and Aleksandra Pawlik of the Software Sustainability Institute taught material on bash shell scripting and git version control respectively, with Ken Deeley of MathWorks providing MATLAB instruction. I, along with MathWorks’ Jos Martin and Juan Martinez, acted as the less-than-glamorous but ever-helpful classroom assistants.
Fun and games with BYOD – Licensing
For the decade or so that I’ve been teaching programming, I’ve done so in fully equipped teaching labs containing row upon row of identical computers, each of which has all the required software pre-installed. For this event, we decided to try a Bring Your Own Device (BYOD) approach, i.e. students brought their own personal laptops and we provided a list of software that needed to be installed beforehand. Since Manchester’s MATLAB site license is network-based only and does not allow students to install on personally owned equipment, MathWorks kindly supplied standalone trial licenses for all course attendees.
BYOD has a number of benefits for the student and, in my mind at least, the most important of these benefits is that students are left with a fully working development environment after the course. This means that they can start applying their new skills immediately after leaving the course which hopefully leads to them being used ‘in anger’ on their research projects. Since the students were only supplied with trial licenses, this was not to be the case with MATLAB. They will have to switch to using their on-campus machines or purchase their own copy of ‘standalone’ MATLAB in order to continue working.
Availability of software is not something that usually concerns an organiser of a Software Carpentry boot camp since the likes of Python and R are available everywhere for free. When using a proprietary language such as MATLAB, however, it’s very much an issue and I’d advise any future organiser of a MATLAB boot camp to consider this carefully before proceeding. Of course, this isn’t just true of MATLAB; it’s true of any proprietary language one may choose. It will also be less of a problem if your institution has an all-you-can-eat, ‘Total Academic Headcount’ unlimited site license.
Fun and games with BYOD – switches and glitches
All three major desktop operating systems were represented in the laptops of the 20 or so students, something that occasionally made for fun times. Here is a list of some of the minor issues that arose over the two days:
- Teaching bash scripting to Windows users always makes me wince a little. Environments such as Cygwin and Git Bash are great and a little bit of bash never hurt anyone, but I can’t help but wonder if we should be teaching a native scripting language such as PowerShell instead. After all, Linux users would be surprised if they came to a scripting seminar and we made them use pash! Of course, if the student ever gets access to an HPC system, it is highly probable that it will be running some variant of Linux and so perhaps everyone should learn at least a little bash.
- One Mac user had some fancy graphical overlay program which jazzed up his desktop. Looked great but it turned out that it sometimes caused MATLAB to crash.
- Some of the MATLAB commands that interacted with the file system worked slightly differently across the three operating systems…something we hadn’t appreciated ahead of time. This caused delays while we figured out a platform-independent way to proceed.
- Some of the Linux machines exposed a bug in TLS (thread local storage) in Intel’s MKL which caused errors in MATLAB.
- MATLAB’s keyboard shortcuts are different across operating systems. It is possible to change the behaviour but MATLABers who’ve been around a while didn’t want to. This sometimes led to confusion if the instructor mentioned an explicit keyboard shortcut that happened to be different on the student’s machine.
None of these issues were particularly major but the need to consider and resolve them did take up the time of instructors and demonstrators. In a lab setting, this extra work would not be necessary. Obviously we’ll fix some of the above issues in the next iteration of the course but I believe that BYOD sessions will always require a higher number of demonstrators/glamorous assistants than more traditional lab sessions simply because of the inevitable variation of software and hardware.
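The post does not say which file-system commands behaved differently, so the following is only a sketch of one common platform-independent habit: building paths with fullfile rather than typing a separator by hand. The folder and file names are hypothetical.

```matlab
% Hypothetical folder and file names, used only for illustration.
dataDir  = fullfile('data', 'raw');
dataFile = fullfile(dataDir, 'results.csv');

% filesep is the separator that fullfile inserted on the current platform.
fprintf('Separator: %s\nPath: %s\n', filesep, dataFile);
```

Because the separator is never hard-coded, the same lines should give a valid relative path on Windows, Mac and Linux.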
I can see my MathWorks friends rolling their eyes already but I have to get this off my chest. Regular readers of this blog will know that I have got some issues with MathWorks’ toolbox system in MATLAB. I *really* wanted a Software Carpentry course that only included pure MATLAB, with no toolboxes at all. In the event, we decided on a set of course materials that required the use of the Statistics Toolbox because it made certain things so much easier.
For example, MATLAB uses NaNs to represent missing data, a design choice that quickly leads to the desire for functions such as nanmean and nanmedian. Unfortunately, despite their simplicity, these functions are in the Statistics Toolbox and so using them leads to less accessible software (since your users need to have that extra toolbox). Of course, you could code up your own versions or use free implementations easily enough, but then you are not using idiomatic MATLAB. All very frustrating. This wasn’t the only reason why we added the Statistics Toolbox to the list of requirements, of course, but hopefully it makes my point.
OK, moan over, I’m done… for now.
The main course page is at http://apawlik.github.io/2014-01-14-manchester/ and the course material for version control using git and shell scripting using bash is already available. We hope to have the MATLAB material (which was developed by MathWorks) available in the near future once we’ve sorted out some legal issues.
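To give a sense of how little code a home-made version needs, here is a minimal sketch of a toolbox-free mean that skips NaNs. It is not the Statistics Toolbox implementation: the name nanMeanVec and the sample data are made up, and it only handles vectors, whereas nanmean also works column-wise on matrices.

```matlab
% Hypothetical toolbox-free stand-in for nanmean (vector input only).
% The name nanMeanVec is invented for this sketch.
nanMeanVec = @(x) mean(x(~isnan(x)));

readings = [2.0, NaN, 4.0, 6.0];   % made-up data with one missing value
avg = nanMeanVec(readings);        % averages 2, 4 and 6, ignoring the NaN
disp(avg)
```

The logical index ~isnan(x) simply drops the missing entries before the ordinary mean is taken.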
The tutorials were very interactive with the instructors demonstrating commands while students followed along on their own machines–barely a slide to be seen, just how I like it. There were regular exercises with a high ratio of demonstrators to students. This was very much a hands-on course which, in my opinion, is the best way to learn programming.
One of the advantages of instructor-led courses is that students can go off-piste if they wish and learn all sorts of extra material. One student, for example, asked senior MathWorks developer Jos Martin for a quick code review of her research simulation at the end of the first day. Listening to Jos’ critique of the code was instructive for several of us. Jos is also the lead developer of MATLAB’s Parallel Computing Toolbox, something I took advantage of by asking him questions relevant to a code-optimization project I’m currently working on.
It’s very difficult to replicate this kind of interaction in online courses!
Useful teaching technology
The instructors used a few tools, not all of them software, that I believe significantly enhanced the teaching experience and I plan to use them in future courses of my own.
- ZoomIt – This is a free presentation tool that allows you to zoom into any area of the screen and subsequently annotate it. This might not sound like much but it can really improve a presentation.
- Etherpad – Etherpad is a web based collaborative document editor which we used to post code snippets, links and anything else that anyone felt was useful on the day. We also used it as a chat room which was sometimes useful.
- Post-it notes – Very low tech but very effective. Every student had red and green sticky notes. No sticky note on the back of a student’s laptop meant they were working and would prefer to be left alone. Green meant that they had completed the exercise and red (or in our case, orange) meant ‘I want help’.
Feedback and the future
Both MathWorks and the Software Sustainability Institute asked students to fill out their own course feedback forms. Since the students had had such a great experience, they were all happy to do so, but whittling those two feedback forms down to one that satisfies both institutions is something we should definitely aim for! The feedback was extremely positive with almost everyone agreeing that they had got a lot out of the course.
Of course, it wasn’t perfect and there are several things we could do to make it better for next time:
- Some of the ‘Fun and Games with BYOD’ described earlier could have been avoided by modifying the course material a little.
- The individual sections on bash, git and MATLAB were great but I felt that they needed to be tied together better. They felt too much like self-contained mini-courses and it wouldn’t take much extra effort on our part to link them together a little more.
- The course was free and places were strictly limited. A couple of people cancelled less than 24 hours before the first day which didn’t give us enough time to offer the places out to others. Worse still, a few more people simply didn’t bother to turn up and didn’t offer any explanation. I find that this often happens with free courses and I’ve yet to figure out a way to improve the situation.
- We need to get all of the course material online!
All told, the course was a great success and I hope to be running more of them in the future. A huge thanks to the instructors, Shoaib Sufi, Aleksandra Pawlik and Ken Deeley, who did the lion’s share of the hard work on the day, but also to MathWorks’ Jos Martin and Juan Martinez, who did a great job of helping out students with the exercises, answering questions and generally ensuring that everything ran as smoothly as possible. I’d also like to thank MathWorks’ Stasi Revel and Tanya Morton, who helped out with sorting out trial licenses, and, finally, thanks to Software Carpentry’s Greg Wilson, who gave support and advice in the weeks leading up to the event.