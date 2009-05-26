Did you know that your graphics card is effectively a mini-supercomputer? Your main CPU (Central Processing Unit) probably has 2 processor cores, 4 if you are lucky, but the GPU (Graphics Processing Unit) on a high-end graphics card can have as many as 96 processing cores – which is a lot. Even my laptop’s relatively low-end NVIDIA GeForce 8400MS has 16 ‘stream processors’ according to this Wikipedia page.

The large number of cheap processor cores is the good news. The bad news is that they are not as capable as fully fledged Intel or AMD processor cores since, as you might expect, the cores in your graphics card are rather specialised. They were designed specifically to do the mathematics behind graphics processing and they do this very well indeed but until fairly recently it was rather difficult to get them to do much else.

That hasn’t stopped people from trying though. Some time ago, NVIDIA, the makers of my laptop’s graphics card, released a software library called CUDA which enables C programmers to access the vast computational power locked away in a typical pixel pusher. The results have been nothing short of astonishing. One developer, for example, recently demonstrated how to use CUDA to calculate the properties of the Ising Model (a staple of undergraduate computational physics courses) over 60 times faster than on a single, bog-standard Intel CPU.

If you are impressed with a factor-60 speed-up then the 675 times speed-up reported by Michał Januszewski and Marcin Kostur is really going to knock your socks off. Yep – that’s not a typo. They have written code that can solve certain stochastic differential equations SIX HUNDRED AND SEVENTY FIVE TIMES FASTER than a single, standard CPU core. Your shiny new dual quad-core workstation isn’t looking so clever now, is it? Not bad for technology designed for playing the latest version of Quake on.

This is all well and good but I don’t have the time or the mental stamina to code in C any more. What I want is for all of my favourite Mathematica, MATLAB or Python functions to be CUDA-ised so that I can enjoy a big speed-up at low cost and low programming effort. I’ll take the moon on a stick while I’m at it if you don’t mind.

Well, it seems that some people are doing exactly this. I have just stumbled across a project called GPUmat which claims to offer up to a 40x speed-up with very little effort on the part of the user. One example they give considers the following standard MATLAB code.

    A = single(rand(100)); % A is on CPU memory
    B = single(rand(100)); % B is on CPU memory
    C = A+B;               % executed on CPU
    D = fft(C);            % executed on CPU

To get this running on your graphics card, all you need to do (after you’ve installed the toolbox and CUDA, of course) is:

    A = GPUsingle(rand(100)); % A is on GPU memory
    B = GPUsingle(rand(100)); % B is on GPU memory
    C = A+B;                  % executed on GPU
    D = fft(C);               % executed on GPU

Very nice. I’m not sure which MATLAB functions are supported but I guess it’s all there in the documentation – I just haven’t had time to look through it. I’d love to tell you what sort of speed-up I experienced when I tried it out but, unfortunately, the developers are asking all potential users to register before they get access to the downloads, and that put me off a bit.

It’s all free though so if you’d like to check it out yourself, and you don’t mind the registration, then head over to the developers’ website. I’d love to hear how you get on.
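If you do give it a go, a rough way to put a number on the speed-up is to time the same work on the CPU and on the GPU with MATLAB’s tic and toc. The following is only a sketch, not something I have run against GPUmat. The matrix size n and the repeat count reps are my own choices (a 100 by 100 matrix is probably too small to keep a graphics card busy), and I am assuming that calling single(...) on a GPUsingle copies the result back to CPU memory and so waits for the GPU to finish. Check the GPUmat documentation before trusting that last part.

```matlab
% Rough CPU vs GPU timing sketch (assumes GPUmat is installed and started)
n    = 1024;  % matrix size: my choice, not taken from the GPUmat example
reps = 50;    % repeat the work so that timer noise matters less

% CPU version: plain MATLAB single precision
A = single(rand(n));
B = single(rand(n));
tic;
for k = 1:reps
    C = A + B;   % executed on CPU
    D = fft(C);  % executed on CPU
end
tCPU = toc;

% GPU version: the same operations on GPUsingle variables
Ag = GPUsingle(rand(n));
Bg = GPUsingle(rand(n));
tic;
for k = 1:reps
    Cg = Ag + Bg;  % executed on GPU
    Dg = fft(Cg);  % executed on GPU
end
Dh = single(Dg);   % ASSUMPTION: copies the result back to the CPU and waits for the GPU
tGPU = toc;

fprintf('CPU: %.3f s, GPU: %.3f s, speed-up: %.1fx\n', tCPU, tGPU, tCPU / tGPU);
```

The copy back to the CPU is included in the GPU timing on purpose, since in real use you usually need the answer back in ordinary MATLAB memory. Leave it out and the GPU figure may look better than it deserves.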

PS: Make sure your graphics card is CUDA compatible first though. You’ll waste a lot of time trying out this software if it isn’t!