Search Results

April 28th, 2015

I recently found myself in need of a portable install of the Jupyter notebook which made use of a portable install of R as the compute kernel. When you work in institutions that have locked-down managed Windows desktops, such portable installs can be a life-saver! This is particularly true when you are working with rapidly developing projects such as Jupyter and IRKernel.

It’s not perfect but it works for the fairly modest requirements I had for it. Here are the steps I took to get it working.

Download and install Portable Python

I downloaded Portable Python from and installed into a directory called Portable Python

Update IPython and install the extra modules we need

This version of Portable Python comes with a portable IPython instance but it is too old to support alternative kernels. As such, we need to install a newer version.

Open a cmd.exe command prompt and navigate to Portable Python\App\Scripts.

Enter the command

easy_install ipython.exe

You’ll now find that you can launch the ipython.exe terminal from within this directory:

C:\Users\walkingrandomly\Desktop\Portable Python\App\Scripts>ipython
Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]
Type "copyright", "credits" or "license" for more information.

IPython 3.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: exit()

If you try to launch the notebook, however, you’ll get error messages. This is because we haven’t taken care of all the dependencies. Let’s do that now. Ensuring you are still in the Portable Python\App\Scripts folder, execute the following commands.

easy_install pyzmq
easy_install jinja2
easy_install tornado
easy_install jsonschema

You should now be able to launch the notebook using

ipython notebook

Install portable R and IRKernel

  • I downloaded Portable R 3.2 from and installed into a directory called R-Portable
  • Move this directory into the Portable Python directory. It needs to go inside Portable Python\App (see this discussion to learn how I discovered that this location was the correct one)
  • Launch the Portable R executable which should be at Portable Python\App\R-Portable\R-portable.exe and install the IRKernel packages by doing
install.packages(c("rzmq","repr","IRkernel","IRdisplay"), repos="")

Install additional R packages

The version of Portable R I used didn’t include various necessary packages. Here’s how I fixed that.

  • Launch the Portable R executable which should be at Portable Python\App\R-Portable\R-portable.exe and install the following packages 

Install the R kernel file
Create the directory structure Portable Python\App\share\jupyter\kernels\R_kernel

Create a file called kernel.json that contains the following

{"argv": ["R-Portable/App/R-Portable/bin/i386/R.exe","-e","IRkernel::main()",
 "display_name":"Portable R"

This file needs to go in the R_kernel directory created earlier. Note that the kernel location specified in kernel.json uses Linux style forward slashes in the path rather than the backslashes that Windows users are used to. I found that this was necessary for the kernel to work –it was ignored by the notebook otherwise.

Finishing off

Everything created so far, including R, is in the folder Portable Python 2.7.6

I created a folder called PortableJupyter and put the Portable Python 2.7.6 folder inside it. I also created the folder PortableJupyter\notebooks to allow me to carry my notebooks around with the software that runs them.

There is a bug in Portable Python relating to scripts like IPython.exe that have been installed using easy_install. In short, they stop working if you move the directory they’re installed in – breaking portability somewhat! (Details here)

The workaround is to launch Ipython by running the script Portable Python\App\Scripts\

I didn’t want to bother with that so created a shortcut in my PortableJupyter folder called Launch notebook. The target of this shortcut was the following line

%windir%\system32\cmd.exe /c "cd notebooks && "%CD%/Portable Python\python.exe" "%CD%/Portable Python\App\Scripts\" notebook"

This starts the notebook using the default web browser and puts you in the notebooks directory.

The pay off

My folder looks like this:


If I click on the Launch Notebook shortcut, I get a Jupyter session with 2 kernel options


I can choose the Portable R kernel and start using R in the notebook!


February 26th, 2015

Environment modules are widely used in the High Performance Computing (HPC) world where sysadmins need to install dozens, or maybe hundreds of potentially conflicting applications, libraries and compilers on multi-user machines. The University of Manchester’s Computational Shared Facility (CSF), for example, makes extensive use of environment modules and would be extremely difficult to run without them.

Once the sysadmin has correctly installed an application (MATLAB 2014a say) and set up the corresponding module file, making it available to your shell is as easy as doing

module load apps/binapps/matlab/R2014a

Unloading the module is just as easy

module unload apps/binapps/matlab/R2014a

On a heavily used, multi-user system environment modules are invaluable! Every user can have whatever compilers, libraries and applications they like — they just load and unload whatever they need from the huge selection supported by their ever-friendly sysadmins.

Environment modules on Ubuntu

I needed to install environment modules on a VM running Ubuntu 14.04 for my own use. I found a very nice setup guide at but it didn’t work. On attempting to compile, I got the error message

cmdModule.c:644:15: error: 'Tcl_Interp' has no member named 'errorLine'

This is a known bug in version 3.2.9c of environment modules and has a work-around.

I also found a set up guide at which had some useful advice on configuration..

Combining information from these sources, I managed to get a working install. Here are the steps I did in full for a clean Ubuntu 14.04 image

#Install the tcl development package
sudo apt-get install tcl-dev

#Make the directories where my modules and packages are going to live
sudo mkdir /opt/modules
sudo mkdir /opt/packages

#Get the source code. This was the most up to date version as of 25th Feb 2015

#unpack and enter source directory
tar xvzf modules-3.2.9c.tar.gz
cd modules-3.2.9

#Configure using the workaround and selecting my module folder 
CPPFLAGS="-DUSE_INTERP_ERRORLINE" ./configure --with-module-path=/opt/modules/

#make and install
sudo make install

#Edit the modulefiles path. Comment out all lines starting /usr so that only /opt/modules is used
sudo sed -i 's~^/usr~#/usr~' /usr/local/Modules/3.2.9/init/.modulespath

#Configure the shell to use modules
sudo tee /etc/profile.d/ > /dev/null << 'EOF'
# system-wide profile.modules #
# Initialize modules for all sh-derivative shells #
trap "" 1 2 3


case "$0" in
-bash|bash|*/bash) . $MODULES/init/bash ;;
-ksh|ksh|*/ksh) . $MODULES/init/ksh ;;
-sh|sh|*/sh) . $MODULES/init/sh ;;
*) . $MODULES/init/sh ;; # default for scripts

trap - 1 2 3

#Add modules to your .bashrc file
echo '#For modules' >> ~/.bashrc
echo '. /etc/profile.d/' >> ~/.bashrc

That takes care of the basic setup but modules is pretty useless at this stage. To make it useful, you need to install some extra software and the corresponding module file.

Installing a module file for Anaconda Python 2.1
This is a really simple example of how to set up a basic module file

I downloaded and installed Anaconda Python 2.1 to /opt/packages and created a file called anaconda2.1 in /opt/modules containing the following

proc ModulesHelp { } {
global dotversion
puts stderr "\tAnaconda Python 2.1 providing Python 2.7.8"
module-whatis "Anaconda Python 2.1"
prepend-path PATH /opt/packages/anaconda/bin

Now, when I do the command

module avail

I get

-------------------------- /opt/modules/ ---------------------------

I can load my anaconda2.1 module with the command

module load anaconda2.1

Now, when I type python at the command prompt, I’ll be using Anaconda’s python rather than the system python. Once I’m done, I can unload with

module unload anaconda2.1

This example is so trivial it’s almost not worth it — modules really come into their own when you need to support loads of compilers and corresponding libraries. There’s an example using gcc at

February 2nd, 2015

I’ve been working at The University of Manchester for almost a decade and will be leaving in just less than 3 weeks time! A huge part of my job was to support a major subset of Manchester’s site licensed application software portfolio so naturally I’ve made use of a lot of it over the years. As of February 20th, I will no longer be entitled to use any of it!

This article is the first in a series where I’ll look at some of the software that’s become important to me and what my options are on leaving Manchester.

Here, I consider Mathematica – a computer algebra system and technical computing environment that I’m very fond of. I’ve been a Mathematica user for over 15 years and yet, suddenly, I find myself license-less! So much code, so much time invested! What to do?

Options for all use cases

  • Before leaving University, contact the administrator of your site license. It could be that you are entitled to a discount on buying one of the various licenses on offer.
  • Use the CDF Player - With this free tool, You’ll be able to look at and interact (at least partially) with Mathematica notebooks.
  • Re-write all code to use something else. Which language to use is open to massive debate but the closest open source systems to Mathematica’s notebook-like interface are Jupyter (previously IPython) and Sage. The languages are, of course, rather different though!

Hobbyist use

General mucking around!

  • Buy the home edition - The home edition of Mathematica can be used for  non-professional and non-academic purposes and, at the time of writing, costs £195 as a one-off cost or £95 per year.
  • Use Mathematica online: Home – Same rules as the home edition above but it’s a cloud-based, online version. Currently costs £95 per year.
  • Buy a Raspberry Pi –  The Raspberry Pi comes with a free version of Mathematica! This means that you can buy an entire computer AND a copy of Mathematica for less than the standard home-use license. I had a play with Mathematica on the Raspberry Pi just over a year ago and it was very nice. Now that the faster, more powerful Raspberry Pi 2 has been released this option is even more compelling!

Academic use

If you want to use Mathematica in an academic environment that doesn’t have a site license, you’ll need to purchase an individual academic license. At the time of writing, that will cost £860 + VAT.

Professional use

There are various grades of professional license and the cost varies according to how many compute kernels you need or Wolfram Alpha API calls you want to make. Current prices start at £2,035 +VAT



January 15th, 2015

I recently had the good fortune to be involved in the creation of a European H2020 grant proposal called OpenDreamKit along with an international team from 15 institutions. My own contributions to this proposal were extremely modest and it was my first ever experience of being directly involved in an academic grant proposal. It’s the very first thing I’ve been involved with as part of my new appointment at The University of Sheffield.

Quoting from the proposal:

OpenDreamKit will deliver a flexible toolkit enabling research groups to set up Virtual Research Environments, customised to meet the varied needs of research projects in pure mathematics and applications and supporting the full research life-cycle from exploration, through proof and publication, to archival and sharing of data and code.

One of the many things that’s so great about this proposal is how it was written. Co-ordinated by Nicolas Thiéry, 33 contributors wrote it in LaTeX with version control provided by git and github. The video below, produced using gource,  is a visualation of the github repo over time and shows how we all danced around and with each other. My new manager, Neil Lawrence, who was much more deeply involved than I has good things to say about the process too.

The proposal was submitted yesterday after a lot of hard work and, as Nicholas Thiery commented in one of his emails to the group, is “Open from start to end :-)”

The Sage Facebook page summed up my thoughts about this project perfectly: “See the collaboration behind the *proposal*, and imagine the collaboration in the software!”

November 12th, 2014

I follow a lot of software developers on twitter so I get to see a lot of opinions and comments about git.  Here are a few of my recent favourites

Opinions on git
I find that git is a technology that polarizes people. They either love it or they hate it…often both at the same time.


Git Tutorials
I find that I learn something new every time I read a different tutorial

Git tips and tricks
No matter how long you’ve used git, you can always learn a new trick or two.

August 15th, 2014

Welcome to the rather delayed Carnival of Mathematics for July. This is the 113th edition of the mathematical blogging tradition that’s organised by Katie and company over at The Aperiodical.

Number Trivia and a challenge

Long held tradition dictates that I find something interesting about this month’s edition number – 113 – and I turn to Number Gossip for help. It comes up with 3 very nice little prime tidbits:

  • 113 is the smallest three-digit permutable prime
  • 113 is the smallest three-digit Unholey prime: such primes do not have holes in their digits
  • 113 is the smallest three-digit prime whose product and sum of digits is prime

Walking Randomly Challenge: Prove the above using the programming language of your choice and post in the comments section. Here’s a demonstration of the first statement using Mathematica.

(*Returns true if num is a permutable prime*)
permutePrimeQ[num_] := 
 AllTrue[PrimeQ[Map[FromDigits, Permutations[IntegerDigits[num]]]], 

(*prints all permutable primes from 1 to 1000*)
 If[permutePrimeQ[x], Print[x]]
 , {x, 1, 1000}


On with the show

Mathematical software blogs

A math carnival wouldn’t be a WalkingRandomly math carnival without some focus on mathematical software blogs. Loren of The Art of MATLAB brings us Analyzing Fitness Data from Wearable Devices in MATLAB which is a guest post by Toshi Takeuchi. The new computational programming language on the block is Julia and the Julia blog contains videos and code on mathematical Optimization in Julia. The Wolfram Blog has a video on how to Create Escher-Inspired art with Mathematica (Don’t forget that Mathematica can be had for free for the Raspberry Pi).

Sage is a free, open-source alternative to Mathematica, Maple and MATLAB and the SageMath blog recently published SageMathCloud — history and status. The Numerical Algorithms Blog, on the other hand, brings us Testing Matrix Functions Using Identities. Efficient linear algebra routines form one of the cornerstones of modern scientific computing and July saw the publication of a tutorial on how to write your own, super-fast Matrix-Matrix Multiply routine.

Stamps, Making Change and Dealing Cards

When was the last time you used a postage stamp? Even if it was a long time ago, you may have held in your hands a strip of stamps. Maybe you have even tried to fold it into a stack, one stamp wide, so that the strip was easier to store. Have you ever wondered how many ways there are to do so? This post reviews a recent research survey about the topic.

How many ways can you make change for a dollar? This post gives a Lisp program that solves the problem and an analytical solution based on generating functions.

Over at The Aperiodical, home of the math carnival, Dave Wilding has been Discovering Integer Sequences by dealing cards.

Ninjas, Lord Voldemort and Hairy Hay Balls

Colin Beveridge has written a delightful follow-up to Pat Ballew’s post which featured in CoM112 - Trigonometric Trick Secrets of the Mathematica Ninja.

Ben of Math with Bad Drawings fame has written a highly readable rant about a bit of syllabus design which will resonate with anyone teaching (or learning) mathematics in The Voldemort of Calculus Classes.

New math blogger, Grace, recently had an experience in which her math education intersected with her everyday life in Hay Ball Meets the Hairy Ball Theorem.

Tweeting bots, 3D Printed Geometry and Hair ties

(/begin shameless plug) I spend a lot of time on twitter posting as @walkingrandomly (/end shameless plug) and have discovered quite a few mathematical twitter bots in my time.  Evelyn Lamb has discovered many many more and has posted reviews of them all over at Roots of Unity.

James Tanton posts fun problems on twitter all the time. One particular problem caught the attention of Mike Lawler because of how 3D printing could help younger children see the geometry.

If you’ve ever wondered what a 120-cell would look like if it were made out of hair ties, wonder no longer because Andrea Hawksley has made one – Hair Tie 120-Cell (By the way, hold your mouse of the title of her blog – it’s cool!)

More Ninjas, Dots and Mandelbrots.

A round-up of fun stuff from Colin Beveridge’s “Flying Colours Maths” – a kind of @icecolbeveridge carnival for Mathematics Carnival.

Have you ever wondered what mathematics is behind those pretty Mandelbrot posters that are all over the place? Find out over at Grey Matters: Blog – Mandelbrot Set: What Exactly Are We Looking At, Anyway?

Did you know that dots have power? To see how much power, check out Keith Devlin’s article The Power of Dots.

Football, drugs and underwear

Since I’m not a big fan of football at the best of times, the football world cup is a distant memory for me (England’s dismal performance didn’t help much either).  Fortunately, there’s more than one way to enjoy a world cup and Maxwell’s Demon and The Guardian helped me enjoy it in a data science way: How shocking was Brazil’s 7-1 defeat, mathematically speaking? and Data Visualization, From The World Cup To Drugs In Arkansas.

At this time of year, many of us turn our thoughts to vacations. If you are a math geek, you owe it to yourself to optimise your underwear and pack like a nerd.

Puzzles, Games and playing like a mathmo

When I’m on vacation, I often take a notebook with me so I can do a little mathematics during downtime. My wife and friends find this extremely odd behaviour because mathematics looks like hard work to short, they don’t know how to play like a mathematician.

My job at The University of Manchester is wonderful because it often feels like I am being paid to solve puzzles. While on vacation, however, I have to find some unpaid puzzles to solve and this concentric circles puzzle is an example of one thats fun to solve.

I think that one of the best ways to learn is through play and games and, in a new post over at Math Frolic, a Li’l Game From Martin Gardner introduces mathematical “isomorphism.”

Cartoons and Limits

Mathematics is everywhere, it’s even in the voice bubbles used in web comics!  As an added bonus, this blog post contains a little Python programming too.

Next up, we have Part 2 of Bressoud’s masterful investigation (Part I featured in Carnival 112) of how students understand limits.

Resources, Exams and Books

Colleen Young brings us a great set of Standard Form Resources.

Patrick Honner continues his long-running evaluation of New York State math Regents exams, the high school required exams there. In this post, he looks at a multiple choice question that asks the student to identify the graph of an “absolute value equation”.

The Maths Book Club gives details of their most anticipated maths books for the rest of 2014.

Bad Mnemonics and a Dislike of Mathematics 

Andrea Hawklsey muses on why some people dislike mathematics when they have interests that suggest they should.  Perhaps it has something to do with poorly executed mnemonics when students are taught mathematics, or perhaps its just because they had dull math teachers? Most of my math teachers were awesome and I’ve always felt that this was a major factor in me enjoying the subject.

Skirts, Snow globes and Mathematical Mind Hacking 

This edition of the carnival is in danger of becoming the Carnival of Andrea Hawksley since so many of her great posts were submitted! In one of her July posts, she manages to combine fashion and hyperbolic geometry – which is quite a feat!

The author of cavmaths has been musing over the dynamics governing snow globes. Can anyone help out?

Hacking your mind sounds like it might be dangerous but it turns out that it’s really quite safe. Head over to Moebius Noodles to see an example of a mathematical mind hack.

That’s all folks

Thanks to Katie for inviting me to host this month’s carnival, thanks to everyone for submitting so many great articles and thanks to you for reading. Carnival #114 will be hosted over at SquareCirclez.

July 17th, 2014

Update: September 2014 – The notes in this blog post have been uploaded to github: The blog post will be kept as-is for posterity reasons. For the most up to date version of the notes, see the github version.

Some time in 2013, I helped out at a Software Carpentry event at The University of Bath.  As with most software carpentry boot camps, one of the topics covered was shell scripting and the scripting language of choice was bash.  As I wandered around the room, I asked the delegates which operating system they use for the majority of their research and the most popular answer, by far, was Windows.

This led me to wonder if we should teach using a native Windows solution rather than relying on bash?

A few years ago, this would be an insane proposition since the Windows command shell is very weak compared to bash.  PowerShell, on the other hand, is modern, powerful and installed on all modern Windows operating systems by default.

My problem was that I didn’t know PowerShell very well.  So, I took the notes for the 2013 Bath shell scripting session - - and gave myself the exercise of converting them to PowerShell.

I got close to completing this exercise last summer but various things took higher priority and so the project languished.  Rather than sit on the notes any longer, I’ve decided to just publish what I have so far in case they are useful to anyone.

You are free to use them with the following caveats

  • This is not necessarily the right way to teach PowerShell. It is an experiment in converting some classroom-tested Linux based notes to PowerShell.
  • If you use them, attribution would be nice. I am Mike Croucher, my site is Details on how to contact me at
  • I have not yet tested these notes in a classroom situation
  • These notes aren’t finished yet
  • These notes have been developed and tested on Windows 7.  Behaviour may be different using different versions of Windows.
  • These notes are given as they were left sometime in mid 2013. Some things may be out of date.
  • I was learning PowerShell as I developed these notes. As such, I fully expect them to be full of mistakes.  Corrections and improvements would be welcomed.

If anyone is interested in developing these notes into something that’s classroom-ready, contact me.

The old Windows Command Shell

The traditional Windows command shell is a program called cmd.exe which can trace its roots all the way back to the old, pre-Windows DOS prompt.

You can launch this command shell as follows

  • Hold down both the Windows button and the letter R to open the Run prompt
  • Type cmd and press Enter or click OK

Launch Cmd

  • You should see a window similar to the one below

Windows Command Prompt

The Windows command shell hasn’t changed significantly for over twenty years and is relatively feature poor compared to more modern shells. For this reason, it is recommended that you use Windows PowerShell instead. Mention of cmd.exe is only included here since, despite its deficiencies, it is still widely in use


To launch PowerShell:

  • Hold down both the Windows button and the letter R to open the Run prompt
  • Type powershell and press Enter or click OK

Launch PowerShell

  • You should see a window similar to the one below

PowerShell prompt

Note that although the header of the above window mentions v1.0, it could be a screenshot from either version 1.0 or version 2.0. This is a well-known bug. If you are using Windows 7 you will have version 2 at the minimum.

PowerShell versions

At the time of writing, PowerShell is at version 3. Ideally, you should at least have version 2.0 installed. To check version:


Major  Minor  Build  Revision
-----  -----  -----  --------
3      0      -1     -1

If this variable does not exist, you are probably using version 1.0 and should upgrade.

Version 3.0 is available at


# This is a comment in Powershell. It is not executed


Users of Bash will feel right at home at first since PowerShell appears to have the same set of commands

pwd                         #Path to current folder
ls                          #List directory
ls *.txt                    #Wild Card
ls *_hai*
ls -R                       #Recursive folder listing
ls .                        #List current folder
ls ..                       #List Parent folder
cd ..                       #Change current folder to parent. (Move up a folder)
cd ~                        #Change current folder to your user directory.
mkdir myfolder              #Create a folder
mkdir ~/myfolder    
mv myfolder new_myfolder    #rename myfolder to new_myfolder
rm -r new_myfolder          #Delete new_myfolder if its empty


cat file                    # View file
more file                   # Page through file
cat file | select -first 3  # first N lines
cat file | select -last 2   # Last N lines
cp file1 file2              # Copy
cp *.txt directory
rm file.txt                 # Delete - no recycle bin.
rm -r directory             # Recurse

Different command types in PowerShell: Aliases, Functions and Cmdlets

Many of the PowerShell ‘commands’ we’ve used so far are actually aliases to Powershell Cmdlets which have a Verb-Noun naming convention. We can discover what each command is an alias of using the get-alias cmdlet.

PS > get-alias ls

CommandType     Name                                                Definition
-----------     ----                                                ----------
Alias           ls                                                  Get-ChildItem

This shows that ls is an alias for the Cmdlet Get-ChildItem

A list of aliases for common Bash commands:

  • cat (Get-Content)
  • cd (Set-Location)
  • ls (Get-ChildItem)
  • pwd (Get-Location)

One reason why aliases were created is to make PowerShell a more familiar environment for users of other shells such as the old Windows cmd.exe or Linux’s Bash environment and also to save on typing.

You can get a list of all aliases using get-alias on its own.

PS > get-alias

Finally, here’s how you get all of the aliases for the Get-ChildItem cmdlet.

get-alias | where-object {$_.Definition -match "Get-Childitem"}

For more details on Powershell aliases, see Microsoft’s documentation at

What type of command is mkdir?

The mkdir command looks like it might be an alias as well since it doesn’t have the verb-noun naming convention of Cmdlets. Let’s try to see which Cmdlet it might be an alias of:

PS > get-alias mkdir

Get-Alias : This command cannot find a matching alias because alias with name 'mkdir' do not exist. 
At line:1 char:6
+ alias <<<<  mkdir
    + CategoryInfo          : ObjectNotFound: (mkdir:String) [Get-Alias], ItemNotFoundException
    + FullyQualifiedErrorId : ItemNotFoundException,Microsoft.PowerShell.Commands.GetAliasCommand

It turns out that mkdir isn’t an alias at all but is actually yet another PowerShell command type, a function. We can see this by using the get-command Cmdlet

PS > get-command mkdir
CommandType     Name                                                Definition
-----------     ----                                                ----------
Function        mkdir                                               ...
Application     mkdir.exe                                           C:\Program Files (x86)\Git\bin\mkdir.exe

Now we can clearly see that mkdir is a PowerShell function. The mkdir.exe is an Application which you’ll only see if you installed git for windows as I have.


A Cmdlet (pronounced ‘command-let’) is a .NET class but you don’t need to worry abut what this means until you get into advanced PowerShell usage. Just think of Cmdlets as the base type of PowerShell command. They are always named according to the convention verb-noun; for example Set-Location and Get-ChildItem.

Listing all Cmdlets

The following lists all Cmdlets


You can pipe this list to a pager

Get-Command | more

Getting help

You can get help on any PowerShell command using the -? switch. For example

ls -?

When you do this, you’ll get help for the Get-ChildItem Cmdlet which would be confusing if you didn’t know that ls is actually an alias for Get-ChildItem


Up arrow browses previous commands.

By default, PowerShell version 2 remembers the last 64 commands whereas PowerShell version 3 remembers 4096. This number is controlled by the $MaximumHistoryCount variable

PS > $MaximumHistoryCount           #Display the current value
PS > $MaximumHistoryCount=150       #Change it to 150
PS > history                        #Display recent history using the alias version of the command
PS > get-history                    #Display recent history using the Cmdlet direct

Although it remembers more, PowerShell only shows the last 32 commands by default. To see a different number, use the count switch

PS > get-history -count 50

To run the Nth command in the history use Invoke-History

PS > invoke-history 7

Word count (and more) using Measure-Object

Linux has a command called wc that counts the number of lines and words in a file. Powershell has no such command but we can do something similar with the Measure-Object Cmdlet.

Say we want to count the number of lines, words and characters in the file foo.txt. The first step is to get the content of the file

get-content foo.txt                 # gets the content of foo.txt

Next, we pipe the result of the get-content Cmdlet to Measure-Object, requesting lines, words and characters

get-content foo.txt | measure-object -line -character -word

The measure-object Cmdlet can also count files

ls *.txt | measure-object           #Counts number of .txt files in the current folder

When you execute the above command, a table of results will be returned:

Count    : 3
Average  :
Sum      :
Maximum  :
Minimum  :
Property :

This is because the measure-object Cmdlet, like all PowerShell Cmdlets, actually returns an object and the above table is the textual representation of that object.

The fields in this table hint that measure-object can do a lot more than simply count things. For example, here we find some statistics concerning the file lengths found by the ls *.txt command

ls *.txt | measure-object -property length -minimum -maximum -sum -average

You may wonder exactly what type of object has been returned from measure-object and we can discover this by running the gettype() method of the returned object

(ls *.txt | measure-object).gettype()

Request just the name as follows

(ls *.txt | measure-object).gettype().Name


To find out what properties an object has, pass it to the get-member Cmdlet

#Return all member types
ls *.txt | get-member

#Return only Properties
ls *.txt | get-member -membertype property

Sometimes, you’ll want to simply return the numerical value of an object’s property and you do this using the select-object Cmdlet. Here we ask for just the Count property of the GenericMeasureInfo object returned by measure-object.

#Counts the number of *.txt files and returns just the numerical result
ls *.txt | measure-object | select-object -expand Count

Searching within files

The Unix world has grep, PowerShell has Select String.  Try running the following on haiku.txt

Select-String the haiku.txt                             #Case insensitive by default, unlike grep
Select-String the haiku.txt -CaseSensitive              #Behaves more like grep
Select-String day haiku.txt -CaseSensitive
Select-String is haiku.txt -CaseSensitive
Select-String 'it is' haiku.txt -Casesensitive

There is no direct equivalent to grep’s -w switch.

grep -w is haiku.txt            #exact match

However, you can get the same behaviour using the word boundary anchors, \b

Select-String \bis\b haiku.txt -casesensitive

Grep has a -v switch that shows all lines that do not match a pattern. Select-String makes use of the -notmatch switch.

BASH: grep -v "is" haiku.txxt
PS: select-string -notmatch "is" haiku.txt -CaseSensitive

Grep has an -r switch which stands for ‘recursive’. The following will search through all files and subfolders of your current directory, looking for files that contain is

grep -r is *

Select-String has no direct equivalent to this. However, you can do the same thing by using get-childitem to get the list of files, piping the output to select-string

get-childitem * -recurse | select-string is

One difference between grep and Select-String is that the latter includes the filename and line number of each match.

grep the haiku.txt

Is not the true Tao, until
and the presence of absence:

Select-String the haiku.txt -CaseSensitive

haiku.txt:2:Is not the true Tao, until
haiku.txt:6:and the presence of absence:  

To get the grep-like output, use the following

Select-String the haiku.txt -CaseSensitive | ForEach-Object {$_.Line}

Is not the true Tao, until
and the presence of absence:

To understand how this works, you first have to know that Select-String returns an array of MatchInfo objects when there is more than one match. To demonstrate this:

$mymatches = Select-String the haiku.txt -CaseSensitive  #Put all matches in the variable 'mymatches'
$mymatches -is [Array]          #query if 'match' is an array


So, mymatches is an array. We can see how many elements it has using the array’s Count property



The type of elements in PowerShell arrays don’t necessarily have to be the same. In this case, however, they are.


both of these give the output

IsPublic IsSerial Name                                     BaseType
-------- -------- ----                                     --------
True     False    MatchInfo                                System.Object

If all you wanted was the name of the first object type, you’d do



Alternatively, we could have asked for each element’s type using the For-Each-Object Cmdlet to loop over every object in the array.

$mymatches | Foreach-Object {$_.gettype().Name}

Where $_ is a special variable that effectively means ‘current object’ or ‘The object currently being considered by Foreach-Object’ if you want to be more verbose.

So, we know that we have an array of 2 MatchInfo objects in our variable mymatches. What does this mean? What properties do MatchInfo objects have? We can find out by piping one of them to the Get-Member Cmdlet.

$mymatches[0] | Get-Member

   TypeName: Microsoft.PowerShell.Commands.MatchInfo

Name         MemberType Definition
----         ---------- ----------  
Equals       Method     bool Equals(System.Object obj)
GetHashCode  Method     int GetHashCode()
GetType      Method     type GetType()
RelativePath Method     string RelativePath(string directory)
ToString     Method     string ToString(), string ToString(string directory)
Context      Property   Microsoft.PowerShell.Commands.MatchInfoContext Context {get;se
Filename     Property   System.String Filename {get;}
IgnoreCase   Property   System.Boolean IgnoreCase {get;set;}
Line         Property   System.String Line {get;set;}   
LineNumber   Property   System.Int32 LineNumber {get;set;}
Matches      Property   System.Text.RegularExpressions.Match[] Matches {get;set;}
Path         Property   System.String Path {get;set;}
Pattern      Property   System.String Pattern {get;set;}    

Now we can see that each MatchInfo object has a Line property and it’s reasonable to guess that this contains a Line containing a match. Taking a look:


Is not the true Tao, until

Bringing together everything we’ve seen so far, we pull out the Line property of each element in the array as follows

$mymatches | Foreach-Object {$_.Line}

Alternatively, we can ditch the $mymatches variable and pipe in the output of Select-String directly

Select-String the haiku.txt -CaseSensitive | ForEach-Object {$_.Line}

Is not the true Tao, until
and the presence of absence:

Regular expressions

select-string 's*is' haiku.txt        # * Zero or more of preceding token
select-string 's+is' haiku.txt        # + On or more of preceding token
select-string '.nd'  haiku.txt        # . Any token followed by 'nd'
select-string 'es'   haiku.txt        # matches 'es'
select-string 'es[ht]' haiku.txt      # Exactly one of the characters listed
select-string 'es[^ht]' haiku.txt     # Matches everything except h and t
select-string 'ex[
select-string '\bis\b' haiku.txt      # \b word boundaries

Input and output redirection

> redirects output (AKA standard output). This works in both Bash and Powershell scripts. For example, in Bash we might do

grep -r not * > found_nots.txt

Drawing on what we've learned so far, you might write the PowerShell version of this command as

get-childitem *.txt -recurse | select-string not > found_nots.txt

However, if you do this, you will find that the script will run forever with the hard-disk chugging like crazy. If you've run the above command, CTRL and C will stop it. This is because Powershell is including the output file, found_nots.txt, in its input which leads to an infinite loop. To prevent this, we must explicitly exclude the output file from the get-childitem search

get-childitem *.txt -Exclude 'found_nots.txt' -recurse | select-string not > found_nots.txt

cat found_nots.txt
ls *.txt > txt_files.txt
cat txt_files.txt

In Linux, < redirects input (AKA standard input). This does not work in PowerShell:

cat < haiku.txt
At line:1 char:5
+ cat < haiku.txt
+     ~
The '<' operator is reserved for future use.
    + CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
    + FullyQualifiedErrorId : RedirectionNotSupported

The above is a forced use of < since one could simply do

cat haiku.txt

Recall that cat is an alias for get-content. The use of get-content is an idiom that gets around the lack an < operator. For example, instead of

foo < input.txt

One does

get-content input.txt | foo

Error messages are output on standard error

ls idontexist.txt > output.txt  
cat output.txt                  #output.txt is empty
ls idontexist.txt 2> output.txt               # 2 is standard error
ls haiku.txt 1> output.txt                    # 1 is standard output
ls haiku.txt,test_file.txt 2>&1 > output.txt  # Combine the two streams.

Searching for files

# Find all     
UNIX: find .
PS: get-childitem .  -Recurse 
PS: get-childitem .  -Recurse | foreach-object {$_.FullName}    #To give identical output as `find`   

To save on typing, you can use the alias gci instead of get-childitem

# Directories only
UNIX: find . -type d        
PS2: gci . -recurse | where { $_.PSIsContainer }
PS3: gci -recurse -Directory 

If you have PowerShell 2, you can only use the long winded version. It’s simpler in PowerShell 3. Similarly for searching for files only.

# Files only
UNIX: find . -type f          
PS2: get-childitem -recurse | where { ! $_.PSIsContainer }
PS3: gci -recurse -File

With the Unix find command, you can specify the maximum and minimum search depths. There is no direct equivalent in PowerShell although you could write a function that will do this. Such a function can be found at although I have not tested this!

# Maximum depth of tree
UNIX: find . -maxdepth 2
PS : No direct equivalent
# Minimum depth of tree
UNIX: find . -mindepth 3 
PS : No direct equivalent

You can also filter by name. Confusingly, PowerShell offers two ways of doing this. More details on the differences between these can be found at

One key difference between find and get-childitem is that the latter is case-insenstive whereas find is case sensitive.

# Find by name
UNIX: find . -name '*.txt' 
PS: gci -recurse -include *.txt
PS: gci -recurse -filter *.txt

#Find empty files
UNIX: find . -empty    
PS: gci -recurse | where ($_.Length -eq 0) | Select FullName 

#Create empty file
UNIX: touch emptyfile.txt
PS: new-item emptyfile.txt -type file

Command Substituion

In bash, you can execute a command using backticks and the result is substituted in place. i.e.

foo `bar`

The backticks are used as escape characters in PowerShell so you do the following instead

foo $(bar)

In both cases, the command bar is executed and result is substituted into the call to foo.

Power of the pipe

| is a pipe. Use the pipe to connect the output of one command to the input of another:

Count text files

ls -filter *.txt | measure

ls outputs a list of files, measure inputs a list of files.

echo "Number of .txt files:"; ls -filter *.txt | measure | select -expand count

; equivalent to running two commands on separate lines.

Question: what does this do?

ls -name | select-string s | measure

Answer: counts the number of files with s in their name.

history | select-string 'echo'

Power of well-defined modular components with well-defined interfaces,

  • Bolt together to create powerful computational and data processing workflows.
  • Good design principle applicable to programming – Python modules, C libraries, Java classes – modularity and reuse.
  • “little pieces loosely joined” – history + select-string = function to search for a command.


get-variable                            # See all variables
$MYFILE="data.txt"                      # Need quotes around strings
echo $MYFILE
echo "My file name is $MYFILE"
$num = 1                                #Numbers don't need quotes
$num = $num+1                           #Simple Arithmetic
$TEXT_FILES=get-childitem               Save output of get-childitem

Variables only persist for the duration of the current PowerShell Session

Environment variables

Windows environment variables don’t show up when you execute get-variable; to list them all you do

get-childitem env:                  #Show all Windows Environment variables 
echo $env:PATH                      #Show the contents of PATH
$env:Path = $env:Path + ";C:\YourApp\bin\"  #temporarily add a folder to PATH

This modification to PATH will only last as long as the current session. It is possible to permanently modify the system PATH but this should only be done with extreme care and is not covered here.

PowerShell Profile

The PowerShell profile is a script that is executed whenever you launch a new session. Every user has their own profile. The location of your PowerShell profile is defined by the variable $profile


Open it with

notepad $profile

Add something to it such as

echo "Welcome to PowerShell.  This is a message from your profile"

Restart PowerShell and you should see the message. You can use this profile to customise your PowerShell sessions. For example, if you have installed NotePad++, you might find adding the following function to your PowerShell Profile to be useful.

# Launch notepad++ using the npp command
function npp($file)
if ($file -eq $null)
        & "C:\Program Files (x86)\Notepad++\notepad++.exe";
        & "C:\Program Files (x86)\Notepad++\notepad++.exe" $file;

With this function in your profile, you can open Notepad++ with the command npp or npp(filename.txt)


$num = 1
if($num -eq 1)
  write-host 'num equals 1'

if($word -eq "hello")
  write-host 'The same'

By default, string tests are case insensitive

if($word -eq "HELLO")
  write-host 'The same'

To force them to be case sensitive, add a c to the operator:

if($word -ceq "HELLO")
  write-host 'The Same. This won't be printed'

You can similarly be explicitly case insensitive by adding an i. Although this is the the same behaviour as the undecorated operators and so might seem unnecessary, it shows your intent to the reader.

    if($word -ieq "HELLO")
      write-host 'The same'
Comparison Operators
-eq Equal to
-lt Less than
-gt Greater than
-ge Greater than or equal to
-le Less than or equal to
-ne Not equal to
Logical operators
-not    Not
!       Not
-or     Or
-and    And


PowerShell has several looping constructs. Here, I only consider two.


Allows you to run a block of code a set number of times.

for ($i=1; $i -le 5; $i=$i+1)
   Write-Host $i


Do something with every element of a collection

foreach($item in $(ls *.txt)) {echo $item.Name}

foreach vs foreach-object


Shell scripts

  • Save retyping.
  • PowerShell scripts have the extension .ps1
  • PowerShell scripts must be created with plain text editors such as Notepad or Notepad++. NEVER use Microsoft Word!

Here is a very simple script

notepad protein_filter.ps1                  #Open the file

#A simple protein filter
$DATE = get-date
echo "Processing date: $DATE"

foreach($item in get-childitem *.pdb)
   echo $item.Name

echo "Processing complete"

To run this just type the filename:


If you get an error message, it may be because your execution policy is set not to run scripts locally. Change this with the command

Set-ExecutionPolicy RemoteSigned            #Allow local scripts to run.  Needs to be run with admin privileges
protein_filter.ps1                          #Run the script

Download files via command-line

Primary Care Trust Prescribing Data – April 2011 onwards

$path="$pwd\$file"   #Path needs to be fully qualified. This puts it in the current folder
$url = "$file"
$client = new-object System.Net.WebClient
$client.DownloadFile( $url, $path )


Windows file permissions are rather more complicated than those of Linux but most users won’t need to worry about them in day to day use.

The call operator

It is sometimes convenient to construct strings that contain the full path to a PowerShell script we want to execute. For example:

$fullpath = "$pwd\myscript.ps1"

To actually run the script pointed to by this variable, you need to use the call operator &

& $fullpath                 #Runs myscript.ps1

You also need to do this if you try to call any script where the path contains spaces

"C:\Program Files\myscript.ps1"             #Will just display the string literal
& "C:\Program Files\myscript.ps1"           #Runs the script

Background Jobs

Consider the script counter.ps1

#counter.ps1: A simple, long running job to demonstrate background jobs


while ( $i -lt 200000 )
    echo $i

This counts up to 200000 in user-defined steps.

./counter.ps1   > 1step.txt                 #Counts in steps of 1
./counter.ps1 -step 2 > 2step.txt           #Counts in steps of 2

The script takes quite a while to complete and you cannot do anything else in your PowerShell session while it is working. Using the start-job Cmdlet, we can run counter.ps1 in the background

start-job -scriptblock { C:\Users\walkingrandomly\Dropbox\SSI_Windows\dir_full_of_files\some_directory\counter.ps1 >  C:\Users\walkingrandomly\Dropbox\SSI_Windows\dir_full_of_files\some_directory\outcount1.txt }

Note that you have to use the full path to both the counter.ps1 script and the output file. You can see the status of the job with get-job


Id              Name            State      HasMoreData     Location             Command
--              ----            -----      -----------     --------             -------
1              Job1             Running    True            localhost             C:\Users\walkingrando...

Eventually, your job will complete


Id              Name            State      HasMoreData     Location             Command
--              ----            -----      -----------     --------             -------
1               Job1            Completed  False           localhost             C:\Users\walkingrando...

ls outcount*                    #Ensure that output file has been created
remove-job 1                    #remove remnants of job 1 from the queue
get-job                         #Check that queue is empty

You can run as many simultaneous jobs as you like but it is best not to run too many or your computer will grind to a halt.

Here’s an example that runs 5 counter.ps1 jobs concurrently

#Runs 5 instances of counter.ps1 in parallel

$scriptname     = "counter.ps1"
$outputfileBase = "outfile"
$outputfileExt  = ".txt"

$scriptPath = "$pwd\$scriptname"

for ($i=1; $i -le 5; $i++)
  $outputfilePath = "$pwd\$outputfileBase" + $i + $outputfileExt
  $command = "$scriptPath -step $i `> $outputfilePath"
  $myScriptBlock = [scriptblock]::Create($command)
  start-job -scriptblock $myScriptBlock

Run this as a demonstration

get-job                             #Keep running until all have completed
ls outfile*
more outfile5.txt
more outfile2.txt
$myjob=get-job 2                    #Get info on job Id 2 and store in variable $myjob
$myjob.Command                      #Look at the command that comprised job 2
remove-job *                        #Remove all job remnants from the queue
get-job                             #Should be empty

TODO: Dealing with output, recieve-job

Secure Shell

There is no equivalent to the Linux commands ssh and sftp in PowerShell. The following free programs are recommended


There are no direct PowerShell equivalents to zip, unzip, tar etc. There are write-zip, write-tar and write-gzip cmdlets in the third party, free PowerShell Community Extensions but I have not investigated them yet.


Start-transcript Initializes a transcript file which records all subsequent input/Output. Use the following syntax:

Start-Transcript [[-path] FilePath] [-force] [-noClobber] [-append]

Stop-transcript Stops recording and finalizes the transcript.

start-transcript -path ./diary.txt
echo "Hello dear diary"
cat diary.txt
  • Record commands typed, commands with lots of outputs, trial-and-error when building software.
  • Send exact copy of command and error message to support.
  • Turn into blog or tutorial.

Shell power

(Bentley, Knuth, McIlroy 1986) Programming pearls: a literate program Communications of the ACM, 29(6), pp471-483, June 1986. DOI: [10.1145/5948.315654].

Dr. Drang, More shell, less egg, December 4th, 2011.

Common words problem: read a text file, identify the N most frequently-occurring words, print out a sorted list of the words with their frequencies.

10 plus pages of Pascal … or … 1 line of shell

#BASH version
$ nano
tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq -c | sort -rn | sed ${1}q
$ chmod +x
$ nano <
$ nano < 10

The PowerShell version is more complicated but still very short compared to the 10 pages of Pascal

$text=get-content $filename
$split = foreach($line in $text) {$line.tolower() -split "\W"}
$split | where-object{-not [string]::IsNullorEmpty($_)} |  group -noelement | sort Count -Descending | select -first $num 10

“A wise engineering solution would produce, or better, exploit-reusable parts.” – Doug McIlroy


July 10th, 2014


Mathematica 10 was released yesterday amid the usual marketing storm we’ve come to expect for a new product release from Wolfram Research. Some of the highlights of this marketing information include

Without a doubt, there is a lot of great new functionality in this release and I’ve been fortunate enough to have access to some pre-releases for a while now.  There is no chance that I can compete with the in-depth treatments given by Wolfram Research of the 700+ new functions in Mathematica 10 so I won’t try.  Instead, I’ll hop around some of the new features, settling on those that take my fancy.

Multiple Undo

I’ve been a Mathematica user since version 4 of the product which was released way back in 1999.  For the intervening 15 years, one aspect of Mathematica that always frustrated me was the fact that the undo feature could only undo the most recent action. I, along with many other users, repeatedly asked for multiple undo to be implemented but were bitterly disappointed for release after release.

Few things have united Mathematica users more than the need for multiple undo:

Finally, with version 10, our hopes and dreams have been answered and multiple undo is finally here. To be perfectly honest, THIS is the biggest reason to upgrade to version 10. Everything else is just tasty gravy.

Key new areas of functionality

Most of the things considered in this article are whimsical and only scratch the surface of what’s new in Mathematica 10.  I support Mathematica (and several other products) at the University of Manchester in the UK and so I tend to be interested in things that my users are interested in. Here’s a brief list of new functionality areas that I’ll be shouting about

  • Machine Learning I know several people who are seriously into machine learning but few of them are Mathematica users.  I’d love to know what they make of the things on offer here.
  • New image processing functions The Image Processing Toolbox is one of the most popular MATLAB toolboxes on my site. I wonder if this will help turn MATLAB-heads. I also know people in a visualisation group who may be interested in the new 3D functions on offer.
  • Nonlinear control theoryVarious people in our electrical engineering department are looking at alternatives to MATLAB for control theory. Maple/Maplesim and Scilab/Xcos are the key contenders. SystemModeler is too expensive for us to consider but the amount of control functionality built into Mathematica is useful.

Entities – a new data type for things that Mathematica knows stuff about

One of the new functions I’m excited about is GeoGraphics that pulls down map data from Wolfram’s servers and displays them in the notebook.  Obviously, I did not read the manual and so my first attempt at getting a map of the United Kingdom was

GeoGraphics["United Kingdom"]

What I got was a map of my home town, Sheffield, surrounded by a red cell border indicating an error message

Screen Shot 2014-07-10 at 00.52.53


The error message is “United Kingdom is not a Graphics primitive or directive.”  The practical upshot of this is that GeoGraphics is not built to take strings as arguments.  Fair enough, but why the map of Sheffield?  Well, if you call GeoGraphics[] on its own, the default action is to return a map centred on the GeoLocation found by considering your current IP address and it seems that it also does this if you send something bizarre to GeoGraphics.  In all honesty, I’d have preferred no map and a simple error message.

In order to get what I want, I have to pass an Entity that represents the UK to the GeoGraphics function.  Entities are a new data type in Mathematica 10 and, as far as I can tell, they formally represent ‘things’ that Mathematica knows about. There are several ways to create entities but here I use the new Interpreter function

Screen Shot 2014-07-10 at 01.20.15

From the above, you can see that Entities have a special new StandardForm but their InputForm looks straightforward enough. One thing to bear in mind here is that all of the above functions require a live internet connection in order to work.  For example, on thinking that I had gotten the hang of the Entity syntax, I switched off my internet connection and did

mycity = Entity["City", "Manchester"]

During evaluation of In[92]:= URLFetch::invhttp: Couldn't resolve host name. >>

During evaluation of In[92]:= WolframAlpha::conopen: Using WolframAlpha requires internet connectivity. Please check your network connection. You may need to configure your firewall program or set a proxy in the Internet Connectivity tab of the Preferences dialog. >>

Out[92]= Entity["City", "Manchester"]

So, you need an internet connection even to create entities at this most fundamental level.  Perhaps it’s for validation?  Turning the internet connection back on and re-running the above command removes the error message but the thing that’s returned isn’t in the new StandardForm:

Screen Shot 2014-07-10 at 01.35.08

If I attempt to display a map using the mycity variable, I get the map of Sheffield that I currently associate with something having gone wrong (If I’d tried this out at work,in Manchester, on the other hand, I would think it had worked perfectly!).  So, there is something very wrong with the entity I am using here – It doesn’t look right and it doesn’t work right – clearly that connection to WolframAlpha during its creation was not to do with validation (or if it was, it hasn’t helped).  I turn back to the Interpreter function:

In[102]:= mycity2 = Interpreter["City"]["Manchester"] // InputForm

Out[102]//InputForm:= Entity["City", {"Manchester", "Manchester", "UnitedKingdom"}]

So, clearly my guess at how a City entity should look was completely incorrect.  For now, I think I’m going to avoid creating Entities directly and rely on the various helper functions such as Interpreter.

What are the limitations of knowledge based computation in Mathematica?

All of the computable data resides in the Wolfram Knowledgebase which is a new name for the data store used by Wolfram Alpha, Mathematica and many other Wolfram products. In his recent blog post, Stephen Wolfram says that they’ll soon release the Wolfram Discovery Platform  which will allow large scale access to the Knowledgebase and indicated that ‘basic versions of Mathematica 10 are just set up for small-scale data access.’  I have no idea what this means and what limitations are in place and can’t find anything in the documentation.

Until I understand what limitations there might be, I find myself unwilling to use these data-centric functions for anything important.

IntervalSlider – a new control for Manipulate

I’ll never forget the first time I saw a Mathematica Manipulate – it was at the 2006 International Mathematica Symposium in Avignon when version 6 was still in beta. A Wolfram employee created a fully functional, interactive graphical user interface with just a few lines of code in about 2 minutes –I’d never seen anything like it and was seriously excited about the possibilities it represented.

8 years and 4 Mathematica versions later and we can see just how amazing this interactive functionality turned out to be. It forms the basis of the Wolfram Demonstrations project which currently has 9677 interactive demonstrations covering dozens of fields in engineering, mathematics and science.

Not long after Mathematica introduced Manipulate, the sage team introduced a similar function called interact. The interact function had something that Manipulate did not – an interval slider (see the interaction called ‘Numerical integrals with various rules’ at for an example of it in use). This control allows the user to specify intervals on a single slider and is very useful in certain circumstances.

As of version 10, Mathematica has a similar control called an IntervalSlider.  Here’s some example code of it in use

 pl1 = Plot[Sin[x], {x, -Pi, Pi}];
 pl2 = Plot[Sin[x], {x, range[[1]], range[[2]]}, Filling -> Axis, 
   PlotRange -> {-Pi, Pi}]; 
 inset = Inset["The Integral is approx " <> ToString[
      Integrate[Sin[x], {x, range[[1]], range[[2]]}]
      , {3, 3}, ExponentFunction -> (Null &)]], {2, -0.5}];
 Show[{pl1, pl2}, Epilog -> inset], {{range, {-1.57, 1.57}}, -3.14, 
  3.14, ControlType -> IntervalSlider, Appearance -> "Labeled"}]

and here’s the result:


Associations – A new kind of brackets

Mathematica 10 brings a new fundamental data type to the language, associations. As far as I can tell, these are analogous to dictionaries in Python or Julia since they consist of key,value pairs.  Since Mathematica has already used every bracket type there is, Wolfram Research have had to invent a new one for associations.

Let’s create an association called scores that links 3 people to their test results

In[1]:= scores = <|"Mike" -> 50.2, "Chris" -> 100.00, "Johnathan" -> 62.3|>

Out[1]= <|"Mike" -> 50.2, "Chris" -> 100., "Johnathan" -> 62.3|>

We can see that the Head of the scores object is Association

In[2]:= Head[scores]

Out[2]= Association

We can pull out a value by supplying a key. For example, let’s see what value is associated with “Mike”

In[3]:= scores["Mike"]

Out[3]= 50.2

All of the usual functions you expect to see for dictionary-type objects are there:-

In[4]:= Keys[scores]

Out[4]= {"Mike", "Chris", "Johnathan"}

In[5]:= Values[scores]

Out[5]= {50.2, 100., 62.3}

In[6]:= (*Show that the key "Bob" does not exist in scores*)
KeyExistsQ[scores, "Bob"]

Out[6]= False

If you ask for a key that does not exist this happens:

In[7]:= scores["Bob"]

Out[7]= Missing["KeyAbsent", "Bob"]

There’s a new function called Counts that takes a list and returns an association which counts the unique elements in the list:

In[8]:= Counts[{q, w, e, r, t, y, u, q, w, e}]

Out[8]= <|q -> 2, w -> 2, e -> 2, r -> 1, t -> 1, y -> 1, u -> 1|>

Let’s use this to find something out something interesting, such as the most used words in the classic text, Moby Dick

In[1]:= (*Import Moby Dick from Project gutenberg*)

MobyDick = 
(*Split into words*)
words = StringSplit[MobyDick];
(*Convert all words to lowercase*)

words = Map[ToLowerCase[#] &, words];
(*Create an association of words and corresponding word count*)

wordcounts = Counts[words];
(*Sort the association by key value*)

wordcounts = Sort[wordcounts, Greater];
(*Show the top 10*)
wordcounts[[1 ;; 10]]

Out[6]= <|"the" -> 14413, "of" -> 6668, "and" -> 6309, "a" -> 4658, 
 "to" -> 4595, "in" -> 4116, "that" -> 2759, "his" -> 2485, 
 "it" -> 1776, "with" -> 1750|>

All told, associations are a useful addition to the Mathematica language and I’m happy to see them included.  Many existing functions have been updated to handle Associations making them a fundamental part of the language.

s/Mathematica/Wolfram Language/g

I’ve been programming in Mathematica for well over a decade but the language is no longer called ‘Mathematica’, it’s now called ‘The Wolfram Language.’  I’ll not lie to you, this grates a little but I guess I’ll just have to get used to it.  Flicking through the documentation, it seems that a global search and replace has happened and almost every occurrence of ‘Mathematica’ has been changed to ‘The Wolfram Language’

This is part of a huge marketing exercise for Wolfram Research and I guess that part of the reason for doing it is to shift the emphasis away from mathematics to general purpose programming.  I wonder if this marketing push will increase the popularity of The Wolfram Language as measured by the TIOBE index? Neither ‘Mathematica’ or ‘The Wolfram Language’ is listed in the top 100 and last time I saw more detailed results had it at number 128.

Fractal exploration

One of Mathematica’s competitors, Maple, had a new release recently which saw the inclusion of a set of fractal exploration functions. Although I found this a fun and interesting addition to the product, I did think it rather an odd thing to do.  After all, if any software vendor is stuck for functionality to implement, there is a whole host of things to do that rank higher in most user’s list of priorities than a function that plots a standard fractal.

It seems, however, that both Maplesoft and Wolfram Research have seen a market for such functionality.  Mathematica 10 comes with a set of functions for exploring the Mandelbrot and Julia sets. The Mandelbrot set alone accounts for at least 5 of Mathematica 10′s 700 new functions:- MandelbrotSetBoettcherMandelbrotSetDistanceMandelbrotSetIterationCount, MandelbrotSetMemberQ and MandelbrotSetPlot.




I found this more fun than is reasonable!  Mathematica can generate and recognize bar codes and QR codes in various formats.  For example

BarcodeImage["", "QR"]


Scanning the result using my mobile phone brings me right back home :)

Unit Testing

A decent unit testing framework is essential to anyone who’s planning to do serious software development. Python has had one for years, MATLAB got one in 2013a and now Mathematica has one.  This is good news!  I’ve not had chance to look at it in any detail, however. For now, I’ll simply nod in approval and send you to the documentation. Opinions welcomed.

Disappointments in Mathematica 10

There’s a lot to like in Mathematica 10 but there’s also several aspects that disappointed me

No update to RLink

Version 9 of Mathematica included integration with R which excited quite a few people I work with. Sadly, it seems that there has been no work on RLink at all between version 9 and 10.  Issues include:

  • The version of R bundled with RLink is stuck at 2.14.0 which is almost 3 years old. On Mac and Linux, it is not possible to use your own installation of R so we really are stuck with 2.14. On Windows, it is possible to use your own installation of R but CHECK THAT version 3 issue has been fixed
  • It is only possible to install extra R packages on Windows. Mac and Linux users are stuck with just base R.

This lack of work on RLink really is a shame since the original release was a very nice piece of work.

If the combination of R and notebook environment is something that interests you, I think that the current best solution is to use the R magics from within the IPython notebook.

No update to CUDA/OpenCL functions

Mathematica introduced OpenCL and CUDA functionality back in version 8 but very little appears to have been done in this area since. In contrast, MATLAB has improved on its CUDA functionality (it has never supported OpenCL) every release since its introduction in 2010b and is now superb!

Accelerating computations using GPUs is a big deal at the University of Manchester (my employer) which has a GPU-club made up of around 250 researchers. Sadly, I’ll have nothing to report at the next meeting as far as Mathematica is concerned.

FinancialData is broken (and this worries me more than you might expect)

I wrote some code a while ago that used the FinancialData function and it suddenly stopped working because of some issue with the underlying data source. In short, this happens:

In[12]:= FinancialData["^FTAS", "Members"]

Out[12]= Missing["NotAvailable"]

This wouldn’t be so bad if it were not for the fact that an example given in Mathematica’s own documentation fails in exactly the same way! The documentation in both version 9 and 10 give this example:

In[1]:= FinancialData["^DJI", "Members"]

Out[1]= {"AA", "AXP", "BA", "BAC", "CAT", "CSCO", "CVX", "DD", "DIS", \
"GE", "HD", "HPQ", "IBM", "INTC", "JNJ", "JPM", "KFT", "KO", "MCD", \
"MMM", "MRK", "MSFT", "PFE", "PG", "T", "TRV", "UTX", "VZ", "WMT", \

but what you actually get is

In[1]:= FinancialData["^DJI", "Members"]

Out[1]= Missing["NotAvailable"]

For me, the implications of this bug are far more reaching than a few broken examples.  Wolfram Research are making a big deal of the fact that Mathematica gives you access to computable data sets, data sets that you can just use in your code and not worry about the details.

Well, I did just as they suggest, and it broke!


I’ve had a lot of fun playing with Mathematica 10 but that’s all I’ve really done so far – play – something that’s probably obvious from my choice of topics in this article. Even through play, however, I can tell you that this is a very solid new release with some exciting new functionality. Old-time Mathematica users will want to upgrade for multiple-undo alone and people new to the system have an awful lot of toys to play with.

Looking to the future of the system, I feel excited and concerned in equal measure. There is so much new functionality on offer that it’s almost overwhelming and I love the fact that its all integrated into the core system. I’ve always been grateful of the fact that Mathematica hasn’t gone down the route of hiving functionality off into add-on products like MATLAB does with its numerous toolboxes.

My concerns center around the data and Stephen Wolfram’s comment ‘basic versions of Mathematica 10 are just set up for small-scale data access.’  What does this mean? What are the limitations and will this lead to serious users having to purchase add-ons that would effectively be data-toolboxes?


Have you used Mathematica 10 yet? If so, what do you think of it? Any problems? What’s your favorite function?

Mathematica 10 links


December 11th, 2013

Back in 2008, I featured various mathematical Christmas cards generated from mathematical software.  Here are the original links (including full source code for each ‘card’) along with a few of the images. Contact me if you’d like to add something new.

MATLAB Christmas card

SAGE Christmas greetings

Fern Fractal Christmas card

Comments Off
December 6th, 2013

A question I get asked a lot is ‘How can I do nonlinear least squares curve fitting in X?’ where X might be MATLAB, Mathematica or a whole host of alternatives.  Since this is such a common query, I thought I’d write up how to do it for a very simple problem in several systems that I’m interested in

This is the MATLAB version. For other versions,see the list below

The problem

xdata = -2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9
ydata = 0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001

and you’d like to fit the function

 F(p_1,p_2,x) = p_1 \cos(p_2 x)+p_2 \sin(p_1 x)

using nonlinear least squares.  You’re starting guesses for the parameters are p1=1 and P2=0.2

For now, we are primarily interested in the following results:

  • The fit parameters
  • Sum of squared residuals

Future updates of these posts will show how to get other results such as confidence intervals. Let me know what you are most interested in.

How you proceed depends on which toolboxes you have. Contrary to popular belief, you don’t need the Curve Fitting toolbox to do curve fitting…particularly when the fit in question is as basic as this. Out of the 90+ toolboxes sold by The Mathworks, I’ve only been able to look through the subset I have access to so I may have missed some alternative solutions.

Pure MATLAB solution (No toolboxes)

In order to perform nonlinear least squares curve fitting, you need to minimise the squares of the residuals. This means you need a minimisation routine.  Basic MATLAB comes with the fminsearch function which is based on the Nelder-Mead simplex method.  For this particular problem, it works OK but will not be suitable for more complex fitting problems.  Here’s the code

format compact
format long
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

%Function to calculate the sum of residuals for a given p1 and p2
fun = @(p) sum((ydata - (p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata))).^2);

%starting guess
pguess = [1,0.2];

[p,fminres] = fminsearch(fun,pguess)

This gives the following results

p =
1.881831115804464 0.700242006994123
fminres =

All we get here are the parameters and the sum of squares of the residuals.  If you want more information such as 95% confidence intervals, you’ll have a lot more hand-coding to do. Although fminsearch works fine in this instance, it soon runs out of steam for more complex problems.

MATLAB with optimisation toolbox

With respect to this problem, the optimisation toolbox gives you two main advantages over pure MATLAB.  The first is that better optimisation routines are available so more complex problems (such as those with constraints) can be solved and in less time.  The second is the provision of the lsqcurvefit function which is specifically designed to solve curve fitting problems.  Here’s the code

format long
format compact
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

%Note that we don't have to explicitly calculate residuals
fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata);

%starting guess
pguess = [1,0.2];

[p,fminres] = lsqcurvefit(fun,pguess,xdata,ydata)

This gives the results

p =
   1.881848414551983   0.700229137656802
fminres =

MATLAB with statistics toolbox

There are two interfaces I know of in the stats toolbox and both of them give a lot of information about the fit. The problem set up is the same in both cases

%set up for both fit commands in the stats toolbox
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata);

pguess = [1,0.2];

Method 1 makes use of

mdl =,ydata,fun,pguess)

The returned object is a NonLinearModel object

ans =

which contains all sorts of useful stuff

mdl = 
Nonlinear regression model:
    y ~ p1*cos(p2*xdata) + p2*sin(p1*xdata)

Estimated Coefficients:
          Estimate             SE                 
    p1      1.8818508110535      0.027430139389359
    p2    0.700229815076442    0.00915260662357553

          tStat               pValue              
    p1    68.6052223191956    2.26832562501304e-12
    p2    76.5060538352836    9.49546284187105e-13

Number of observations: 10, Error degrees of freedom: 8
Root Mean Squared Error: 0.082
R-Squared: 0.996,  Adjusted R-Squared 0.995
F-statistic vs. zero model: 1.43e+03, p-value = 6.04e-11

If you don’t need such heavyweight infrastructure, you can make use of the statistic toolbox’s nlinfit function

[p,R,J,CovB,MSE,ErrorModelInfo] = nlinfit(xdata,ydata,fun,pguess);

Along with our parameters (p) this also provides the residuals (R), Jacobian (J), Estimated variance-covariance matrix (CovB), Mean Squared Error (MSE) and a structure containing information about the error model fit (ErrorModelInfo).

Both nlinfit and use the same minimisation algorithm which is based on Levenberg-Marquardt

MATLAB with Symbolic Toolbox

MATLAB’s symbolic toolbox provides a completely separate computer algebra system called Mupad which can handle nonlinear least squares fitting via its stats::reg function.  Here’s how to solve our problem in this environment.

First, you need to start up the mupad application.  You can do this by entering


into MATLAB. Once you are in mupad, the code looks like this

xdata := [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9]:
ydata := [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001]:

The result returned is

[[1.88185085, 0.7002298172], 0.05381269642]

These are our fitted parameters,p1 and p2, along with the sum of squared residuals. The documentation tells us that the optimisation algorithm is Levenberg-Marquardt– this is rather better than the simplex algorithm used by basic MATLAB’s fminsearch.

MATLAB with the NAG Toolbox

The NAG Toolbox for MATLAB is a commercial product offered by the UK based Numerical Algorithms Group.  Their main products are their C and Fortran libraries but they also have a comprehensive MATLAB toolbox that contains something like 1500+ functions.  My University has a site license for pretty much everything they put out and we make great use of it all.  One of the benefits of the NAG toolbox over those offered by The Mathworks is speed.  NAG is often (but not always) faster since its based on highly optimized, compiled Fortran.  One of the problems with the NAG toolbox is that it is difficult to use compared to Mathworks toolboxes.

In an earlier blog post, I showed how to create wrappers for the NAG toolbox to create an easy to use interface for basic nonlinear curve fitting.  Here’s how to solve our problem using those wrappers.

format long
format compact
xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

%Note that we don't have to explicitly calculate residuals
fun = @(p,xdata) p(1)*cos(p(2)*xdata)+p(2)*sin(p(1)*xdata);
start = [1,0.2];

which gives

Warning: nag_opt_lsq_uncon_mod_func_easy (e04fy) returned a warning indicator (5) 
p =
   1.881850904268710   0.700229557886739
fminres =

For convenience, here’s the two files you’ll need to run the above (you’ll also need the NAG Toolbox for MATLAB of course)

MATLAB with curve fitting toolbox

One would expect the curve fitting toolbox to be able to fit such a simple curve and one would be right :)

xdata = [-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9];
ydata = [0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001];

opt = fitoptions('Method','NonlinearLeastSquares',...
f = fittype('p1*cos(p2*x)+p2*sin(p1*x)','options',opt);

fitobject = fit(xdata',ydata',f)

Note that, unlike every other Mathworks method shown here, xdata and ydata have to be column vectors. The result looks like this

fitobject = 

     General model:
     fitobject(x) = p1*cos(p2*x)+p2*sin(p1*x)
     Coefficients (with 95% confidence bounds):
       p1 =       1.882  (1.819, 1.945)
       p2 =      0.7002  (0.6791, 0.7213)

fitobject is of type cfit:


ans =

In this case it contains two fields, p1 and p2, which are the parameters we are looking for

>> fieldnames(fitobject)
ans = 
>> fitobject.p1
ans =
>> fitobject.p2
ans =

For maximum information, call the fit command like this:

[fitobject,gof,output] = fit(xdata',ydata',f)

fitobject = 

     General model:
     fitobject(x) = p1*cos(p2*x)+p2*sin(p1*x)
     Coefficients (with 95% confidence bounds):
       p1 =       1.882  (1.819, 1.945)
       p2 =      0.7002  (0.6791, 0.7213)

gof = 

           sse: 0.053812696487326
       rsquare: 0.995722238905101
           dfe: 8
    adjrsquare: 0.995187518768239
          rmse: 0.082015773244637

output = 

           numobs: 10
         numparam: 2
        residuals: [10x1 double]
         Jacobian: [10x2 double]
         exitflag: 3
    firstorderopt: 3.582047395989108e-05
       iterations: 6
        funcCount: 21
     cgiterations: 0
        algorithm: 'trust-region-reflective'
          message: [1x86 char]