Nov 5, 2014 - Profiling Parallel Programs

In high performance computing, performance is kind of a big deal. And the first step in performance analysis and performance improvement is profiling.

High performance computing almost always entails some form of parallelism. And parallel programs are plain hard. They're harder to write, harder to debug, and harder to profile.

gprof

gprof is pretty great. Just compile your code with -pg and -g,

$ gcc -pg -g -O0 hello.c bye.c -o hibye.exe

run your code as usual,

$ ./hibye.exe

and you'll see gmon.out. Now,

$ gprof hibye.exe gmon.out

should summarize the performance of your code. Beware: gprof will not pick up on any calls to shared library functions. OK, that's a downer, and it's not the only limitation. But gprof is easy to use and gives me quick results. With the legacy code I work with, where there are no shared library calls, gprof is pretty awesome.
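If you want to try this end to end, here's a rough sketch of what hello.c and bye.c might contain. The file names come from the compile line above; the function bodies are just made-up busywork so the profile has something to show.

/* ---- hello.c ---- */
#include <stdio.h>

void bye(void);  /* defined in bye.c */

/* placeholder "work" so gprof has something to measure */
static double spin(long n)
{
    double s = 0.0;
    long i;
    for (i = 1; i <= n; i++)
        s += 1.0 / (double)i;
    return s;
}

int main(void)
{
    printf("hello: %f\n", spin(200000000L));
    bye();
    return 0;
}

/* ---- bye.c ---- */
#include <stdio.h>

/* a second translation unit, so more than one function shows up */
void bye(void)
{
    double s = 0.0;
    long i;
    for (i = 0; i < 50000000L; i++)
        s += (double)(i % 7);
    printf("bye: %f\n", s);
}

Compiled and run as above, gprof's flat profile should show most of the time split between spin and bye.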

gprof + MPI

gprof isn't designed to work with MPI code. But, as is generally the case with these things, it's possible with sufficient abuse:

First, set the environment variable GMON_OUT_PREFIX:

$ export GMON_OUT_PREFIX=gmon.out-

Then, the usual business:

$ mpicc -pg -g -O0 hello.c bye.c -o hibye.exe
$ mpiexec -n 32 hibye.exe

You should see 32 (or however many processes you ran) files, with names of the form gmon.out-<pid>. This is an undocumented feature of glibc, and it really shouldn't be undocumented - it's massively useful.

Now you have a separate gmon.out file for every MPI process. Awesome. Sum them:

$ gprof -s hibye.exe gmon.out-*

And use the resulting gmon.sum to generate gprof output:

$ gprof hibye.exe gmon.sum

Credit where it's due. Now, I haven't figured out how to replace the pid with the MPI rank - that would make this far more useful to some users - and the method mentioned in the source doesn't seem to work for me. But I'm sure this is possible with some ingenuity.
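My best guess - untested, so treat it as speculation: glibc only reads GMON_OUT_PREFIX at the moment it writes the profile, when the process exits. So the program itself could set a per-rank prefix right after MPI_Init, something like this (everything here besides the MPI calls is made up for illustration):

/* Speculative sketch: tag each rank's gmon output with its rank. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank;
    char prefix[64];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* glibc reads GMON_OUT_PREFIX when it writes the profile at   */
    /* process exit, so setting it here should take effect. The    */
    /* pid is (I believe) still appended, but the rank ends up in  */
    /* the file name, e.g. gmon.out-rank3.<pid>.                   */
    snprintf(prefix, sizeof prefix, "gmon.out-rank%d", rank);
    setenv("GMON_OUT_PREFIX", prefix, 1);

    /* ... the rest of the program ... */

    MPI_Finalize();
    return 0;
}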

mpiP

mpiP is a neat little tool for profiling MPI applications. In particular, it's extremely useful for figuring out how much time your application spends communicating relative to computing.

The documentation for setting up and using mpiP is complete (good), yet brief (better). Once you have mpiP set up, profiling your code is as easy as linking it with the mpiP library and the other libraries it depends on:

$ mpicc -g -O0 hello.c bye.c -o hibye.exe -lmpiP -liberty -lbfd -lunwind

Running your code with mpiexec as usual will produce the mpiP report.
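To give a sense of what mpiP measures, here's a toy MPI program (my own made-up example, not something from the mpiP docs) that alternates between local computation and a global reduction. Linked against mpiP as above and run under mpiexec, the report should show roughly how much of each rank's time went to MPI_Allreduce versus the surrounding computation.

/* toy_mpip.c - made-up example: local work, then a reduction. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, iter;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (iter = 0; iter < 10; iter++) {
        /* "computation": a dummy local sum */
        double local = 0.0, global = 0.0;
        long i;
        for (i = 0; i < 10000000L; i++)
            local += (double)((i + rank) % 13);

        /* "communication": combine results across all ranks */
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);

        if (rank == 0 && iter == 9)
            printf("global sum: %f\n", global);
    }

    MPI_Finalize();
    return 0;
}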

I've found that while gprof and mpiP are great tools that do different things, using them both gives me a very good idea of where my programs are spending time and where I should focus optimization efforts.

Oct 15, 2014 - Testing Scientific Code

Does My Code Work?

People often tell me they're writing a program, or writing code that does "x". There's a certain hesitation, contempt almost, in the scientific community, associated with the word software - as if writing software just isn't something scientists do. I suspect that this has to do with what calling it software says about your code - that it works. It's reliable, it's friendly, and most importantly, it's testable.

So, is scientific code really testable? I heard a story recently about a chemical engineer who said,

If I knew what my code was supposed to do, I would have published by now.

I see the point. Something that characterizes scientific code is that we write it because we don't know what the answer is. Therefore, we can't write tests for it.

And so testing scientific code is an oxymoron.

And we shouldn't waste our time trying to do it.

Right?

We're guaranteed by the scientific method that our answers are testable. In fact, it's this testability that differentiates science from non-science. If you're going to publish your results, you're going to have to convince somebody that your code works. Before you do that, you have to convince yourself that your code works. And whether or not you call that bit testing, you're still doing it. What's more - you can document the way you're doing it, often in code. And without realizing it, you've tested your code.

How are scientists (well, grad students) testing code?

OK, I don't have the data. I can't tell you how scientists in general are testing their code. But I can extrapolate from the way I've tested code, and the way I've seen other grad students test code.

It looks OK

"It looks OK" is generally applied during earlier stages of code development. You write some code, summarize some results, e.g., plot a graph or print some statistics, and if it looks OK, move on. This is by no means a poor method of testing: it provides quick, frequent feedback and can help fix bugs early. It is of course, incomplete, and provides no new information other than 'something is wrong somewhere'.

It looks OK is particularly effective in interactive programming environments, where you don't have to do an edit-compile-link-run cycle before you can perform a check.

Test Oracle

Most grad students write code that aspires to reproduce some existing results: either experimental data or the output of another code. These reference results are known as test oracles. The effectiveness of the method obviously depends on how good the oracles are and on which features of the software they exercise. For example, does the experimental data cover the entire range of input parameters that the code will be used for? Does it cover the edge cases?

Test oracles are great in the Verification and Validation stage of development. But they're not so useful earlier in the development cycle, and it's a terrible idea to wait that long before doing any testing.
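To make the idea concrete, here's a contrived sketch of an oracle-style check: run the routine under test at the oracle's inputs and compare against the reference values within a tolerance. The function and the numbers are made up purely for illustration.

/* oracle_check.c - a minimal, made-up sketch of testing against a */
/* reference dataset (the "oracle") within a tolerance.            */
#include <math.h>
#include <stdio.h>

/* stand-in for the routine under test (here: just x*x) */
static double model(double x) { return x * x; }

int main(void)
{
    /* pretend these came from an experiment or a trusted code */
    const double inputs[]    = { 0.0, 1.0, 2.0, 3.0 };
    const double reference[] = { 0.0, 1.0, 4.0, 9.0 };
    const double tol = 1e-8;
    int i, failures = 0;

    for (i = 0; i < 4; i++) {
        double got = model(inputs[i]);
        if (fabs(got - reference[i]) > tol) {
            printf("FAIL: x=%g expected %g got %g\n",
                   inputs[i], reference[i], got);
            failures++;
        }
    }
    printf("%d failures\n", failures);
    return failures ? 1 : 0;
}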

Unit Tests

Some grad students write unit tests. The idea is that if you test small pieces of code individually and extensively, then you can be confident that your software works. We're also taught that writing code that can be unit-tested leads to better design, i.e., by dividing your code into small pieces that can each be run independently, you get modularity, reusability and all that good stuff in addition to testability.

I'm going to admit it. I'm a complete sucker for unit testing -- the idea behind unit testing. In practice, I write tests that resemble unit tests, but they might test larger bits of code than typical unit tests do. They might take longer than unit tests are supposed to. But it works for me. Writing these "unit tests" doesn't take me a lot of time, so I can write a lot of them. With automated testing frameworks, I can run them often. And it makes catching and fixing bugs a whole lot easier.
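For what it's worth, my "unit tests" look roughly like the sketch below: a small function, a couple of assertions, an edge case. This is a made-up example; in practice an automated testing framework gives you nicer reporting and test running than plain assert.

/* test_mean.c - a made-up "unit test" for a small helper function. */
#include <assert.h>
#include <math.h>
#include <stdio.h>

/* the small piece of code under test */
static double mean(const double *x, int n)
{
    double s = 0.0;
    int i;
    for (i = 0; i < n; i++)
        s += x[i];
    return s / n;
}

int main(void)
{
    const double a[] = { 1.0, 2.0, 3.0, 4.0 };
    const double b[] = { 5.0 };

    /* typical case */
    assert(fabs(mean(a, 4) - 2.5) < 1e-12);
    /* edge case: a single element */
    assert(fabs(mean(b, 1) - 5.0) < 1e-12);

    printf("all tests passed\n");
    return 0;
}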

Which is better?

Use all of them. Any testing guru will tell you that some testing is better than no testing. And so all the testing is better than some testing. Like Ned Batchelder said:

How often have you heard someone say, "I wrote a lot of tests, but it wasn't worth it, so I deleted them." They don't. Tests are good.

What makes good tests?

  • Coverage - the percentage of your code actually exercised by the tests - is generally a good measure of test quality.

  • When good tests fail, they tell you exactly where the problem is, down to a few lines of code.

  • Good tests can be run often, so they're not computationally intensive. If testing is going to take me more than a few seconds, I'm not going to do it.

  • Good tests cover the edge cases.

Where to learn about testing?

OK, I've said enough about testing without really quantifying anything. What exactly is a test, and how do you write one? You can definitely do better than learning from this blog. Here are my favourite resources on testing:

Here is a great article on this subject.

Sep 14, 2014 - Teaching MATLAB

I had the opportunity to introduce MATLAB to two groups of 30 graduate students fitting the following profile:

  1. Not Computer Science majors

  2. Novice MATLAB programmers

  3. Programming experience in some language (C/Java)

Learners came in with a broad range of expectations:

  • What is MATLAB, and what are people doing with it?
  • How do I do X with MATLAB?
  • How can I use MATLAB more effectively than I already am?

I'm a huge fan of Software Carpentry and their evidence-based approach to teaching. The argument makes sense: we are scientists, and founding our teaching methods on hard research--or at the very least, some data--is likely a better idea than using arbitrary lesson plans and teaching techniques. Accordingly, as a starting point, I chose the Software Carpentry lesson material on Python. These lessons have the immense value of feedback from several workshops run by trained instructors all over the world. Of course, the first step was to translate all that lesson material from Python to MATLAB. Which I did. This doubled as my very first contribution to the open-source community (yay).

The concepts covered are, in this order:

  • Loading, analyzing and visualizing data from a file
  • Writing MATLAB scripts and loops
  • How and why to write MATLAB functions
  • Conditionals in MATLAB
  • Writing tests for functions

The lessons are built around a central task: analyzing and visualizing the data in several .csv files. With every lesson, we make our code for doing this a little better. For example, in lesson 1, we load, analyze and visualize the data interactively (from the command line). In lesson 2, we put those commands in a script, and discuss the pros and cons relative to working interactively. We then introduce loops and modify our script to analyze several datasets. And so on.

I've done sessions on MATLAB before, and I used to follow a "textbook" approach: exposing ideas in the same order that they would appear in a textbook on MATLAB, for example:

  • Variables and Statements
  • Vectors
  • Plots
  • Matrices
  • Loops
  • Conditionals
  • Scripts
  • Functions

So, in one of my earlier sessions, the first few lines of code we would type in to the command line would be something like this:

>> a = 1
>> b = 2
>> a + b
>> a * b
>> c = [a, b]

Compare that to the first couple of lines of code that we type in now:

>> patient_data = csvread('inflammation-01.csv');
>> imagesc(patient_data)

I think that exposing this sort of powerful functionality early is important: it makes learners feel like "this might actually be worth my time" and encourages them to participate more.

Getting novice programmers to follow along command-by-command is slow: they're going to meet with a lot of errors, even with the simplest of commands. The most common mistakes I've seen learners make in workshops:

  • Typos
  • Calling scripts/functions from the wrong directory

This is natural and expected; a new programming environment takes time to get used to, and learners simply don't have enough context to make sense of error messages. It's tempting, then, to demo-ize the whole thing and keep participants from writing too much code. Of course, that's a bad idea, and I prefer an approach that's somewhere in between:

Commands

Have learners type out commands at the command line when an idea is being introduced for the first time; demonstrate commands when expanding on them or explaining subtleties.

Scripts/functions

Have learners type out stripped-down, simple versions of more complex scripts. For example, instead of having learners write a script that loops over several datasets, performs analyses, and plots various figures for each, have them type out and execute a script that performs a single analysis on a single dataset, and produces a single figure. Then ask them to look at a more complex version of that script that was distributed to them at the beginning of the session.

I also experimented with a workshop etherpad, and gave learners the option of taking notes there instead of on their personal notepads/computers. Most learners preferred not to interact much with the etherpad. I'm not sure why - maybe this should be part of the feedback I collect - but here are some possible reasons:

  • Not enough time
  • Not familiar with etherpad
  • Not as convenient/useful as personal notes

I gave participants the option of providing feedback either on the public etherpad or on pieces of paper. Feedback on paper was generally more specific and comprehensive. The response was positive, with some complaints about not having enough time and not covering enough (or sufficiently specific) material. The structure and content of the lessons were generally appreciated, although opinions on the section about testing were mixed.