Book: Thinking, Fast and Slow


My goodness, this book is dense.  Don’t get me wrong, it’s well written and very accessible – you could take it as a holiday read and “read through” it – but there are 25 years of psychology research packed into a few hundred pages.  We covered this book in our System Test book club, and found that even covering only a chapter or two each session, we had plenty to discuss.

Roughly speaking, the fundamental theme of this book is that we as humans have two systems in our brains, which Kahneman helpfully labels “System 1” and “System 2”.

  • System 1 is fast, reactive, emotional, and runs on assumptions and work that the brain finds “easy” – analogies, associations, stereotypes, anecdotal evidence.  It’s great for letting us deal with day to day life without going nuts.
  • System 2 is the slow, careful, rational thinking that we think we’re doing all the time.  It’s great for coming to the logically correct answer, but it’s way too expensive and slow for us to use all the time.

As System 2 is really expensive to run vs System 1, a lot of the time we actually use System 1 with System 2 unthinkingly rubber-stamping the answer.  An early uncomfortable conclusion of the research is that we’re not the rational beings we think we are.  The rest of the book covers the ways in which we actually think, and the various heuristics and biases that we engage in.  Kahneman has spent his life picking these apart and uses this model to give good explanations for why they come about, and what we can (and can’t!) do to try to get to the actual logical answers both at work and in general life.

I thoroughly recommend this book to anyone interested in their development, whether a tester or otherwise  (Kahneman also provides a good bibliography and references if you’re interested in digging further).  And if you’re thinking of starting some kind of discussion group or book club, this is a good “hook” book to get started with.

Using Dungeons and Dragons to understand your motivations and have more fun at work.

My first ever AD&D character was a chaotic good wizard with 6 INT. He didn't do very well.
A neutral good tester prepares to unravel the mysteries of the universe (probably by hitting bits of it with lightning).

Hi, I’m Edmund, and I’m chaotic good.  That means if you want me to do something for you, tell me about how much it will help you (or someone else) and enthuse about how new, different, and exciting it is.  If you do that, I’m much more likely to help you out, and I’ll have more fun helping.

This post is about people’s fundamental motivations, how to think about them, and about how 2 minutes of thought and a change of briefing will make your team and your manager work better and enjoy the work they’re doing more.

The model I’m using originally came from my previous boss, Jon Berger (lawful evil), who deserves all the credit here.  It’s based on the alignment system from the Dungeons and Dragons RPG.  It’s a handy model because it’s a system for describing basic character motivations that almost every geek you come across already knows.  If you haven’t come across it, you’re missing out!

AD&D 2nd edition was clearly the best. Thac0 did wonders for our mental maths.
The alignments in AD&D 2nd edition.

Here’s the updated but similar alignment model I have for people at work.

Fundamental motivations at work

Some more explanation of the two axes.

  • Good/Evil.  People tend to be fundamentally people driven or goal driven.  Most people like both helping others and getting things done, but which actually gives you the real buzz at the end of a project – that 3000 customers are better off or that you’ve created an amazing thing that works really well?  Good people will spot that horrible “Working as designed but it doesn’t solve the customer’s problem” bug; Evil people will make sure everyone focuses and completes everything required to ship the product.
  • Lawful/Chaotic.  People tend to enjoy working with rules, or without them.  Lawful people create strong processes and can be relied on to do everything needed, but can struggle if not given enough structure to build on.  They drive change by defining new methods that people can follow.  Chaotic people try more new things and uncover new ideas, but can struggle to complete the details that have to be done.  They drive change by trailblazing and championing.

Ok.  So there’s this model.  What can I do with it?

First off – this is a model.  All models are inaccurate.  They’re not a replacement for thinking, but they can help you think about the thing that you’re modelling.  So have a think about yourself and the people you work with and how you fit into this model.

When you want someone to do something.  If you’re asking for help.  If you’re briefing them on a new part of the project.  If you’re trying to help them develop.  Whatever.  Appeal to their fundamental motivation (and crucially don’t assume that their motivations/comfort zones are the same as yours!).  Compare the following briefings.

  • Marty and I are behind on the flux capacitor testing, and that’s the key part that lets our customers reach 88 mph.  We need you to spend a few days helping on this  to pull us out of the hole.  It’s also a chance to play with the flux capacitor and learn more about it.
  • I need you to help get the flux capacitor working – it’s the critical component that makes the DeLorean more than just a car with silly doors.  We need you to spend a few days nailing through the following test conditions, and as it’s a critical component it’s of course crucial we have a clear test report.

Of course, the task is the same in both cases, and indeed the detailed task briefing can be identical.  However, if you emphasise the parts of the project they’ll enjoy and appeal to their motivation, they’ll tie the task to their motivations and (a) do a better job and (b) get more of a buzz out of the result.

As a final note, if you’re at all like me, when you first try this (and I recommend you do try!) you’ll feel a bit like a Machiavellian, manipulative fraud.  However, there’s nothing secret about this – point them at this post and let them think about what motivates them too.  Discuss it. Compare and contrast your motivations.  It might help both of you decide which tasks to pull off the pile in the next scrum kickoff meeting you’re in.

Boost your development: “Do one thing”

Eating one of these every day may not help you become a better tester.
A Boost.  Eating one of these a day is not guaranteed to develop your testing.

One of the tricky things when trying to improve yourself is connecting a high level goal to specific day to day actions that will get you there.  How do you get from “Within 9 months I want to be a lead tester that people seek help from.” to “Here’s what I have to do this week to achieve that.”?  Lots of people seem to end up doing their job and hoping that intention and osmosis will get them there eventually.

Here’s a simple thing that I’ve found works for me.  I also use it with people I manage and it’s a good way to frame development as something that happens all the time rather than just as part of the annual/6 month performance review.

Each Monday, decide one extra thing that you’re going to do *this week* to work towards your goal.  It has to be an explicit action that you can tick off (in other words, make it SMART), it has to be something that isn’t just part of doing your job well, and it has to be something that will help you reach your goal (even if it’s not obvious that it’s the best thing to do).

Tell someone what you’re doing (telling your manager in your one-on-one is great, as it reminds them that you’re pushing your development and willing to do more than “just your job”,  and it also gives you both a chance to discuss/agree other actions too).  Then do it.

The actions don’t have to be big – for example…

  • I will ask Dave the Developer to review the structure of the test script functions that I’m writing, and I’ll find at least one thing in his feedback that I can apply next time I write scripts.
  • I will read James Bach’s blog and brief the team about one thing that I’ve learned and applied.
  • I will spend an hour paired testing with Laura the Lead Tester, and note at least two things that she does that I should regularly do when testing.  I’ll explicitly do those later this week.
  • I will review my notes taken during testing sessions on Wednesday and Thursday, and find 2 holes where I should go back and retest.

Also, note that some of those examples involve other people.  You’ll need to check with the other people that they’re up for helping you, but if you ask people for small specific bits of help to learn to be good like them, they almost always say yes.

The difference between Nearly Clean and Really Clean

LADEE in the clean room, presumably unpowdering her nose
Really Clean!

Toothpaste adverts leave no doubt about how much “really clean” matters, even when the actual difference is beyond the powers of human perception.  But for regression suites this can really make the difference between a useful set of checks that make the product better and easier for everyone to work on, and a millstone that wastes time and drags down morale.

Until recently, I was heading up the team responsible for system testing our network protocol stack code.  We had some decent test tools (barring some historical idiosyncrasies) which let us “cable up” and spin up a whole network of VMs (actually containers these days)  and it was easy to create a script to run through a bunch of checks based on that network.  So we ended up with a lot of regression scripts that checked all our function over a wide range of our products.  And our scripts had a pretty good false positive rate (mostly <1%) – we were nearly clean.  So surely we were sitting pretty?

No.  We had a lot of scripts (1000s) and we had enough false positives that no one really trusted the scripts when a check failed.  We had someone spending an hour a day looking over the “fail” results.  And mostly we decided to “wait and see if it happens again”.  And if we did suspect a bug and send it off to whoever had made changes the previous day, their response was usually “don’t think it’s me, probably a false positive”.  And because we’d waited a day or two, different people had made changes, and they squabbled over whose fault it probably was and so who should investigate first.  We had a load of automated checks that drained a load of time, and despite repeated “quality pushes”, the average number of failures (and the false positive rate) slowly ticked up over time.

Our one saving grace was a couple of suites of scripts which were really clean.  The false positive rate was very low (<0.1% or so) and crucially, it was low enough compared to the number of scripts that when people saw a check failure their default assumption was that there was a bug.  People dug into every check fail when the issue was new and fresh, and if they did find a false positive, they fixed up the script, so our false positive rate slowly got better and better.

And that for me is the big difference.  If your suite is “really clean”, so the default expectation is that failed checks in your regression suite indicate product bugs (which had better be because that’s true), then whatever you do, your suite should improve and get cheaper to maintain.  Conversely, if the default expectation is that there’s a good chance of a false positive, then it doesn’t matter how “nearly clean” your suite is: over time it will get worse and your maintenance will get more and more expensive.  (As a side note – Michael Bolton has a good post on what actually happens when you see a check failure.  The whole point of getting “really clean” is to get to the point where it’s a reasonable working assumption that the issue is in the product, which saves you ever-so-much work.)

Over the last few years we slowly moved a lot of our regression suites from “nearly” to “really” clean.  It took a lot of time and effort, but it’s paid off.  We used to burn something like 50 days/year just on maintenance and we’re now probably down to a tenth of that.  And that’s just raw maintenance time – not including savings on bug fixing.

So how do you get from “nearly clean” to “really clean”?  Some thoughts based on our experience of improving our scripts.

  • Expect to put a lot of effort in.  With 200 scripts, <1% false positives still means one or two a night.  You need to get an order of magnitude better than that before people will expect issues to be in their code and not the scripts.
  • Focus on one area at a time.  Getting one area over the “really clean” hump wins you more than getting everything a bit closer to nearly clean.  This doesn’t even need to be a particular product or functional area.  If you have 200 scripts, then break out the 20 best ones into a separate output and call that an area – and move other scripts over as and when you get them working well.
  • The key test for “really clean” is the belief and trust in people’s minds (which can be irrespective of the actual level of cleanness, but may be helped by seeing the enforcement).  Make sure that you’re clear about what output is “really clean” and what isn’t, so that you can build that trust.
  • Hold that “really clean” quality bar hard.  Inevitably people will mess things up, but we found you can’t let things slide even for “special one-off” reasons.  We used a “fix it up or back it out” policy which worked well for us.
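
To put rough numbers on the “200 scripts at <1%” point above, here’s a minimal sketch of the arithmetic.  It rests on assumptions of mine rather than anything we measured: every script runs once a night, each one fails spuriously with the same probability, and failures are independent of each other.  The function names are made up for illustration.

```python
def expected_false_failures(num_scripts, false_positive_rate):
    """Average number of spurious failures in one nightly run."""
    return num_scripts * false_positive_rate


def chance_of_any_false_failure(num_scripts, false_positive_rate):
    """Probability that at least one script fails spuriously in a run.

    Assumes (for illustration only) that scripts fail independently.
    """
    return 1.0 - (1.0 - false_positive_rate) ** num_scripts


# Compare a "nearly clean" suite (1%) with a "really clean" one (0.1%).
for label, rate in [("nearly clean", 0.01), ("really clean", 0.001)]:
    expected = expected_false_failures(200, rate)
    chance = chance_of_any_false_failure(200, rate)
    print(f"{label}: {expected:.1f} false failures/night, "
          f"{chance:.0%} chance of at least one")
```

Under those assumptions, a 1% rate gives about two false failures a night and a spurious failure on most nights, which is exactly the situation where “probably a false positive” becomes the default response.  At 0.1% most nights have none, so a red result is worth digging into straight away.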