0x016 - Mutation Testing 🧟

0x016 - Mutation Testing 🧟

Hi, y’all! Welcome to unzip.dev, a newsletter dedicated to unpacking trending developer concepts. I’ll be posting every few weeks, so if you haven’t subscribed yet, go for it!

Mutation Testing

Synonyms: code mutation testing, mutation analysis, fault-based testing, part of white-box testing.

Who is this for?
- Anyone writing unit tests.


  • Problem: How can we test our tests?
  • Solution: Modify the code with faults and check if the tests fail.
  • In Sum: Mutation testing is a critical missing piece of any serious testing. It seems like the concept is picking up, but might be an overkill for small projects.

How does it work? 💡

A mutation testing tool creates a “mutant” (a faulty, modified version of your original source code).

Mutants rely on mutation operators that mimic programming errors like dividing expressions by zero, using the wrong operator or variable name. Sometimes even creating dead-code (code that never executes).

The technologies used to mutate the code use ASTs, CFG, and more

For example, if you have a code block like:

if (diff > threshold) { ...

A mutant might look like this (note the flipped greater-than sign):

if (diff < threshold) { ...

At this point, your unit test should probably fail. If it doesn’t then we’ve got a problem with our test, and we need to figure out how to write a better one.

To gauge the quality of our tests we use something called the mutation score - which is the percentage of mutants (faulty versions) that were killed by the unit test.

Smart mutation testing frameworks like pitest will run code coverage first (if the code in question isn’t covered, why are we testing it with mutations?) and then wisely mutates only relevant places.


  • Regression testing vs Mutation testing? Regression tests are checking if new code is buggy (broken old behavior), while mutation tests are checking whether our tests themselves are reliable enough. In addition, note that regression testing does not generate tests automatically like mutation testing does.

Why? 🤔

  • Identify weak tests: Find which tests are stable and work correctly even when the underlying code changes behavior.
  • Automatic: Mutations tests do their work automatically and can be integrated into your CI.

Why not? 🙅

  • Costly: It takes time to generate and run the mutant programs, so it is advised to run them only when your code changes. Also, try running them on a RAM disk to make them run faster.
  • False positives: Report fatigue can be a real issue, with some mutation frameworks producing findings that can only be described as lacking.
  • Overkill: If you’re creating an MVP or some software that isn’t super critical and needs to be shipped fast, double-check the ROI of mutation testing before you integrate it.

Tools & players 🛠️

My opinion: It might sound silly, but I thought about this as a business idea a few years back. I called it “chaos testing” (stole from Netflix), and with a bit of research I realized it is a full-on concept - wish me luck on my next unicorn idea 😅 In all seriousness though, why not incorporate some mutation testing to your projects? At least as an opt-in option locally for developers that want to check their tests before pushing?

Forecast 🧞

2020 until today (Google Trends), a steady increase.
  • Market adoption: Mutation testing was originally proposed in 1971 but didn’t gain traction because of the high costs back then. Now it seems to steadily increase in popularity, especially with fuzzing getting into the mainstream of testing.
  • AI Testing: By checking our tests, we can make sure that automated test generation is more robust. Which, I believe, will push automated test generation forward - who likes writing tests anyway?
  • Faster and better: With a growing activity pushing mutation testing and better LLMs that understand code (and write better mutants). I can see how most of the cons of mutation testing can be a thing of the past.

PS. This isn’t a hockey stick trend - so I’m taking anything I’m writing here with a grain of salt, and you should too ;)

Examples ⚗️

Try it out yourself?


Additional information that is related:

Thanks 🙏

I wanted to thank @TomGranot (the best growth person I know), @AndyKatz (Helps organizations gain competitive advantage through intelligent technology - DM him for more, you won’t regret it 😉)


(Where I tend to share unrelated things).

So the way I came up with this trend was by starting to categorize my content into categories. What I found was that I only had one “Testing” tag (0x009 Property-based Testing), so I decided to add more content to it.