Stryker in a CI/CD pipeline

In my last blog I wrote about why mutation testing is a better way of measuring test completeness. I also mentioned the drawback regarding performance.

In this blog I’d like to discuss what Stryker does to increase performance and the things you can do in your pipeline to get the most out of mutation testing on every check-in.

What Stryker already does

Concurrency

By default Stryker takes half of the available cores to do the work. Use the concurrency option to use more or fewer cores of your system. I tested this on an 8-core machine. With 4 cores the whole process took 02:42. With 8 cores that same process took 02:18, which is 24 seconds less.
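As a minimal sketch, this is what setting the concurrency in the config file could look like. The value 8 is only an example that matches the 8-core machine from my test; pick a value that fits the machine running your pipeline.

```json
{
  "stryker-config": {
    "concurrency": 8
  }
}
```

The same setting can also be passed on the command line with --concurrency, which is handy when the build agent has a different number of cores than your own machine.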

Utilise code coverage

Initially Stryker runs all tests and measures the code coverage for each test. It does so for 2 reasons:

Any code that is uncovered does not need its mutations tested. Those mutants will surely survive. No need to spend CPU cycles on that.
Stryker keeps track of which tests actually cover which code. This means that it knows exactly which tests to run per mutation made. That’s a lot more efficient than running all tests for each mutation.

This is also a reminder to make your tests as specific as possible. A couple of tests that each hit a specific piece of functionality (and thus code) will run through mutations faster than larger tests that all hit a large piece of functionality and code.

Stryker does provide an option to tune the way it measures coverage.
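The option is called coverage-analysis. A minimal sketch of a config that switches it to perTestInIsolation is shown here; the choice of value is only for illustration and not a recommendation.

```json
{
  "stryker-config": {
    "coverage-analysis": "perTestInIsolation"
  }
}
```

Replace the value with perTest to compare the two modes on your own code.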

It might be worthwhile to experiment with this. On one particular piece of code I tried, perTestInIsolation was slower overall than perTest. The analysis took 3.5 minutes (perTestInIsolation) instead of 20 seconds, while the execution only got 14 seconds faster. The faster execution is most likely due to the fact that the more thorough analysis discovered more parts of the code that were actually uncovered.

My best guess is that with ‘lower’ scores perTestInIsolation takes less time in total. Please do the experiment on your own code to determine the best setting in your context.

Since

Stryker can be configured to only mutate changed code. Using since you can have Stryker include only the changes made since a certain committish (i.e. a branch, a tag or a commit). By default it compares against the master branch. The minimal requirement is to enable it in the config. Here is an example of a minimal config that uses main as the target branch.

{
  "stryker-config":
  {
    "thresholds": { "high": 100, "low": 100, "break": 100 },
    "since": {
      "enabled": true,
      "target": "main",
      "ignore-changes-in": ["**/*stryker-config.json"]
    }
  }
}

Since does come with a drawback: Stryker will only report on the changed code.

Baseline

Baseline is similar to since, with the addition that it does generate a full report. It is still experimental and you do need a place to store the baseline.
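To give an idea of what that looks like, here is a hedged sketch of a baseline config that stores the baseline on disk. The option names reflect my understanding of the Stryker.NET configuration and the values are assumptions for illustration. Because the feature is experimental, check the current documentation before relying on it.

```json
{
  "stryker-config": {
    "baseline": {
      "enabled": true,
      "provider": "disk",
      "fallback-version": "main"
    }
  }
}
```

In a pipeline the disk provider only helps if the stored baseline survives between runs, so a shared storage provider is probably the more realistic choice there.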

An efficient pipeline

To get fast feedback, double check how many cores are available to the process that runs your pipeline and set the concurrency accordingly. You might even decide to invest in more cores for that pipeline.

When using feature branches you could use since to compare a feature branch to the target branch. Then on your target branch run a full analysis on each merge.

I greatly prefer trunk based development where each push to main triggers the pipeline. So ideally I want my pipeline to compare against the previous pipeline run. This is not necessarily the previous commit though. I think this can only be achieved by using Stryker’s experimental baseline functionality. A less sophisticated solution is to just assume a single commit per push. This allows you to pass the parent commit of the commit that is being built to the stryker command, like so:

dotnet stryker -f stryker-config.json --since:82d8b49fb871b1cbe5d5027c0275f76da00b78b5
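
Rather than pasting the hash by hand, the pipeline can look up the parent commit itself. The sketch below is a minimal shell function under the single-commit-per-push assumption. The function name run_stryker_since_parent is made up for illustration, and the sketch assumes the build agent has enough git history to resolve the parent (a shallow clone of depth 1 does not).

```shell
# Hypothetical helper: run Stryker against the parent of the commit being built.
# $1 is the commit being built; it defaults to HEAD.
run_stryker_since_parent() {
  # "<commit>^" resolves to the first parent of that commit.
  parent="$(git rev-parse "${1:-HEAD}^")" || return 1
  dotnet stryker -f stryker-config.json "--since:${parent}"
}
```

Calling run_stryker_since_parent without arguments in a pipeline step compares against the parent of HEAD. If a push contains more than one commit, the earlier commits of that push are not mutated.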

Remember that since does not generate full reports. To get these you could add a pipeline that runs once every day to get the full report.

Another approach (which works for both feature branches and trunk based development) is to split pipelines so that you can deploy to a test environment quickly and still get the full mutation results later. For my previous client that is exactly what we did, see the figure below.

On each commit a pipeline is triggered which executes all tests, triggers the mutation pipeline in parallel and then does a release build. The result is that we are able to deploy to a test environment quickly and still get all of the feedback later.

Yet another approach might be to use since in your everyday commits and run a full mutation test every night.
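
A sketch of how one script could serve both the per-commit run and the nightly run. The function name run_mutation_tests and its mode argument are hypothetical. The sketch also assumes that stryker-config.json does not enable since itself, so that leaving out the flag really gives a full run.

```shell
# Hypothetical helper: "full" mutates everything, any other mode only mutates changes since main.
# Assumes stryker-config.json does NOT enable since on its own.
run_mutation_tests() {
  mode="${1:-since}"
  if [ "$mode" = "full" ]; then
    # Nightly run: mutate the whole code base and get a full report.
    dotnet stryker -f stryker-config.json
  else
    # Per-commit run: only mutate code that changed compared to main.
    dotnet stryker -f stryker-config.json --since:main
  fi
}
```

The scheduled nightly pipeline would call run_mutation_tests full, while the pipeline that runs on every commit calls it without arguments.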

Conclusion

It’s a balance between speed, quality and cost

Be sure to have small and specific tests
Use all your cores and optionally even invest in more
Experiment with how Stryker should measure code coverage
Use since or baseline to only mutate new code
Be creative with pipelines to get the feedback that suits you best at the moment it suits you best