Parallelizing Jest and Cypress.io Tests on CircleCI

At Artsy, exploring ways to improve the developer experience is part of our makeup. Whether it’s implementing hot-swapping for Express.js or integrating the Rust-based SWC compiler into our front-end build pipeline, we’re always trying to reduce the amount of time it takes for a code cycle to take place. CI is no exception. When a developer opens a PR, we want to ensure they get timely feedback. Do their unit tests pass? Does the app build correctly? And how about smoke tests? Each of these jobs are complex processes that take time, and the more one can parallelize said tasks the less devs will need to wait. Scaled out to a whole engineering org, minor improvements to CI can be radical.

In this regard, two things came across our radar recently that we’d like to share: sharding via Jest, and a (free) way to parallelize Cypress.io integration tests via CircleCI’s split command.

Sharding in Jest

“What is sharding?” Good question! In short, it means “a small part of a whole”. The database community has employed sharding techniques for decades, where a large database is split up into smaller, more manageable chunks, usually to improve performance at scale. The same idea can be applied to any process or task involving a lot of data, including tests.

Think about it like this. Imagine an app that has thousands of tests. One can open up their terminal and run yarn test and execute all of the tests at once in a single process, or one can open two terminal tabs and run yarn test src/utils and yarn test src/routes, and have both processes allocate a pool of memory to complete each (smaller) subset of tasks. Because each process has its own memory pool the performance characteristics are generally better, and thus the overall time required to run our tests is reduced / decreased. Running each of these commands scoped to a particular folder is easy enough, but in a CI environment this is somewhat cumbersome; we’d need to define two new jobs and then the conditions in which they run, increasing the scope and complexity of our configuration file.

This is where Jest’s new sharding feature comes into play, which taps nicely into most modern CI runners. Using a hypothetical app containing 100 tests, here’s a quick example of how it works:

$ yarn jest --shard 1/5

What this says is: take the total number of tests (100), divide them into five buckets (containing 20 tests each), and execute the test runner against the first bucket (the first 20 tests). Continuing:

$ yarn jest --shard 2/5

Now take the second bucket and execute the next 20 tests – and so on. Simple enough.

Taking this further, we could turn this into a bash loop, including an & symbol to run things in parallel and automating some of the redundancy away:

BUCKETS=5

for i in {1..${BUCKETS}}
do
   yarn jest --shard $i/$BUCKETS &
done

For many the above snippet should be sufficient to speed up your test suite, but who wants to write bash loops? Thankfully, most modern CI task runners contain the ability to split jobs into separate processes programatically and so this kind of logic is unnecessary.

Here’s how to do this in Circle CI:

test:
  parallelism: 5
  steps:
    - run: yarn test --shard=$(expr $CIRCLE_NODE_INDEX + 1)/$CIRCLE_NODE_TOTAL

Set a parallelism value, and drop the jest command into a cool one-liner. The variable CIRCLE_NODE_INDEX refers to which container index the job is running on, and CIRCLE_NODE_TOTAL points to the value of parallelism.

On Artsy.net, we’ve been able to reduce the average time it takes to run our unit tests from around ~10 minutes per PR to just above 2m. A 4-5x performance improvement.

Parallelizing Cypress.io Integration Tests (For Free)

For those who want robust integration test coverage, Cypress.io has been a game-changer due to its reliability and ease of use. Here at Artsy we use it in a number of apps, most notably Integrity. One complaint, however, is just how slow it is. This is reasonable; Cypress is simulating a user browsing your website and sometimes a user needs to do x and y (such as logging in) before they can do z. At scale this can really slow things down and lead to bottlenecks, especially if deploys are dependent on all of your integration tests passing.

The Cypress.io team has recognized this bottleneck and released the Cypress Dashboard, a paid product which includes the ability to unlock parallelized tests on your CI. For those willing to pay for another SAAS product this will get the job done well, but for those with leaner budgets there’s another way to accomplish this for free, and on CircleCI it’s very easy to setup via the CircleCI CLI command split.

You can check out the full example here, but in short:

integration:
  parallelism: 5
  run: |
    TESTS=$(circleci tests glob "cypress/integration" | circleci tests split | paste -sd ',')
    cypress run --spec $TESTS

We use the circleci tests glob command to gather all of our tests, and then pipe that into the circleci tests split which will divide our tests into buckets, similar to how Jest’s --shard command works up above. We then assign that to a $TESTS variable and pass it into cypress run --spec $TESTS. CircleCI sees the parallelism prop in the config and automatically divides our tests into 5 separate containers, each running a small subset of our integration tests in parallel.

On Artsy.net, our smoke tests times have gone from around ~7m on average down to ~3m. A huge reduction for only a few lines of config!