CI cost figures are vendor list prices verified April 2026. Actual cost depends on plan, concurrency, and discount terms. Some links are affiliate links. See disclosure.

Stop running every test on every PR

Many teams run a 30-minute test suite on every PR commit, even though 80% of those commits touch a single package. In roughly a third of slow pipelines, the real cause is 5-10 flaky tests that nobody has identified. After caching, test optimisation is the highest-leverage intervention.

§ 01

Strategy 1: Test sharding

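Sharding splits a test suite across N parallel runners. In the ideal case, a 15-minute run split evenly across 5 shards finishes in about 3 minutes. Total billed minutes stay roughly the same (slightly higher in practice, because each shard repeats checkout and dependency installation), while wall-clock time drops by up to 5x. The trade-off: sharding reduces latency, not cost.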
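To make the latency-versus-cost trade-off concrete, here is a minimal Python sketch. It assumes per-test durations are known and uses a greedy longest-first assignment; the function names and the `setup_minutes` parameter (per-shard checkout and install overhead) are illustrative assumptions, not part of any test runner.

```python
def shard_by_duration(durations, n_shards):
    """Assign each test (longest first) to the currently lightest shard."""
    shards = [[] for _ in range(n_shards)]
    loads = [0.0] * n_shards
    for name, minutes in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
        lightest = loads.index(min(loads))
        shards[lightest].append(name)
        loads[lightest] += minutes
    return shards, loads


def sharding_cost(loads, setup_minutes=0.0):
    """Return (wall_clock_minutes, billed_minutes) for a set of shard loads."""
    wall_clock = max(loads) + setup_minutes             # slowest shard decides latency
    billed = sum(loads) + setup_minutes * len(loads)    # every shard pays setup
    return wall_clock, billed


# Hypothetical suite: 15 tests of 1 minute each, split across 5 shards.
durations = {f"test_{i}": 1.0 for i in range(15)}
shards, loads = shard_by_duration(durations, 5)
print(sharding_cost(loads))                     # (3.0, 15.0)
print(sharding_cost(loads, setup_minutes=1.0))  # (4.0, 20.0)
```

With one minute of setup per shard, wall-clock time is still far below 15 minutes, but billed minutes rise from 15 to 20, which is why sharding is a latency tool rather than a cost tool.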

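jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4, 5]
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      # Jest 28+ supports --shard=<index>/<count>
      - run: npx jest --shard=${{ matrix.shard }}/5
      # Python (pytest-split plugin): pytest --splits 5 --group ${{ matrix.shard }}
      # Go: no built-in shard flag; split the output of `go list ./...` across shards

§ 02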

Strategy 2: Flaky test detection

For roughly a third of teams, the “slow CI” problem is actually 5-10 flaky tests that fail intermittently, trigger re-runs, and burn minutes that could have been avoided. Identifying and quarantining those tests first, then fixing them, is the fastest intervention.

BuildPulse

Identifies flaky tests from historical run data. Shows flakiness rate per test, time to fix, and impact on CI time. Free tier available.

buildpulse.io

Launchable

ML-based test selection: predicts which tests are relevant for a given PR based on historical data. Can reduce test execution by 60-90% on mature codebases.

launchableinc.com

Trunk.io Flaky Tests

Auto-quarantines flaky tests to a separate run. Tests in quarantine do not block PR merges but continue to run to detect fixes.

trunk.io
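As a rough illustration of history-based detection, the Python sketch below flags a test as flaky when it both passed and failed on the same commit. This is a simplification under stated assumptions, not how any of the tools above work internally: the `(commit_sha, test_name, passed)` record shape, the function names, and the 10% threshold are all hypothetical.

```python
from collections import defaultdict


def flaky_rates(runs):
    """Map each test to the share of commits on which it both passed and failed."""
    outcomes = defaultdict(lambda: defaultdict(set))
    for commit_sha, test_name, passed in runs:
        outcomes[test_name][commit_sha].add(passed)
    rates = {}
    for test_name, commits in outcomes.items():
        mixed = sum(1 for results in commits.values() if len(results) == 2)
        rates[test_name] = mixed / len(commits)
    return rates


def quarantine_list(runs, threshold=0.1):
    """Tests whose flakiness rate meets the (assumed) quarantine threshold."""
    return sorted(name for name, rate in flaky_rates(runs).items() if rate >= threshold)


# Hypothetical CI history: test_login flips on commit a1 without a code change.
history = [
    ("a1", "test_login", True),
    ("a1", "test_login", False),
    ("a1", "test_cart", True),
    ("a2", "test_login", True),
    ("a2", "test_cart", True),
]
print(quarantine_list(history))  # ['test_login']
```

Same-commit disagreement is a conservative signal: it only catches flakiness that a re-run has already exposed, so real history needs enough re-runs to be informative.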

§ 03

Strategy 3: Test impact analysis

Test impact analysis runs only the tests affected by the code changes in a PR. It requires a tool that understands your test-to-code dependency map.
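The core of a dependency-map approach can be sketched in a few lines of Python: map changed files to the packages that own them, then walk the dependency graph in reverse to find every package whose tests could be affected. The package layout, names, and data shapes below are hypothetical, and real tools track dependencies at a finer grain.

```python
from collections import defaultdict, deque


def affected_packages(changed_files, package_roots, deps):
    """Return packages touched by the change plus everything that depends on them.

    package_roots: package name -> directory prefix
    deps: package name -> set of packages it depends on
    """
    touched = {
        pkg
        for pkg, root in package_roots.items()
        for path in changed_files
        if path.startswith(root.rstrip("/") + "/")
    }
    # Invert the graph so we can walk from a dependency to its dependents.
    dependents = defaultdict(set)
    for pkg, requires in deps.items():
        for dep in requires:
            dependents[dep].add(pkg)
    affected, queue = set(touched), deque(touched)
    while queue:
        for parent in dependents[queue.popleft()]:
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


# Hypothetical monorepo: the web app depends on both libraries.
roots = {"ui": "packages/ui", "api": "packages/api", "web": "apps/web"}
deps = {"web": {"ui", "api"}, "ui": set(), "api": set()}
print(sorted(affected_packages(["packages/ui/src/button.ts"], roots, deps)))  # ['ui', 'web']
```

A change to `packages/ui` selects the tests for `ui` and `web` but skips `api`, which is the saving test impact analysis is after.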

Launchable

ML model trained on your test history. Requires instrumentation but works across languages. Best for large mixed-language codebases.

Bazel query

DIY for Bazel monorepos. Feeding the changed files into a reverse-dependency query such as bazel query 'tests(rdeps(//..., set(<changed file labels>)))' gives precise test selection without a third-party tool.

Nx affected

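Built into Nx. nx affected -t test (nx affected:test in older Nx versions) runs tests for all projects affected by changes since the base branch. Configuration overhead is minimal: CI needs enough git history to diff against the base.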

Turborepo

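Works via task-level caching. If a package has not changed and its dependencies have not changed, the test task is a cache hit and is skipped entirely. On ephemeral CI runners the cache must be shared or restored between runs (for example, through a remote cache) for this to pay off.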
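The idea behind task-level caching can be illustrated with a short Python sketch: a task's cache key is a hash of the package's own inputs plus the keys of everything it depends on, so a change anywhere upstream invalidates the key. This shows the general technique only, not Turborepo's actual hashing; the data shapes and names are hypothetical, and the dependency graph is assumed to be acyclic.

```python
import hashlib


def task_hash(pkg, file_hashes, deps, _memo=None):
    """Hash a package's input files together with its dependencies' hashes.

    file_hashes: package -> {relative path: content hash}
    deps: package -> list of packages it depends on (assumed acyclic)
    """
    memo = {} if _memo is None else _memo
    if pkg not in memo:
        digest = hashlib.sha256()
        for path in sorted(file_hashes[pkg]):
            digest.update(path.encode())
            digest.update(file_hashes[pkg][path].encode())
        for dep in sorted(deps.get(pkg, ())):
            digest.update(task_hash(dep, file_hashes, deps, memo).encode())
        memo[pkg] = digest.hexdigest()
    return memo[pkg]


# Hypothetical workspace: web depends on ui and api.
files = {"ui": {"button.ts": "v1"}, "api": {"server.ts": "v1"}, "web": {"page.ts": "v1"}}
deps = {"web": ["ui", "api"]}
before = {pkg: task_hash(pkg, files, deps) for pkg in files}

files["ui"]["button.ts"] = "v2"  # edit one file in ui
after = {pkg: task_hash(pkg, files, deps) for pkg in files}

# Packages whose key changed must re-run their tests; the rest are cache hits.
print(sorted(pkg for pkg in files if before[pkg] != after[pkg]))  # ['ui', 'web']
```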

§ 04

Cost worked example

4-DEV TEAM, 20-MIN npm test, 80 PRs/DAY

Baseline: 20 min × 80 PRs = 1,600 min/day → $320/mo
After fail-fast (avg 15 min): $240/mo
After test impact analysis (Launchable: 3 min avg): $100/mo
After flaky-test quarantine (5 tests removed): $80/mo
Total reduction: 75% ($320/mo → $80/mo)
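The arithmetic can be checked with a small Python sketch. It assumes monthly cost scales linearly with average test minutes per run, anchored on the baseline row (20 min → $320/mo); the table does not state its pricing model, so this linearity and the function names are assumptions.

```python
def monthly_cost(avg_minutes, baseline_minutes=20, baseline_cost=320):
    """Scale the baseline monthly cost linearly with average minutes per run."""
    return baseline_cost * avg_minutes / baseline_minutes


def reduction(before, after):
    """Fractional cost reduction between two monthly figures."""
    return 1 - after / before


print(monthly_cost(20))    # 320.0
print(monthly_cost(15))    # 240.0
print(monthly_cost(3))     # 48.0
print(reduction(320, 80))  # 0.75
```

The baseline and fail-fast rows match the linear model exactly. For a 3-minute average the model gives $48/mo rather than the $100/mo listed, so the later rows presumably include per-run overhead (checkout, install, runner startup) that this sketch does not model.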