Performance work has a failure mode: you reach for the most exciting tool first. On InstrumentStudio — an enterprise platform for testing electronic components — the wins came from boring decisions made in the right order.
Model the work, then pick the primitive
Measurement runs produce extensive datasets. The instinct is "parallelise everything." The reality is that most latency lived in two places: I/O waiting, and recomputing things we already knew.
- For the I/O-bound paths, `async/await` end to end removed thread-pool starvation under load.
- For the genuinely CPU-bound dataset transforms, `Parallel.ForEach` with a sane degree-of-parallelism beat hand-rolled threads.
- For the rest, intelligent caching of derived results was the single biggest lever.
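The first two choices can be sketched in a few lines. This is a minimal illustration, not the InstrumentStudio code: `LoadSamplesAsync`, `Rms`, and `TransformAll` are hypothetical names, a temp file stands in for a stored measurement run, and capping parallelism at `Environment.ProcessorCount` is one assumed reading of "sane".

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// I/O-bound path: await the read so no thread-pool thread sits blocked on it.
static async Task<double[]> LoadSamplesAsync(string path)
{
    string[] lines = await File.ReadAllLinesAsync(path);
    return lines.Select(l => double.Parse(l, CultureInfo.InvariantCulture)).ToArray();
}

// Hypothetical CPU-bound transform: root mean square of one block of samples.
static double Rms(double[] block) => Math.Sqrt(block.Average(x => x * x));

// CPU-bound path: Parallel.ForEach with an explicit cap on concurrency.
static double[] TransformAll(double[][] input, int maxParallelism)
{
    var results = new double[input.Length];
    var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };
    // Each iteration writes only its own slot, so no locking is needed.
    Parallel.ForEach(Enumerable.Range(0, input.Length), options,
        i => results[i] = Rms(input[i]));
    return results;
}

// Usage: write a tiny "run" to disk, load it asynchronously, then transform.
string path = Path.GetTempFileName();
await File.WriteAllLinesAsync(path, new[] { "3", "4" });
double[] loaded = await LoadSamplesAsync(path);
File.Delete(path);

double[][] blocks = { loaded, new[] { 6.0, 8.0 } };
double[] rms = TransformAll(blocks, Math.Max(1, Environment.ProcessorCount));
Console.WriteLine(string.Join(", ",
    rms.Select(r => r.ToString("F3", CultureInfo.InvariantCulture))));
```

The split matters: awaiting I/O frees threads while data is in flight, whereas `Parallel.ForEach` deliberately occupies them, so it belongs only on work that is actually compute-bound.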
// Cache the derived view, keyed by the inputs that actually change it.
var summary = _cache.GetOrAdd(run.Fingerprint(), _ => Summarise(run.Samples));
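One caveat is worth a sketch. If `_cache` is a `ConcurrentDictionary` (an assumption; the snippet above does not say), `GetOrAdd` can invoke the value factory more than once when several threads miss the same key at the same moment. Wrapping the value in `Lazy<T>` is a common way to make sure the expensive computation runs only once. The names below (`Summarise`, `GetSummary`, the `"run-42"` fingerprint) are stand-ins for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int computations = 0;
var cache = new ConcurrentDictionary<string, Lazy<double>>();

// Hypothetical stand-in for an expensive derived result; counts its own calls.
double Summarise(double[] values)
{
    Interlocked.Increment(ref computations);
    return values.Average();
}

// GetOrAdd may build several Lazy wrappers under contention, but only the one
// that ends up stored is ever evaluated, and Lazy<T> evaluates it exactly once.
double GetSummary(string fingerprint, double[] values) =>
    cache.GetOrAdd(fingerprint, _ => new Lazy<double>(() => Summarise(values))).Value;

double[] samples = { 1.0, 2.0, 3.0 };

// Hammer the same key from many threads at once.
Parallel.For(0, 64, i => { GetSummary("run-42", samples); });

double summary = GetSummary("run-42", samples);
Console.WriteLine($"summary: {summary}, computations: {computations}");
```

The trade-off is that a cached `Lazy<T>` also caches an exception thrown by the factory, so a failed computation needs explicit eviction if it should be retried.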
Tests are how you go fast
We grew automated coverage from 45% to 85%, and regression defects fell ~40%. That isn't a vanity metric: coverage is what let us refactor the concurrency model without fear. You cannot safely parallelise code that you cannot safely change.
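As an illustration of the kind of check that makes such a refactor safe, here is a minimal sketch that pins a parallel transform to its sequential reference. It is not taken from the project's test suite: the function names are hypothetical, the data is synthetic, and a real suite would express this in its test framework rather than as a console program.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical per-block transform shared by both implementations.
static double Transform(double[] block) => Math.Sqrt(block.Average(x => x * x));

// Reference implementation: simple, sequential, easy to trust.
static double[] TransformSequential(double[][] input) =>
    input.Select(b => Transform(b)).ToArray();

// Refactored implementation under test.
static double[] TransformParallel(double[][] input)
{
    var results = new double[input.Length];
    Parallel.ForEach(Enumerable.Range(0, input.Length),
        i => results[i] = Transform(input[i]));
    return results;
}

// Fixed seed so a failure is reproducible.
var rng = new Random(12345);
double[][] blocks = Enumerable.Range(0, 200)
    .Select(b => Enumerable.Range(0, 50).Select(s => rng.NextDouble()).ToArray())
    .ToArray();

double[] expected = TransformSequential(blocks);

// Repeat the parallel run: ordering and race bugs tend to be intermittent.
bool allMatch = Enumerable.Range(0, 20)
    .All(n => TransformParallel(blocks).SequenceEqual(expected));

Console.WriteLine(allMatch ? "parallel == sequential" : "MISMATCH");
```

Exact equality is valid here because each block is reduced by the same sequential code in both versions. A transform that combines partial results across threads would need a tolerance instead.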
The takeaway
A 30%+ efficiency lift sounds like a trick. It was mostly removing work: don't recompute, don't block, and don't ship a change you can't verify.