Lab Tooling and CI
Lab tooling runs the rendering pipeline under controlled, repeatable conditions so a frame-budget regression is caught in continuous integration instead of in production. Lighthouse CI, WebPageTest scripting, and performance budgets turn the metrics you observe in the field into hard pass/fail gates on every commit. This is part of Rendering Performance Metrics and Tooling, and it is the gate that catches the regressions the field instrumentation in PerformanceObserver API Patterns would otherwise only report after release.
Lab Versus Field
Field data, meaning Core Web Vitals collected from real users through observers, is the source of truth for what users actually experience, but it arrives after release and is noisy with device and network variance. Lab data is synthetic: a fixed device profile, throttled CPU, and a simulated network, run on demand. The lab trades realism for reproducibility. You use the lab to fail a pull request deterministically; you use the field to confirm the fix moved the distribution. The two are complementary, and the metrics line up: lab Total Blocking Time serves as the proxy for field INP, and lab CLS approximates field CLS.
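To make "moved the distribution" concrete: Core Web Vitals are conventionally assessed at the 75th percentile of field samples, so one way to confirm a fix is to compare p75 before and after the release. The sketch below is a minimal illustration only. The `percentile` helper and both sample arrays are hypothetical, and the numbers are invented.

```javascript
// Nearest-rank percentile: the smallest sample that has at least p% of all
// samples at or below it. Hypothetical helper, not part of any library.
function percentile(samples, p) {
  if (samples.length === 0) return NaN;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Invented INP samples in milliseconds, for illustration only.
const inpBefore = [120, 180, 210, 260, 340, 410, 520, 90];
const inpAfter = [100, 140, 160, 190, 220, 280, 330, 80];

const p75Before = percentile(inpBefore, 75);
const p75After = percentile(inpAfter, 75);
const improved = p75After < p75Before;

console.log(`p75 INP before ${p75Before}ms, after ${p75After}ms, improved: ${improved}`);
```

Comparing a percentile rather than a mean keeps a handful of very slow sessions from hiding or exaggerating the change.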
| tool | layer | best for |
|---|---|---|
| Lighthouse CI | synthetic audit | per-commit pass/fail on CWV-style metrics |
| WebPageTest | synthetic, scripted | multi-step flows, main-thread and long-task traces |
| Performance budgets | assertion layer | hard limits on metrics and resource bytes |
Performance Budgets
A performance budget is a number a metric must not exceed, enforced as a build failure. Budgets come in two flavours: timing budgets (TBT < 200ms, LCP < 2.5s, CLS < 0.1) and resource budgets (script < 170KB, total < 1.6MB, request count < 50). Timing budgets guard the experience; resource budgets guard the cause, since bytes shipped is the leading indicator of main-thread work and therefore of long tasks. Both belong in CI so a 40KB dependency bump that pushes TBT past its 200ms budget is rejected at the pull request, not discovered in next week's field data.
[Budget assertion on a regressing commit]
metric          baseline   this build   budget   result
TBT ........... 140ms      270ms        200ms    FAIL
LCP ........... 2.1s       2.2s         2.5s     pass
CLS ........... 0.04       0.05         0.10     pass
script bytes .. 150KB      198KB        170KB    FAIL
Result: CI exits non-zero, merge gate blocks
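The comparison behind an assertion run like this one can be sketched in a few lines. This is a minimal illustration, not how Lighthouse CI implements it: `checkBudgets` and the metric key names are hypothetical, and the values are copied from the example run above.

```javascript
// Hypothetical helper: returns one entry per metric that is missing or over
// its budget. Units are whatever the caller uses consistently.
function checkBudgets(measured, budgets) {
  const failures = [];
  for (const [metric, limit] of Object.entries(budgets)) {
    const value = measured[metric];
    if (value === undefined) {
      // A missing metric fails, so a broken collector cannot pass silently.
      failures.push({ metric, value: null, limit, reason: 'missing' });
    } else if (value > limit) {
      failures.push({ metric, value, limit, reason: 'over budget' });
    }
  }
  return failures;
}

// Budgets and measurements from the example run (ms, ms, unitless, KB).
const budgets = { tbtMs: 200, lcpMs: 2500, cls: 0.1, scriptKB: 170 };
const thisBuild = { tbtMs: 270, lcpMs: 2200, cls: 0.05, scriptKB: 198 };

const failures = checkBudgets(thisBuild, budgets);
for (const f of failures) {
  console.error(`FAIL ${f.metric}: ${f.value} > ${f.limit}`);
}

// In a CI script this value would be assigned to process.exitCode so the
// merge gate sees a non-zero status.
const exitCode = failures.length > 0 ? 1 : 0;
```

Treating a missing metric as a failure is a deliberate choice here: a gate that passes when the measurement step silently produced nothing protects no one.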
Lighthouse CI
Lighthouse CI wraps the Lighthouse audit engine for automation: it collects N runs (taking the median to dampen variance), asserts the results against a config, and optionally uploads reports to a server for trend tracking. The assertions are where the budget lives: you declare each metric's allowed maximum and the run fails if the median exceeds it.
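// lighthouserc.js: the assertion config that turns an audit into a gate
module.exports = {
  ci: {
    collect: { numberOfRuns: 5 }, // 5 runs so one noisy CPU-throttled run cannot decide the result
    assert: {
      assertions: {
        // aggregationMethod is set explicitly so each limit is checked against the median of the runs
        'total-blocking-time': ['error', { maxNumericValue: 200, aggregationMethod: 'median' }], // TBT budget, ms
        'largest-contentful-paint': ['error', { maxNumericValue: 2500, aggregationMethod: 'median' }], // ms
        'cumulative-layout-shift': ['error', { maxNumericValue: 0.1, aggregationMethod: 'median' }],
      },
    },
  },
};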
The full configuration, including budget.json resource limits and the GitHub Actions wiring, is covered in Automating Lighthouse CI Performance Budgets.
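As a rough preview of the resource side, Lighthouse CI also accepts `resource-summary` assertions in the same config file, which is one way to express byte and request limits without a separate budget file. The fragment below is a sketch under assumptions: it treats the `:size` value as transfer size in bytes and maps the budgets from this page onto it, and the `KB` constant is only a local convenience.

```javascript
// lighthouserc.js (sketch): resource limits expressed as assertions.
// Assumption: ":size" values are transfer sizes in bytes, ":count" values are
// request counts.
const KB = 1024;

module.exports = {
  ci: {
    assert: {
      assertions: {
        'resource-summary:script:size': ['error', { maxNumericValue: 170 * KB }], // script budget
        'resource-summary:total:size': ['error', { maxNumericValue: 1600 * KB }], // roughly 1.6MB total
        'resource-summary:total:count': ['error', { maxNumericValue: 50 }], // request count
      },
    },
  },
};
```

Whether to keep these next to the timing assertions or in budget.json is a project decision; the linked article covers the budget.json route.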
WebPageTest for Frame-Level Detail
Lighthouse summarizes; WebPageTest dissects. Where Lighthouse gives you a TBT number, a scripted WebPageTest run gives you the full main-thread trace, a long-task breakdown, a filmstrip, and custom metrics you compute from the trace yourself, which lets you assert directly against the 16.6ms frame budget on a specific interaction in a multi-step flow. Scripting also lets you measure pages behind login or deep in a funnel, which a single-URL audit cannot reach. The scripting language and the trace-extraction patterns are detailed in Scripting WebPageTest for Frame Budget Regressions.
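A custom metric computed from a trace can be as small as "the longest top-level main-thread task inside the interaction window". The sketch below assumes the trace has already been downloaded and parsed into an array of Chrome trace events, where complete events have `ph === 'X'` and `ts`/`dur` in microseconds, and that top-level tasks are named `RunTask`. The `longestTaskMs` helper and the sample events are hypothetical.

```javascript
const FRAME_BUDGET_MS = 16.6;

// Hypothetical helper. Assumes Chrome trace-event shaped objects:
// ph 'X' = complete event, ts and dur in microseconds.
function longestTaskMs(events, { name = 'RunTask', fromTs = 0, toTs = Infinity } = {}) {
  let longestUs = 0;
  for (const e of events) {
    if (e.ph !== 'X' || e.name !== name) continue;
    // Keep only tasks that start inside the interaction window.
    if (e.ts < fromTs || e.ts > toTs) continue;
    if (e.dur > longestUs) longestUs = e.dur;
  }
  return longestUs / 1000; // microseconds to milliseconds
}

// Invented sample events for illustration, not real trace output.
const sampleEvents = [
  { ph: 'X', name: 'RunTask', ts: 1000, dur: 4200 },
  { ph: 'X', name: 'RunTask', ts: 9000, dur: 92000 },
  { ph: 'X', name: 'Layout', ts: 9500, dur: 120000 }, // different event name, ignored
  { ph: 'I', name: 'RunTask', ts: 12000 }, // instant event, ignored
];

const longest = longestTaskMs(sampleEvents);
const withinBudget = longest <= FRAME_BUDGET_MS;
console.log(`longest task ${longest}ms, within budget: ${withinBudget}`);
```

Passing `fromTs` and `toTs` narrows the check to one interaction in a multi-step flow, so an expensive page load earlier in the script does not fail the assertion for a later click.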
Catching Frame-Budget Regressions Before Deploy
A regression worth gating is one where a frame's main-thread work crosses 16.6ms and starts dropping frames. The CI flow that catches it:
- Build the production bundle exactly as it ships.
- Serve it locally and run Lighthouse CI five times against the target URLs.
- Assert TBT, LCP, and CLS against the timing budget, and enforce the resource budget from budget.json.
- Run a scripted WebPageTest pass for any interaction-heavy flow and assert the longest extracted main-thread task against the frame budget.
- Exit non-zero on any failed assertion so the merge gate blocks.
[CI run on a pull request: frame-budget regression caught]
step 1 build ................... ok
step 2 lhci collect (5 runs) ... median TBT 270ms
step 3 lhci assert ............. FAIL total-blocking-time 270 > 200
step 4 wpt longtask assert ..... FAIL longest task 92ms > 16.6ms budget
step 5 exit 1 .................. merge BLOCKED
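The flow above can be sketched as a small Node gate script. This is an illustration under assumptions rather than a prescribed setup: the step commands are placeholders, `scripts/wpt-longtask-assert.js` is a hypothetical file name, and `runGate` takes the command runner as a parameter so the logic can be dry-run without executing anything.

```javascript
// Hypothetical step list mirroring the log above. Commands are placeholders;
// the WebPageTest assert script name is an assumption.
const steps = [
  { name: 'build', cmd: 'npm run build', haltOnFail: true },
  { name: 'lhci collect', cmd: 'npx lhci collect', haltOnFail: true },
  { name: 'lhci assert', cmd: 'npx lhci assert' },
  { name: 'wpt longtask assert', cmd: 'node scripts/wpt-longtask-assert.js' },
];

// `run` maps a command string to an exit status. In CI it could wrap
// child_process.spawnSync(cmd, { shell: true, stdio: 'inherit' }).status.
function runGate(stepList, run) {
  const results = [];
  for (const step of stepList) {
    const status = run(step.cmd);
    results.push({ name: step.name, status });
    // Without a build or collected runs there is nothing left to assert on.
    if (status !== 0 && step.haltOnFail) break;
  }
  const failed = results.filter((r) => r.status !== 0);
  return { results, failed, exitCode: failed.length > 0 ? 1 : 0 };
}

// Dry run with a fake runner that simulates the lhci assertion failing.
const fakeRun = (cmd) => (cmd.includes('lhci assert') ? 1 : 0);
const outcome = runGate(steps, fakeRun);
console.log(outcome.failed.map((r) => r.name), 'exit', outcome.exitCode);
```

The assertion steps are allowed to run even after an earlier assertion fails, so a single CI run reports every broken budget instead of one at a time.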
Metric Targets
| metric | target | how measured |
|---|---|---|
| TBT | < 200ms | Lighthouse CI median of 5 |
| LCP | < 2.5s | Lighthouse CI assertion |
| CLS | < 0.1 | Lighthouse CI assertion |
| Longest task in a flow | < 16.6ms | WebPageTest trace extraction |
| Script transfer size | < 170KB | budget.json resource budget |
With these gates in place, regressions surface on the pull request that caused them. The lab numbers asserted here are the same ones the field observers in Core Web Vitals Measurement confirm once the change reaches real users.