Scripting WebPageTest for Frame Budget Regressions
Scripted WebPageTest runs drive multi-step flows, capture the full main-thread trace, and expose custom metrics you compute yourself. That lets CI assert a specific interaction’s long-task cost against the 16.6ms frame budget (one frame at 60Hz) rather than a single summary number. This builds on Lab Tooling and CI, part of Rendering Performance Metrics and Tooling.
Why Scripting, Not a Single URL
A one-URL audit measures cold page load. Real frame-budget regressions often hide behind interactions: a click that triggers a 90ms filter, a tab switch that forces a layout flush. WebPageTest’s scripting language navigates to that state, marks the step boundaries, and records a trace for each step, so you can attribute main-thread time to the exact interaction instead of averaging it into a page-level total.
The Scripting Language
WebPageTest scripts are line-oriented commands. The ones that matter for frame-budget work are navigate, exec (run JS in the page), setEventName (label a measurement step), and execAndWait (run JS and wait for activity to settle).
# wpt-search.txt — script a load, then a search interaction as its own step
logData 0
navigate https://example.com/
logData 1
setEventName SearchInteraction
execAndWait document.querySelector('#q').value='laptop'; document.querySelector('#q').dispatchEvent(new Event('input'))
logData 0 suppresses metrics during setup; logData 1 re-enables them so only the search step is measured. setEventName makes the step show up as a discrete entry in the result, with its own filmstrip and trace.
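When several interactions need their own steps, the script text can be generated rather than hand-edited. The following is a minimal sketch, not part of WebPageTest itself: `buildWptScript` and the step object shape are hypothetical, and it assumes the documented convention that a command and its parameters are tab-separated, one command per line.

```javascript
// Sketch: build a WebPageTest script from step descriptions.
// Assumption: command and parameter are joined by a tab, one command per line.
// buildWptScript and the { name, js } step shape are illustrative names only.
function buildWptScript(url, steps) {
  // Setup: load the page without recording metrics, then start recording.
  const lines = ['logData\t0', `navigate\t${url}`, 'logData\t1'];
  for (const step of steps) {
    // Collapse the step's JS onto one line so it stays a single command.
    const js = step.js.replace(/\s*\n\s*/g, ' ').trim();
    lines.push(`setEventName\t${step.name}`, `execAndWait\t${js}`);
  }
  return lines.join('\n');
}

const script = buildWptScript('https://example.com/', [
  {
    name: 'SearchInteraction',
    js: `document.querySelector('#q').value='laptop';
         document.querySelector('#q').dispatchEvent(new Event('input'))`,
  },
]);
console.log(script);
```

Keeping each step's JavaScript on a single line avoids relying on any line-continuation behaviour in the script parser.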
Custom Metrics from the Trace
WebPageTest lets you declare custom metrics as JavaScript that runs at the end and returns a number. To assert against the frame budget you need the longest task and total main-thread blocking for the interaction step, which you derive from the long-task entries the trace recorded.
[longestTask]
return performance.getEntriesByType('longtask')
  .reduce((max, t) => Math.max(max, t.duration), 0); // ms of the worst block
[totalBlocking]
return performance.getEntriesByType('longtask')
  .reduce((sum, t) => sum + Math.max(0, t.duration - 50), 0); // TBT-style sum
These surface as longestTask and totalBlocking in the JSON result alongside the filmstrip and the raw trace, so a CI step can read them and compare against budgets.
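One caveat is worth checking before trusting these numbers: long-task entries are normally delivered through a `PerformanceObserver`, and some browsers do not return them from `performance.getEntriesByType`. If both metrics come back as 0 on a step that visibly stalls, one option is to buffer the entries in the page and have the custom metric reduce over that buffer. The sketch below shows the arithmetic as pure helpers plus a hypothetical collector; `installLongTaskCollector` and `__longTasks` are illustrative names, not WebPageTest features.

```javascript
// Pure helpers: the same arithmetic as the custom metrics, over durations in ms.
function longestTask(durations) {
  return durations.reduce((max, d) => Math.max(max, d), 0);
}

function totalBlocking(durations, thresholdMs = 50) {
  // TBT-style: only the portion of each task beyond the threshold counts.
  return durations.reduce((sum, d) => sum + Math.max(0, d - thresholdMs), 0);
}

// Hypothetical collector: buffers long-task durations on target.__longTasks.
// Intended to run in the page (for example from an exec step) before the interaction.
function installLongTaskCollector(target) {
  target.__longTasks = [];
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      target.__longTasks.push(entry.duration);
    }
  });
  // buffered: true also delivers entries recorded before observe() was called.
  observer.observe({ type: 'longtask', buffered: true });
  return observer;
}

// Worked example using the durations from a single 88ms task.
console.log(longestTask([88]), totalBlocking([88]));
```

With the collector installed, a custom metric can reduce over `window.__longTasks` instead of querying the performance timeline directly.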
Asserting Against the 16.6ms Budget
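The result JSON is fetched through the test API and checked in CI. The frame-budget assertion is simply: did any task in the interaction step exceed 16.6ms? Note that the Long Tasks API only reports tasks longer than 50ms, so a longestTask derived from long-task entries is either 0 or above 50. In practice the gate below fails whenever any long task was recorded in the step; tasks between 16.6ms and 50ms have to be read from the step’s trace.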
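// CI gate: fail if the interaction step blocked a frame
// fetchWPTResult is a project helper (not shown) that polls the WebPageTest API
const result = await fetchWPTResult(testId)
const step = result.data.median.firstView.SearchInteraction
const FRAME_BUDGET = 16.6
const longest = Number(step?.longestTask) // NaN if the step or metric is missing
if (!Number.isFinite(longest)) {
  console.error('longestTask missing from the result; failing closed')
  process.exit(1)
}
if (longest > FRAME_BUDGET) {
  console.error(`longest task ${longest}ms > ${FRAME_BUDGET}ms budget`)
  process.exit(1) // non-zero blocks the merge
}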
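The same check can cover both custom metrics in one pass and report every violation rather than stopping at the first. This is a sketch under stated assumptions: `findBudgetViolations` and `BUDGETS` are hypothetical names, and the limits simply mirror the targets used in this article.

```javascript
// Hypothetical budgets mirroring this article's targets (milliseconds).
const BUDGETS = { longestTask: 16.6, totalBlocking: 50 };

// Returns a list of human-readable violations; an empty list means the gate passes.
function findBudgetViolations(metrics, budgets = BUDGETS) {
  const violations = [];
  for (const [name, limit] of Object.entries(budgets)) {
    const raw = metrics?.[name];
    const value = raw == null ? NaN : Number(raw);
    if (!Number.isFinite(value)) {
      // Fail closed: a missing metric should never let a regression through.
      violations.push(`${name}: missing or non-numeric`);
    } else if (value > limit) {
      violations.push(`${name}: ${value}ms exceeds ${limit}ms`);
    }
  }
  return violations;
}

// Example with the metrics from a regressed step.
const problems = findBudgetViolations({ longestTask: 88, totalBlocking: 38 });
problems.forEach((p) => console.error(p));
```

A CI script would exit non-zero whenever the returned list is non-empty.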
Reproduction: A Regression in the Interaction Step
// The search handler does a synchronous, layout-reading filter in one frame
input.addEventListener('input', () => {
  for (const row of rows) {
    row.style.display = matches(row, input.value) ? '' : 'none'
    void row.offsetHeight // ❌ forces a layout flush every iteration: long task
  }
})
The per-row offsetHeight read interleaves a layout flush with every write, turning the loop into one long task. The scripted WebPageTest step captures it:
[WebPageTest trace — SearchInteraction step]
Main thread:
├─ Event: input ......................... 0.4ms
├─ Task (filter loop) .................. 88.0ms ▣ LONG TASK
│ └─ interleaved Layout × 240 rows (forced sync layout)
└─ Paint ................................ 5.0ms
Custom metrics: longestTask = 88 totalBlocking = 38
Frame budget 16.6ms exceeded → CI assertion FAILS (88 > 16.6)
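The trace is also the only place to see tasks that overran the 16.6ms budget without reaching the 50ms long-task threshold. A rough way to list them, assuming the step's trace is in Chrome's trace-event format (complete events with `ph: "X"`, main-thread tasks named `RunTask`, and `ts`/`dur` in microseconds) and has already been filtered to the renderer main thread:

```javascript
// Sketch: list main-thread tasks longer than the frame budget.
// Assumptions: Chrome trace-event format, events pre-filtered to the main thread,
// ts and dur expressed in microseconds. tasksOverBudget is an illustrative name.
function tasksOverBudget(traceEvents, budgetMs = 16.6) {
  return traceEvents
    .filter((e) => e.ph === 'X' && e.name === 'RunTask' && typeof e.dur === 'number')
    .map((e) => ({ startMs: e.ts / 1000, durationMs: e.dur / 1000 }))
    .filter((t) => t.durationMs > budgetMs)
    .sort((a, b) => b.durationMs - a.durationMs); // worst offender first
}

// Made-up events for illustration only.
const sample = [
  { name: 'RunTask', ph: 'X', ts: 0, dur: 400 },
  { name: 'RunTask', ph: 'X', ts: 1000, dur: 88000 },
  { name: 'RunTask', ph: 'X', ts: 90000, dur: 30000 },
  { name: 'Paint', ph: 'X', ts: 121000, dur: 5000 },
];
console.log(tasksOverBudget(sample));
```

In this made-up sample the 30ms task would be invisible to a long-task metric but still costs a dropped frame.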
The Fix
Separate the reads from the writes so layout flushes once, not per row, and the loop no longer blocks a frame past budget. This is the standard remedy for a forced synchronous layout — batch the measurements, then batch the mutations.
// ✅ Decide visibility for every row first, then apply all writes: no per-row sync layout
input.addEventListener('input', () => {
  const query = input.value
  const visible = rows.map((row) => matches(row, query)) // reads only: nothing has been written yet
  rows.forEach((row, i) => {
    row.style.display = visible[i] ? '' : 'none' // writes only: no interleaved reads
  })
})
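The batched pattern can also be guarded by a unit test that runs without a browser. The sketch below is illustrative only: `applyFilter` extracts the handler body into a function, and `makeFakeRow` is a hypothetical stand-in for a DOM row that counts how often a layout-forcing property is read.

```javascript
// Batched filter: decide visibility for all rows, then write all styles.
function applyFilter(rows, query, matches) {
  const visible = rows.map((row) => matches(row, query)); // decisions only
  rows.forEach((row, i) => {
    row.style.display = visible[i] ? '' : 'none'; // writes only
  });
  return visible.filter(Boolean).length; // number of rows left visible
}

// Hypothetical stand-in for a DOM row: counts reads of offsetHeight.
function makeFakeRow(text, counter) {
  return {
    text,
    style: { display: '' },
    get offsetHeight() {
      counter.layoutReads += 1;
      return 20;
    },
  };
}

const counter = { layoutReads: 0 };
const fakeRows = ['laptop stand', 'mouse', 'laptop bag'].map((t) => makeFakeRow(t, counter));
const shown = applyFilter(fakeRows, 'laptop', (row, q) => row.text.includes(q));
console.log(shown, counter.layoutReads);
```

A counter that stays at zero shows the filter never touches a layout-forcing property, which is the behaviour the trace is expected to confirm.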
The re-run trace shows the filter loop finishing within the frame budget and both custom metrics back under their thresholds, so the gate passes. WebPageTest’s per-step attribution is what made the regression visible at the interaction level. Correlated with field data, the same stall would appear as a long animation frame, with the handler listed in the entry’s scripts array.
Verification Checklist
| metric | target | how measured |
|---|---|---|
| longestTask (interaction step) | < 16.6ms | WebPageTest custom metric |
| totalBlocking (interaction step) | < 50ms | WebPageTest custom metric |
| Forced layout count in step | 0 | trace inspection of the step |
| CI exit code | 0 on fixed build | result-JSON assertion |
- The SearchInteraction
- longestTask and totalBlocking