Is that responsive layout failure real or just a false alarm? We built VERVE to tell you automatically!
Introduction
Responsive web design is essential for modern web pages that must look great on everything from a phone to a desktop monitor. But testing responsive layouts is tricky. Tools like ReDeCheck can automatically detect potential responsive layout failures (RLFs) by analyzing the document object model (DOM), but they often flag issues that are not actually visible to humans — such as two transparent elements whose bounding boxes technically overlap but whose content does not interfere. This means developers must manually sift through every report to decide what is a real problem and what is just noise.
My colleagues and I built a tool called VERVE — short for Visual classifiEr for ResponsiVe tEsting — that automates this tedious classification process. In (Althomali, Kapfhammer, and McMinn 2021)
Key Contributions
VERVE Tool: We introduce VERVE, which extends our earlier VISER tool to classify all five types of responsive layout failures reported by ReDeCheck. VERVE uses two complementary techniques: manipulating the opacity of HTML elements and histogram-based image comparison.
Comprehensive Empirical Evaluation: We evaluate VERVE on 45 web pages containing 469 potential RLFs. For element collision, element protrusion, and viewport protrusion failures, VERVE’s classification agreed with human classifications up to 91.8% of the time on new subjects not used during the tool’s development.
Efficiency: VERVE took on average about 4 seconds to classify any individual RLF, making it far faster than the manual process of opening a browser, resizing the viewport, scrolling to the failure location, and visually inspecting it.
Publicly Available Tool: VERVE is available on GitHub, enabling other researchers and web developers to use it in their own testing workflows.
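To make the histogram-based comparison mentioned above more concrete, here is a minimal sketch in Python. It is not VERVE's actual implementation: the function names, the 8-bin grayscale histogram, the histogram-intersection measure, and the 0.99 threshold are all assumptions chosen for illustration. The idea it illustrates is that two screenshots of the same region, taken before and after changing an element's opacity, can be compared through their pixel histograms to judge whether the change is visible.

```python
def histogram(pixels, bins=8):
    """Count grayscale pixel values (0-255) into equal-width bins.

    `pixels` is a 2D list of integers; a stand-in for a real screenshot.
    """
    width = 256 // bins
    counts = [0] * bins
    for row in pixels:
        for value in row:
            counts[min(value // width, bins - 1)] += 1
    return counts


def histogram_intersection(h1, h2):
    """Return a similarity in [0, 1]; 1.0 means identical normalized histograms."""
    total1, total2 = sum(h1), sum(h2)
    if total1 == 0 or total2 == 0:
        raise ValueError("cannot compare empty images")
    return sum(min(a / total1, b / total2) for a, b in zip(h1, h2))


def looks_different(before, after, threshold=0.99):
    """Hypothetical decision rule: flag a visible change when similarity drops
    below `threshold` (an assumed value, not one taken from the paper)."""
    similarity = histogram_intersection(histogram(before), histogram(after))
    return similarity < threshold


# Toy 4x4 "screenshots": an all-white region, and the same region
# with its right half turned black after an opacity change.
white = [[255] * 4 for _ in range(4)]
half_black = [[255, 255, 0, 0] for _ in range(4)]

print(looks_different(white, white))       # identical images
print(looks_different(white, half_black))  # half of the pixels changed
```

A histogram discards pixel positions, so a sketch like this only detects changes in the distribution of colors in the compared region. A real tool would work on actual browser screenshots and would need a carefully chosen threshold.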
Empirical Results
Our evaluation showed that VERVE performs well across all five RLF types, though with varying levels of agreement with human classifications. For the original 25 web pages used in the conference version of the paper, VERVE achieved 86.3% agreement for element collision, element protrusion, and viewport protrusion failures. For wrapping failures, the agreement was 78.6%, and for small-range failures it reached 98.5%. When we tested VERVE on 20 entirely new web pages, it achieved 91.8% agreement for the first three failure types and 73.6% for small-range failures. Because human reviewers sometimes disagree on borderline cases, the results also suggest that VERVE's automated approach can be less subjective than manual classification, which supports the tool's usefulness in practice.
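For readers who want to run a similar comparison on their own reports, the sketch below shows one straightforward way to compute an agreement percentage: the share of reported failures for which the tool's label matches the human's label. The function name and the "TP"/"FP" labels are hypothetical, the sample data is invented purely for illustration, and the paper may define agreement with additional details not shown here.

```python
def agreement_rate(tool_labels, human_labels):
    """Return the percentage of items on which the two label lists match."""
    if len(tool_labels) != len(human_labels):
        raise ValueError("label lists must have the same length")
    if not tool_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(t == h for t, h in zip(tool_labels, human_labels))
    return 100.0 * matches / len(tool_labels)


# Invented example: four reported failures, labeled as a true positive ("TP")
# or a false positive ("FP") by the tool and by a human reviewer.
tool = ["TP", "FP", "TP", "TP"]
human = ["TP", "FP", "FP", "TP"]

print(agreement_rate(tool, human))  # 75.0
```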
Future Work
We plan to improve VERVE's classification accuracy for wrapping failures and to integrate it more tightly with other responsive web testing tools. We will also explore how VERVE could be combined with our LayoutDR repair tool to create an end-to-end pipeline that detects, classifies, and fixes responsive layout failures automatically. Finally, we are investigating ways to package VERVE and affiliated tools like ReDeCheck as extensions for popular browsers like Chrome and Firefox.
If you develop or maintain responsive web pages, I encourage you to read (Althomali, Kapfhammer, and McMinn 2021)