Practical Accessibility in Web Development

Automated Accessibility Testing Tools

Automated tests are programmed evaluations of individual components and complete application workflows that run without manual intervention, ensuring consistent and efficient testing. They can be integrated into existing testing frameworks and version control systems, streamlining the workflow and promoting a more consistent and traceable approach to accessibility.

General testing libraries can be used for accessibility tests, e.g., to ensure that a button responds to both click and Enter keystroke events and that a user can reach it via keyboard (WCAG Guideline 2.1, Keyboard Accessible). However, rather than reinventing the wheel and brainstorming the necessary test cases, developers can take advantage of specialized accessibility libraries that address common accessibility requirements. The following examples showcase various libraries, testing engines, and tools, and illustrate how they support the testing process.
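The behavior such a test asserts can be sketched in plain JavaScript. All names below are illustrative, not taken from a specific library; native `<button>` elements handle the Enter key automatically, but custom controls must wire it up themselves:

```javascript
// Minimal sketch of a keyboard-accessible control: its action must fire
// for both mouse clicks and Enter key presses (names are illustrative).
function createSubmitControl(onActivate) {
  return {
    handleClick() {
      onActivate();
    },
    handleKeydown(event) {
      // Custom controls must mirror what a native <button> does for free.
      if (event.key === "Enter") {
        onActivate();
      }
    },
  };
}

// Simulate both interaction paths.
let activations = 0;
const control = createSubmitControl(() => activations++);
control.handleClick();                    // mouse user
control.handleKeydown({ key: "Enter" });  // keyboard user
console.log(activations); // 2
```

A test would then assert that the activation count is identical no matter which input path was used.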

Testing Library offers three functions for accessibility testing: getRoles(), logRoles(), and isInaccessible(). It advocates for tests that closely resemble real-world usage of web pages, focusing on queries that cater to both visual/mouse users and those using assistive technology, such as getByRole(), getByLabelText(), and getByPlaceholderText(). Additionally, developers can use a keyboard testing extension to simulate the behavior of keyboard-only users.
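The three helpers could be exercised roughly as follows; this is a sketch that assumes a Jest setup with a jsdom test environment and @testing-library/dom installed, and the markup is a made-up example:

```javascript
// Sketch: the three accessibility helpers of @testing-library/dom,
// assuming a Jest + jsdom environment (so `document` exists).
const { getRoles, logRoles, isInaccessible } = require("@testing-library/dom");

test("form controls expose the expected roles", () => {
  document.body.innerHTML = `
    <button>Save</button>
    <div aria-hidden="true">decorative</div>`;

  // getRoles() maps each implicit/explicit role to its elements.
  const roles = getRoles(document.body);
  expect(roles.button).toHaveLength(1);

  // logRoles() prints the same role mapping to the console for debugging.
  logRoles(document.body);

  // isInaccessible() is true for elements hidden from assistive technology.
  expect(isInaccessible(document.querySelector("[aria-hidden]"))).toBe(true);
});
```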

Accessibility rulesets are at the heart of several testing tools. These rulesets, based on standards such as WCAG and Accessible Rich Internet Applications (ARIA), constitute the underlying test cases. The most well-known is axe-core by Deque Systems, a widely used accessibility engine for web UI testing. Developers can integrate it with various test runners and frameworks like Jest (jest-axe), Ember (ember-a11y-testing), React (@axe-core/react), and Vue (vue-axe).
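As an illustration, a jest-axe unit test might look like the following sketch (it assumes Jest with a jsdom environment and the jest-axe package; the markup is a made-up example):

```javascript
// Sketch of a jest-axe unit test: render markup into jsdom, run the
// axe-core engine over it, and fail the test on any rule violation.
const { axe, toHaveNoViolations } = require("jest-axe");

expect.extend(toHaveNoViolations);

test("rendered markup has no axe-core violations", async () => {
  document.body.innerHTML = `
    <main>
      <h1>Newsletter</h1>
      <label for="email">Email address</label>
      <input id="email" type="email">
    </main>`;

  const results = await axe(document.body);
  expect(results).toHaveNoViolations();
});
```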

axe-core can also be used in end-to-end tests with tools like Cypress (cypress-axe), Selenium (axe-core-selenium), and Playwright (@axe-core/playwright). The tests profit from an extensive and customizable rule collection, and can be configured to target specific WCAG versions and the conformance levels A, AA, or AAA (see info box "WCAG"). Further accessibility testing engines and rulesets include the Accessibility Checker engine by IBM, WAVE Stand-alone API and Testing Engine by WebAIM, Tenon/Access Engine by Level Access, ARC by TPGi, and the Alfa accessibility checker by the software company Siteimprove.
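Targeting specific WCAG versions and levels works through axe-core's tag system. The following end-to-end sketch assumes a Playwright project with @axe-core/playwright installed; the URL is a placeholder:

```javascript
// Sketch: restrict an axe-core audit to WCAG 2.0/2.1 level A and AA
// rules via axe-core's tag system, inside a Playwright test.
import { test, expect } from "@playwright/test";
import AxeBuilder from "@axe-core/playwright";

test("home page passes WCAG A/AA rules", async ({ page }) => {
  await page.goto("https://example.com/"); // placeholder URL

  const results = await new AxeBuilder({ page })
    .withTags(["wcag2a", "wcag2aa", "wcag21a", "wcag21aa"])
    .analyze();

  // An empty violations array means no targeted rule was broken.
  expect(results.violations).toEqual([]);
});
```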

Web Content Accessibility Guidelines (WCAG)

The Web Content Accessibility Guidelines (WCAG) 2.1 are systematically organized into three tiers of compliance to cater to a range of user groups and situations. These tiers include Level A (basic), Level AA (intermediate), and Level AAA (advanced). Each higher level indicates accessibility to wider user groups and inherently satisfies the requirements of the lower levels. For instance, if a webpage adheres to Level AA, it automatically meets the standards of Level A as well. In WCAG 2.1, there are 30 criteria to be met for Level A. Level AA requires 20 additional criteria. Level AAA is the icing on the cake with 28 more criteria.

Guidepup is a screen reader driver for VoiceOver (macOS) and NVDA (NonVisual Desktop Access, Windows), designed to help developers navigate with screen readers just as users do. It helps to assert what users truly hear when using screen readers, all in an automated manner. Further tools in this group include JAWS Inspect, Screen Reader Testing Library for NVDA, and Assistive-Webdriver (JAWS and NVDA). Developers must have some understanding of how screen readers work, though, to be able to write correct test assertions.
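A Guidepup check might look like the following sketch. The API names (`voiceOver.start()`, `next()`, `lastSpokenPhrase()`) follow the @guidepup/guidepup package as documented, but treat the whole flow as an assumption-laden illustration; it only runs on macOS with VoiceOver automation enabled:

```javascript
// Illustrative sketch using Guidepup's VoiceOver driver (macOS only).
// The assertion text and navigation steps are made-up examples.
import { voiceOver } from "@guidepup/guidepup";

async function checkHeadingAnnouncement() {
  await voiceOver.start();
  try {
    // Step through the page exactly as a screen reader user would.
    await voiceOver.next();
    await voiceOver.next();

    // Assert on what was actually spoken, not on the DOM structure.
    const phrase = await voiceOver.lastSpokenPhrase();
    if (!phrase.includes("heading")) {
      throw new Error(`Expected a heading announcement, got: ${phrase}`);
    }
  } finally {
    await voiceOver.stop();
  }
}
```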

The major limitation of the above tests and browser extensions is that developers have to trigger the audits manually. In other words, the code is fine as long as no one tests it. Linters can address this shortcoming.

A linter is a static code analysis tool that helps developers identify potential issues in their code by checking it against a set of predefined rules or best practices. Developers can integrate linters into their workflow, where they run automatically on saving changes or as part of a continuous integration (CI) pipeline.

There are several accessibility linters available, each with its own set of rules and best practices; well-known examples include eslint-plugin-jsx-a11y for React/JSX and eslint-plugin-vuejs-accessibility for Vue.
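As one concrete example, the eslint-plugin-jsx-a11y ruleset can be wired into an ESLint configuration like this (a sketch of a common React setup; adapt plugin and rule names to your stack):

```javascript
// Example .eslintrc.js enabling the jsx-a11y accessibility ruleset
// for a React project (classic ESLint config format).
module.exports = {
  extends: [
    "eslint:recommended",
    "plugin:jsx-a11y/recommended", // accessibility ruleset
  ],
  plugins: ["jsx-a11y"],
  rules: {
    // Promote selected accessibility hints to hard errors for CI.
    "jsx-a11y/alt-text": "error",
    "jsx-a11y/no-autofocus": "error",
  },
};
```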

A look at the linter rulesets sheds light on the established best practices for accessible websites. For example, several linters based on different rulesets implement a no-positive-tabindex rule. This rule requires every element that uses the tabindex attribute to set it to 0 or a negative value.

<div tabindex="0">Tabbable due to tabindex 0.</div>
<div>Not tabbable: no tabindex.</div>
<label tabindex="-1">Label not tabbable: removed from the tab order by -1 (the nested input remains tabbable).<input type="text"></label>

This is important because elements with a positive tabindex jump to the front of the tab order and become the first things users tab to on a web page. Keyboard navigation and screen reader focus order then differ from the visual (and logical) order of elements, which can confuse users.
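The core of such a rule is small. The following is a minimal sketch of a no-positive-tabindex check over raw markup; real linters work on a parsed AST rather than regular expressions, so treat this as an illustration of the rule's logic only:

```javascript
// Sketch of a no-positive-tabindex lint check: scan markup for tabindex
// values and report those greater than zero (real linters parse an AST).
function findPositiveTabindex(html) {
  const violations = [];
  const pattern = /tabindex\s*=\s*["']?(-?\d+)/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    const value = parseInt(match[1], 10);
    if (value > 0) {
      violations.push({ value, index: match.index });
    }
  }
  return violations;
}

// tabindex="3" jumps the tab-order queue and is flagged; 0 and -1 pass.
console.log(findPositiveTabindex('<div tabindex="3"></div>').length); // 1
console.log(findPositiveTabindex('<div tabindex="0"></div>').length); // 0
console.log(findPositiveTabindex('<a tabindex="-1"></a>').length);    // 0
```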

Another rule implemented in several linters concerns the usage of autofocus. Automatically focusing a form control can confuse visually impaired people using screen-reading technology and people with cognitive impairments. When autofocus is assigned, screen readers "teleport" their user to the form control without warning them beforehand.

In addition, many rules check the implementation of Accessible Rich Internet Applications (ARIA), such as aria-roles, aria-props, and aria-unsupported-elements. These rules control the usage of ARIA attributes by ensuring the correct values and value types are in place.
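The value-type checks behind such rules can be sketched as follows. The attribute table below is a tiny, simplified excerpt for illustration, not the full ARIA attribute list (for instance, some of these attributes also allow "undefined" or "mixed" in the spec):

```javascript
// Illustrative sketch of an aria-props-style check: validate that
// boolean ARIA attributes carry the value type the spec expects.
// BOOLEAN_ARIA_ATTRS is a simplified excerpt, not the full ARIA list.
const BOOLEAN_ARIA_ATTRS = new Set(["aria-hidden", "aria-modal"]);

function checkAriaValue(attr, value) {
  if (!attr.startsWith("aria-")) {
    return { ok: false, reason: "not an ARIA attribute" };
  }
  if (BOOLEAN_ARIA_ATTRS.has(attr) && value !== "true" && value !== "false") {
    return { ok: false, reason: `${attr} expects "true" or "false"` };
  }
  return { ok: true };
}

console.log(checkAriaValue("aria-hidden", "true").ok); // true
console.log(checkAriaValue("aria-hidden", "yes").ok);  // false
```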

Linters can warn developers about accessibility issues, but they cannot prevent them from ignoring these warnings and shipping the code to production. The next step to enforce accessible implementation would be to bake linting into the pre-commit hook, thereby blocking commits with identified issues. Task runners and npm scripts can perform the same function. There are several plugins for Gulp and Grunt, based on the Tenon and axe-core rulesets.
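A pre-commit hook of this kind can be as small as the following sketch; it assumes an ESLint setup with an accessibility plugin is already configured in the project:

```shell
#!/bin/sh
# Sketch of a Git pre-commit hook (save as .git/hooks/pre-commit and make
# it executable) that blocks the commit when linting reports problems.
# Assumes ESLint with an accessibility plugin is configured.
if ! npx eslint --max-warnings 0 .; then
  echo "Accessibility lint failed - commit blocked." >&2
  exit 1
fi
```

Tools such as husky and lint-staged wrap the same idea in a portable, version-controlled setup.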

While a team can decide on its own whether to use browser extensions and linters, the next step requires a first slight shift in how accessibility issues are prioritized. Can they be a reason to delay a release? If yes, what tooling can enforce that?