In the beginning, the question was whether people were even capable of using Web sites. Today the answer is "yes," at least most of the time. When we told people to go to a specific site in the user testing for this book, they completed their tasks successfully 66 percent of the time. Of course, they also failed 34 percent of the time, but on average people did succeed.
Why do people use the Web if they fail a third of the time? Because in reality, they don't fail that often. The failures occur when people use new sites, but most people spend a lot of their time on sites that have proven useful in the past, so their success across a day of Web use is actually higher. Because users choose sites to spend time on based on their prior experience with them, those with high usability have a better chance of being selected. Furthermore, success breeds success: Users get better at using sites that they visit habitually. For example, if you have already bought nine books on Amazon.com, it'll be easier for you to buy the tenth than it was to buy the first.
It may be little comfort to learn that users' overall experience is better than our statistics indicate, though, because a site's only hope of attracting new customers depends on how easy it is to use during that all-important initial visit. There are more than a billion users on the Internet, so any site that has fewer than ten million customers (in other words, almost any site) has not tapped into 99 percent of the potential audience.
The 66 percent success rate we measured in our study is actually a great advance over the miserable usability that characterized the Web in the 1990s. At that time usability studies regularly measured success rates at around 40 percent, meaning that more people failed than succeeded at using the Web.
The Measure of Success
We define the success rate as the percentage of progress users made toward completing their tasks. This is admittedly a coarse measure: It says nothing about why users fail or how well they perform the tasks they complete. Nonetheless, we like success rates because they are easy to collect and yield a very telling statistic. After all, user success is the bottom line of usability.
Success rates are easy to measure with one major exception: How do we account for cases of partial success? If users can accomplish part of a task, but fail other parts, how should we score them?
Let's say, for example, that we ask users to order 12 yellow roses to be delivered to their friends on their birthdays. If a test user correctly makes the required arrangements, we can certainly score the task as a success. If the user fails to place any order, we can just as easily score the task as a failure.
But there are other possibilities as well. A user might order 12 yellow tulips or 24 yellow roses, fail to specify a shipping address or give the correct address but the wrong date, or forget to ask that a gift message be enclosed with the shipment. Each of these cases constitutes some degree of failure, so we could score it as such. However, we usually grant partial credit for a partially successful task. To us, it seems unreasonable to give a zero to both users who did nothing and those who successfully completed much of the task. How to score partial success depends on the magnitude of user error.
There is no firm rule for assigning credit for partial success. Partial scores are only estimates, but they still provide a more realistic impression of design quality than an absolute approach to success and failure.
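As a minimal sketch of the difference between partial credit and an all-or-nothing approach, the following Python computes a success rate both ways. The function name and the credit values (for instance, 0.5 for an order with the wrong delivery date) are hypothetical illustrations, not scores from our study; as noted above, there is no firm rule for assigning them.

```python
def success_rate(scores):
    """Return the mean task score as a percentage.

    Each score is a number from 0.0 (complete failure) to 1.0
    (complete success); values in between are partial credit.
    """
    if not scores:
        raise ValueError("need at least one task score")
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
    return 100.0 * sum(scores) / len(scores)


# Hypothetical scores for four attempts at the roses task:
# a correct order, no order at all, right flowers with the wrong
# delivery date, and right order with no gift message.
attempts = [1.0, 0.0, 0.5, 0.5]

# Partial credit: every attempt counts for what it achieved.
with_partial_credit = success_rate(attempts)

# All-or-nothing: anything short of full success scores zero.
all_or_nothing = success_rate([1.0 if s == 1.0 else 0.0 for s in attempts])

print(with_partial_credit)  # 50.0
print(all_or_nothing)       # 25.0
```

With these made-up numbers, the all-or-nothing score treats the two nearly successful users exactly like the user who ordered nothing, which is the outcome that partial credit is meant to avoid.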
So we have come a long way in just a decade. When will we see success rates of 100 percent? Probably never, because there will always be some sucky sites that almost nobody can use. But if current trends continue and sites invest more in usability, we should approximate 100 percent around 2015. Does this mean that the Web will be perfect by then? Certainly not. Success rates only measure whether it's possible for people to use Web sites, not whether it's pleasant or efficient to do so. Furthermore, because the Web is the ultimate competitive environment, once people can use almost all Web sites, they will still tend to use the ones that serve them best.
Web-Wide Success Rates
People succeeded 66 percent of the time when we took them to a homepage and gave them tasks that were possible to do on that site. But when we gave them a blank browser screen and told them to go anywhere they wanted to complete a task, the average success rate dropped to 60 percent. This makes sense because users first have to identify a site that will solve the problem and then use that site to accomplish the task.
If you are collecting usability measures for your own Web site, you should measure your numbers against the success rate we recorded for site-specific tasks, assuming that you too start your test participants on your homepage. This is the most common way to run usability studies because it maximizes the time users spend on the site that you are in charge of redesigning. If your users can perform 70 percent of reasonable and representative tasks on your site, you have above-average usability. Conversely, if their success rate is 50 percent, you have abominable usability and you will need to improve by about a third to bring your usability rates up to the average of 66 percent.
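The "about a third" figure is a relative improvement, not a difference in percentage points. A small sketch of the arithmetic, assuming the 66 percent average as the benchmark (the function name is hypothetical):

```python
AVERAGE_SUCCESS_RATE = 66.0  # site-specific average reported in this chapter


def relative_improvement_needed(current_rate, target_rate=AVERAGE_SUCCESS_RATE):
    """Percentage by which current_rate must grow to reach target_rate.

    A negative result means the site already exceeds the target.
    """
    if current_rate <= 0:
        raise ValueError("current_rate must be positive")
    return (target_rate - current_rate) * 100.0 / current_rate


# A site at 50 percent must improve by 32 percent of its current
# rate (16 percentage points) to reach the 66 percent average.
print(relative_improvement_needed(50.0))  # 32.0
```

A site at 70 percent yields a negative value, which is another way of saying that it is already above average.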
The 60 percent success rate we recorded for the Web-wide tasks is more representative of the overall Web user experience, when users are trying to do something new and they don't already know what site to go to. The lower success rate for Web-wide tasks is a measure of the difficulty of using the Web as a whole and the features that the Web provides to help users identify Web sites (mainly through search engines). So there's still plenty of room for improvement on the Web.
Success by Experience Level
We divided our test users into two groups, based on their amount of Internet experience. All had at least a year's experience using the Web, but there was still a broad range of expertise among them. For the purposes of this analysis, we divided them into "low-experience users" and "high-experience users," according to a variety of criteria:
How many years they had been online
How many hours per week they used the Web, not counting time spent in email
How many "advanced" behaviors they exhibited, such as Web chatting, changing the labels on bookmarks, upgrading their browser, and designing their own Web pages
Whether they fixed problems with their computer equipment themselves
How much they followed current trends in technology: for example, if they subscribed to computer magazines or were considered by friends to be a source for computer advice
In general, people were considered "low experience" if they had been online for no more than three years, used the Web for less than ten hours per week, exhibited fewer than a third of the advanced behaviors, asked somebody else to fix their computer problems, and weren't consulted for advice on technology. Conversely, people were scored as having "high experience" if they had been online for at least four years, used the Web for more than ten hours per week, exhibited more than a third of the advanced behaviors, fixed their own computer problems, and were a source of tech advice for others.
Of course, some people were advanced on some of the rating scales and less advanced on others. In those cases, their final designation as low or high experience depended on their average score.
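One way to picture this averaging is the sketch below. It assumes that each of the five criteria is scored 0 (low) or 1 (high) and that an average above 0.5 means "high experience"; the binary scoring, the threshold, and all the names are illustrative assumptions, not the exact procedure used in the study.

```python
from statistics import mean

# The five rating scales listed above (hypothetical identifiers).
CRITERIA = (
    "years_online",
    "hours_per_week",
    "advanced_behaviors",
    "fixes_own_problems",
    "tech_advice_source",
)


def classify_experience(ratings):
    """Classify a participant as 'low' or 'high' experience.

    ratings maps each criterion to 0 (low) or 1 (high). The cutoff
    of 0.5 on the average is an assumption made for this sketch.
    """
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {missing}")
    average = mean(ratings[c] for c in CRITERIA)
    return "high" if average > 0.5 else "low"


# A mixed participant: online for years and a heavy user who shows
# many advanced behaviors, but who neither fixes their own computer
# nor gives tech advice to others.
participant = {
    "years_online": 1,
    "hours_per_week": 1,
    "advanced_behaviors": 1,
    "fixes_own_problems": 0,
    "tech_advice_source": 0,
}
print(classify_experience(participant))  # high
```

With five binary scales the average can never land exactly on 0.5, so this sketch does not need a tie-breaking rule.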
As this table shows, the gap between the low- and high-experience users was 13 percentage points for the site-specific tasks and 15 percentage points for Web-wide tasks. In other words, experience was a stronger advantage when users had to navigate the entire Web instead of being told what site to use. This difference indicates that freedom of movement is more of an advantage for skilled users and more of an impediment for less skilled users. This again vindicates the "walled garden" approach (a closed environment that restricts user access to outside information) that AOL used in the early days when it tried to simplify the online experience for novice users.