Determining the Right Sample Size for Experience Design Evaluation

Finding the right sample size is a tradeoff between the number of participants in the study and the ability to detect problems. The larger the sample size, the more problems get uncovered. There is, however, a diminishing return: each additional participant uncovers fewer new problems.

And not all problems affect all people uniformly. People bring varied experiences, expectations, and knowledge as they attempt to accomplish their goals. You can limit this variance and focus on your target audience by using personas.

When developing your evaluation plan, think in terms of the percentage of people a particular area of your design will affect. The fewer people a problem affects, the larger the sample size you will need to have a good chance of finding it in an evaluation. The good news is that problems affecting most people require only a small sample size to detect.

According to Jeff Sauro, the best way to make a sample size decision is to:

  1. Pick the minimum problem percentage you want to detect. For example, you want to uncover problems that affect at least 10% of people.
  2. Decide how sure you want to be (the chance of detection) of seeing these issues in an evaluation. For example, 85% likely.
  3. Use the binomial probability formula to determine the sample size needed, based on the chance of seeing the problem and how often it occurs. The formula is simple: log(1 - Chance of Detection) / log(1 - Probability of Occurrence). (A short code sketch of this calculation follows this list.)
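To make the arithmetic concrete, here is a minimal Python sketch of that calculation. The function name and signature are my own for illustration; the math is just the formula above, which comes from requiring that 1 - (1 - p)^n reach your chosen chance of detection.

```python
import math

def sample_size(chance_of_detection, problem_frequency):
    """Participants needed so the probability of seeing a problem that affects
    `problem_frequency` of users at least once reaches `chance_of_detection`.
    Solves 1 - (1 - p)^n >= chance_of_detection for n."""
    return math.log(1 - chance_of_detection) / math.log(1 - problem_frequency)
```

The function returns the raw ratio; round it to a whole number of participants when planning the study.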

For example, if you want to identify problems that affect 10% or more of your customers and you want an 85% chance of seeing them (if they exist) in a usability test, then you need to plan on observing 18 users: log(1-.85) / log(1-.10) = 18.006.

If you only want to see the more obvious problems, those that affect a third (33%) or more of users, and have the same chance of seeing them, you need to test 5 users: log(1-.85) / log(1-.33) = 4.73.

If you want to catch the less obvious problems that affect only 1% of users, you had better have a lot of time and budget, because you will need to observe about 189 users: log(1-.85) / log(1-.01) = 188.76.
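Plugging the three scenarios above into the sample_size sketch reproduces the same figures:

```python
print(sample_size(0.85, 0.10))  # ~18.0  -> plan for 18 users
print(sample_size(0.85, 0.33))  # ~4.7   -> plan for 5 users
print(sample_size(0.85, 0.01))  # ~188.8 -> plan for 189 users
```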

The tradeoff is to identify as many problems as possible while still working within your budget.

I recommend an iterative design approach: run several small tests periodically throughout the design lifecycle. After the first study with 5 users has found about 85% of the problems (those affecting roughly a third of users, consistent with the calculation above), you will want to fix these problems in a redesign.

After creating the new design, you need to test again. A second test will reveal whether the fixes worked and whether new problems were introduced. The second test with 5 users will discover most of the remaining 15% of the original usability problems that were not found in the first test. (About 2% of the original problems will still be left; they will have to wait until the third test to be identified.)
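As a rough sketch of that iterative math: assuming each problem is visible to about 31% of participants per session (an assumed rate, chosen so that a 5-user test catches roughly 85% of problems, as above), the fraction of the original problems still undetected shrinks with each round:

```python
p = 0.31          # assumed chance a given participant hits a given problem
users_per_round = 5

remaining = 1.0
for test_round in range(1, 4):
    # each round of 5 users has a (1 - p)^5 chance of missing a given problem
    remaining *= (1 - p) ** users_per_round
    print(f"after test {test_round}: {remaining:.0%} of original problems still undetected")

# after test 1: 16% still undetected (about 85% found)
# after test 2: 2% still undetected
# after test 3: 0% still undetected
```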

The second test will always lead to a new (but smaller) list of problems to fix in a redesign. And the same insight applies to this redesign: not all the fixes will work; some deeper issues will be uncovered after cleaning up the design. Thus, a third test is needed as well.

Running several small tests with your target audience throughout your design lifecycle is your best bet for developing the optimal experience design.