What is experimental evaluation in HCI?

It is important to evaluate a user interface's design before it is finalized. Evaluation techniques can be broadly divided into two categories: those that rely on expert analysis and those that involve user participation. Experimental evaluation is a practical (based on observation rather than theory) evaluation technique that involves users. We'll discuss it in detail in this answer.

Experimental evaluation

Experiments are a valuable tool for evaluating an interface design. Evaluators run experiments in a controlled environment, changing only selected attributes and analyzing the effect of those changes on the users.

Factors in experimental evaluation

An experiment, in its basic form, involves the following factors:

  • Participants

  • Variables

  • Hypotheses

  • Experimental design

  • Statistical measures

These factors must be chosen carefully to draw reliable conclusions. Each is explained below.

Participants

The participants in an experiment are the potential and actual users of the interface. The participants chosen should be representative of the target users: they should share the users' characteristics, including age, education, and level of experience. The sample should also be large enough to produce statistically reliable results.

Variables

Variables are the attributes that are manipulated or measured in an experiment. They are of two types: independent and dependent. Independent variables are those whose values are changed to test different conditions. Dependent variables are those affected by changes in the independent variable(s).

For example, suppose an experiment tests whether the number of errors increases as more content is added to a screen. Here, the independent variable is the amount of content on the screen, and the dependent variable is the number of errors made by users.

Hypotheses

A hypothesis is a predictive statement expressed in terms of the independent and dependent variables. The experimenter tests whether the collected data support this hypothesis. For example, the experiment mentioned above tests the hypothesis that the number of errors increases as more content is added to a screen.

Experimental design

The design of an experiment can be either between-subjects (randomized) or within-subjects (repeated measures). In a between-subjects design, each participant is assigned to only one condition of the independent variable. In a within-subjects design, by contrast, every participant performs under every condition.
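The difference between the two designs can be sketched in a few lines of Python. This is only an illustration: the participant IDs, the condition names, and both helper functions are hypothetical, and shuffling the order of conditions in the within-subjects case is a common precaution against order effects that we add here as an assumption.

```python
import random


def assign_between_subjects(participants, conditions, seed=None):
    """Randomly split participants so each one sees exactly one condition."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    # Deal participants round-robin so group sizes differ by at most one.
    for index, participant in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(participant)
    return groups


def assign_within_subjects(participants, conditions, seed=None):
    """Give every participant all conditions, each in a shuffled order."""
    rng = random.Random(seed)
    orders = {}
    for participant in participants:
        order = list(conditions)
        rng.shuffle(order)  # assumption: randomize order to limit order effects
        orders[participant] = order
    return orders


participants = [f"P{i}" for i in range(1, 9)]  # hypothetical participant IDs
conditions = ["labels", "no_labels"]           # hypothetical condition names

between = assign_between_subjects(participants, conditions, seed=1)
within = assign_within_subjects(participants, conditions, seed=1)
print(between)
print(within)
```

In the between-subjects result, each participant appears in exactly one group. In the within-subjects result, each participant is mapped to a list containing every condition.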

Statistical measures

The data collected from the experiment are recorded and summarized using statistical measures. The hypothesis is then tested against these statistics, and conclusions are drawn from the results.
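As a rough sketch of what such measures can look like, the snippet below computes the mean of each condition and Welch's t statistic for two independent groups. The choice of Welch's t is ours, not something the experiment prescribes, and the error counts are invented purely for illustration. Turning the statistic into a p-value requires the t distribution, which a statistics library would normally provide.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for two independent samples."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical error counts per participant, invented for illustration only.
condition_a = [2, 3, 1, 4, 2, 3]
condition_b = [5, 4, 6, 5, 7, 4]

print("mean A:", statistics.mean(condition_a))
print("mean B:", statistics.mean(condition_b))
print("Welch t:", round(welch_t(condition_a, condition_b), 2))
```

A t statistic far from zero suggests that the difference between the group means is large relative to the variability within the groups.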

Note: Learn more about hypothesis testing in this answer.

Example

Suppose we want to decide whether to use icons with or without labels in our interface design. To test this experimentally, we first need a hypothesis. We can expect fewer errors when labels are used because labels add clarity. We can thus form the hypothesis as follows:

Hypothesis: There will be fewer errors if icons with labels are used.

From this hypothesis, we can identify the independent variable as the use of icons with or without labels, and the dependent variable as the number of mistakes users make while selecting an icon.

To conduct a controlled experiment, we must ensure that the two interfaces are identical in every way except for the icon labels, which differ between the groups.

The independent variable will be changed in the two interfaces

Next, we recruit participants who match the characteristics of the interface's potential users and divide them into two groups for a between-subjects experimental design. Each group uses one version of the interface, and the number of mistakes made by each user is recorded. The data are then analyzed using appropriate statistical measures. If the average number of errors for icons with labels is significantly lower than for icons without labels, we conclude that the data support the hypothesis.
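One way to carry out the final comparison is a one-sided permutation test, sketched below. The error counts are hypothetical, the function name is ours, and the 0.05 threshold is a conventional significance level that we assume here. The test repeatedly reshuffles the group assignment and asks how often chance alone produces a gap at least as large as the one observed.

```python
import random


def permutation_p_value(labels_errors, no_labels_errors, rounds=10_000, seed=0):
    """One-sided permutation test: is the mean error count lower with labels?"""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = mean(no_labels_errors) - mean(labels_errors)
    pooled = list(labels_errors) + list(no_labels_errors)
    n_labels = len(labels_errors)
    at_least_as_extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)  # reassign participants to groups at random
        shuffled_diff = mean(pooled[n_labels:]) - mean(pooled[:n_labels])
        if shuffled_diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return (at_least_as_extreme + 1) / (rounds + 1)


# Hypothetical mistakes per participant, invented for illustration only.
with_labels = [1, 0, 2, 1, 1, 0, 2, 1]
without_labels = [3, 2, 4, 3, 2, 5, 3, 4]

p_value = permutation_p_value(with_labels, without_labels)
alpha = 0.05  # assumed significance level
print("p-value:", p_value)
print("data support the hypothesis:", p_value < alpha)
```

A small p-value means that a gap this large would rarely arise if labels made no difference, which is the sense in which the difference is called statistically significant.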


Copyright ©2025 Educative, Inc. All rights reserved