We carried out one-to-one contextual interviews with six current users of VUI devices in their homes, where they could demonstrate the tasks they typically performed. We also conducted six interviews and usability tests with non-users in our UX lab at the Orange Bus office in Newcastle.
Along this journey, Holly and I regularly discussed the challenges we faced when conducting research with this technology, and how our UX methods may need to be adapted. Here, I’ll share with you what we learned and what we’d recommend when conducting research with voice devices. Given the lack of first-hand accounts of UX research into this technology, we intend for this to support and guide researchers as they explore the ever-evolving world of VUI.
HOW TO: Introduce a usability task for a VUI device
Usability testing is an observational method with some interviewing elements. It’s a semi-controlled scenario where participants attempt to use a digital device, product or service to complete a particular task.
A researcher facilitates the session and identifies any usability issues. The researcher introduces a task to the participant, such as: “You want to book a return train ticket for tomorrow morning, how might you go about that?”, and then observes how they carry out that task. It’s important to ensure that the task wording is broad and non-leading, so that the researcher doesn’t inadvertently provide hints about how to complete the task.
The researcher will keep conversation to a minimum during the task, to avoid distracting or leading the participant. However, they can use prompts to elicit more information, such as: “You seemed to hesitate, can you tell me what you were thinking?” or “What did you expect to happen when you selected that?”.
VUI is hands-free and, in most cases, screen-free, relying on the user being able to speak commands and hear the device’s response. This poses unique challenges for usability testing, because the mode used to interact with the device is the same as that used to conduct the research: conversation. This means that when introducing a usability task for a voice device, the researcher has to be especially careful not to provide the participant with possible command phrases.
For instance, it could be leading to say: “How might you ask Alexa what the weather will be tomorrow in Newcastle?” as the user would likely reflect the researcher’s language and say “Alexa, what will the weather be tomorrow in Newcastle?”
Instead, introduce the tasks in a less conversational way, and add more context and detail to the task in order to disguise any potential command wording, like: “Imagine you are trying to arrange a bike ride for the weekend with some friends and want to check the weather - how might you do that?” This means that the participant has to think a little more about the goal of the task, and choose the wording they would use to achieve that goal using a voice device.
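One rough way to sanity-check task wording before a session is to measure how much of a likely command phrase the prompt already contains. The sketch below is only an illustration and was not part of our study: the `STOP_WORDS` list, the `command_overlap` function and the idea of scoring prompts this way are all assumptions, and the score is no substitute for piloting the task with a colleague.

```python
import re

# Hypothetical stop-word list: common words that carry little command meaning.
STOP_WORDS = {"a", "an", "the", "to", "you", "and", "in", "for",
              "be", "will", "what", "how", "do", "that"}


def content_words(text):
    """Lower-case the text and return its set of words, minus stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return {word for word in words if word not in STOP_WORDS}


def command_overlap(task_prompt, likely_command):
    """Share of the likely command's content words already in the prompt."""
    command = content_words(likely_command)
    if not command:
        return 0.0
    return len(command & content_words(task_prompt)) / len(command)


# The command we suspect a participant might say (taken from the example above).
likely_command = "Alexa, what will the weather be tomorrow in Newcastle?"

leading = "How might you ask Alexa what the weather will be tomorrow in Newcastle?"
contextual = ("Imagine you are trying to arrange a bike ride for the weekend "
              "with some friends and want to check the weather - "
              "how might you do that?")

print(command_overlap(leading, likely_command))     # 1.0
print(command_overlap(contextual, likely_command))  # 0.25
```

A score close to 1 suggests the prompt hands the participant the command almost word for word; a low score suggests they will have to choose their own wording.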
HOW TO: Get feedback from participants on their experience of using a VUI device
Another difference between testing VUI devices and testing screen-based devices is that it’s not possible to use the ‘think aloud’ method during the test, as this would interrupt the participant’s interaction with the device.
The think aloud method requires the participant to narrate their experience as they are attempting to complete an action online. A participant using a website might tell the researcher things like: “I’m looking for the search box now because I can’t find what I’m looking for on this page.” The researcher can prompt the participant to give more information, by saying things like: “Tell me why you did that?”, “What would you do next?” or “I noticed you sighed and pulled a face, why was that?”
In usability testing, the researcher needs to strike a fine balance between gaining feedback as a participant moves through an activity, while not distracting them and interfering with their progress. It’s also very important that the researcher doesn’t lead the participant by accidentally providing hints or making leading comments.
When testing VUI, the participant isn’t able to talk to the researcher while carrying out the task, as the device may pick up on what they say and attempt to decipher a command from it. The task would likely fail, as Alexa responds to the conversation by saying something like: “I’m sorry, I don’t know that one.”
If the participant has an extended interaction with the device (repeated command-response exchanges), the researcher can wait until the interaction is over and the device is no longer ‘listening’ before asking them about what happened. If the device stops listening before the participant has finished the task or interaction (Alexa’s light goes off), the researcher can use that as an opportunity to ask for quick feedback on the experience the user is having, asking things like: “What were you expecting to happen?” or “What would you do next?” Again, a balance always needs to be struck between obtaining feedback and letting the interaction unfold as naturally as possible.
A potential solution to this could also be to play back a video recording of the interaction immediately afterwards to remind the user what happened and gather feedback retrospectively. This may help to generate further discussion around their interaction with the device. If using this technique, the session needs to be set up so that the recording can be quickly and easily replayed.
HOW TO: Adapt to the shorter user journeys typical for VUI tasks
With current VUI devices, the tasks that users can carry out are generally quite short - such as asking Alexa to turn a light on or off or asking Siri to create a reminder. These tasks also have a narrower range of possible journeys and outcomes: success, failure, or an attempt to recover from an error.
We found that the lengthiest and most complex interactions with VUI devices during our study occurred when users tried to recover from errors, which could involve rewording a command several times to get the device to understand their request.
By contrast, when testing a website, users can carry out tasks that have many more steps and many more points of success or failure. For that reason, there is comparatively less opportunity for detailed feedback with VUI interactions. We can explore:
- If an interaction met their expectations
- Their thoughts and feelings
- If there’s anything they would change
- What features of the interaction / device they did and didn’t notice
However, because the interactions are shorter there is simply less activity to gather feedback on and this alters the kind of data you end up with and the focus of your recommendations.
Many of the recommendations that came out of this discovery research were to improve error messaging so that users were better supported in their attempt to recover from errors and achieve their original goal. Because of the limited amount and types of feedback you can gather through usability testing VUI, we would recommend supplementing this method with other methods. We found the diary study a particularly useful method for our discovery project as it gave more detail around the context of use, such as the user’s motivation to use the device, the time of day or what room they were in - as opposed to gathering data in a controlled scenario.
HOW TO: Gather data on VUI using a diary study
The interviews and usability tests revealed some very interesting insights, but were limited in their ability to reveal the everyday context in which VUI devices are used. Although this is also the case when usability testing a website or app, it felt even more of a limitation when testing VUI technology, where the aim is to seamlessly integrate into people’s everyday lives and routines. Because of this, it is even more important to understand how VUI functions in these environments and under their changing conditions, such as background sound, other people moving in and out of the room while a device is being used, or talking to a device in a public setting where strangers could hear or observe.
Conducting a short diary study with current users assisted us in gathering more of this contextual information. Over a 3-day period, we asked the 6 participants to document:
- What they used their device for
- When they used the device
- Where they used it
- If the interaction was successful or not
- Their feelings on each occasion
- Anything they would like to change
We also used this information to inform our questions and the use cases we asked them to demonstrate to us during the contextual interview.
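If diary entries are collected digitally, the fields listed above map naturally onto a simple record that can be tallied afterwards. The following is a minimal sketch rather than the tooling we used: the `DiaryEntry` class, its field names, the `success_rate_by_room` helper and the three sample entries are all made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DiaryEntry:
    """One diary record; the field names are hypothetical."""
    participant: str
    task: str                # what they used the device for
    when: str                # when they used it, e.g. "morning"
    where: str               # where they used it, e.g. "kitchen"
    successful: bool         # whether the interaction was successful
    feeling: str             # how they felt on that occasion
    change_wanted: str = ""  # anything they would like to change


def success_rate_by_room(entries):
    """Return {room: share of successful interactions} across all entries."""
    totals = Counter(entry.where for entry in entries)
    successes = Counter(entry.where for entry in entries if entry.successful)
    return {room: successes[room] / totals[room] for room in totals}


# Invented sample data, not results from our study.
entries = [
    DiaryEntry("P1", "set a timer", "evening", "kitchen", True, "pleased"),
    DiaryEntry("P1", "play the radio", "morning", "kitchen", False,
               "frustrated", "a clearer error message"),
    DiaryEntry("P2", "turn a light off", "night", "bedroom", True, "neutral"),
]

print(success_rate_by_room(entries))  # {'kitchen': 0.5, 'bedroom': 1.0}
```

The same grouping could be done by time of day or by task to pick out use cases worth asking participants to demonstrate in the contextual interview.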
Having discovered the value of the diary study in our discovery research, we felt the method could be used more extensively: over a longer period, with a larger sample of participants, and gathering additional information, to generate richer insight into the context and routines in which VUI is being used.
HOW TO: Use the participant’s language
One area we were particularly interested in exploring with this research was how participants perceived and related to voice devices. We wanted to know if participants personified the devices, in what way, and how that affected their experience of the interaction. A lot of users referred to the device as ‘she’ or ‘her’ and used human characteristics to describe it, such as ‘friendly’ or ‘sanctimonious’. This appeared to affect their interactions with the device and the feelings this elicited. For example, one participant, after repeatedly getting the error response “I’m sorry, I don’t know that one” from Alexa, looked visibly disappointed and explained that it felt like “a bit of rejection”.
In order to gather this insight, we had to be careful not to impose our own perceptions of the device on the conversation. Both Holly and I instinctively personify voice devices and relate to them in quite a personal and playful way. In order to discover what users did, however, we had to be mindful of the language we used, introducing the device in a neutral way by referring to it as ‘it’ or ‘the device’ and then mirroring the language used by the participant: if they called it ‘her’, we would call it ‘her’, and so on. However, some participants will switch between referring to the device as ‘he’/‘she’ and ‘it’, so it’s important to recognise that complexity in their perceptions of the voice device and be mindful not to lead the participant.
HOW TO: Cope with VUI being accidentally activated
This is a minor issue, and may seem obvious, but it is one to be prepared for: the device will be activated in the research session whenever you use the activation word in conversation. For example, if you ask the participant “What made you decide to start using Alexa?”, the activation word ‘Alexa’ will trigger the device to begin listening and try to respond based on what it hears. When this happened during our research, it sometimes prompted the participant to mention other occasions when their device did that (such as being triggered by the radio or TV or when someone mentioned a friend called ‘Alex’) and how they felt about it. However, it does have the potential to disrupt the session and distract the participant if it happens repeatedly. Strategies to get round this are to agree a ‘code name’ with the participant in advance, such as ‘A’, to refer to the device as ‘it’ or ‘the device/your device’, or to use the pronoun the participant uses, whether that is ‘he’ or ‘she’.
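If your discussion guide is written down in advance, a quick scan for activation words can catch risky questions before the session. This is only a sketch under assumptions: the `ACTIVATION_WORDS` list and the `find_activation_words` function are hypothetical, and the list should be adjusted to whichever devices will be in the room.

```python
import re

# Assumption: the activation words to watch for; edit to match your devices.
ACTIVATION_WORDS = ["Alexa", "Hey Siri", "OK Google"]


def find_activation_words(questions, activation_words=ACTIVATION_WORDS):
    """Return (question number, activation word) pairs for risky questions."""
    hits = []
    for number, question in enumerate(questions, start=1):
        for word in activation_words:
            # Whole-word, case-insensitive match on the activation word.
            pattern = r"\b" + re.escape(word) + r"\b"
            if re.search(pattern, question, re.IGNORECASE):
                hits.append((number, word))
    return hits


guide = [
    "What made you decide to start using Alexa?",
    "What made you decide to start using the device?",
]

print(find_activation_words(guide))  # [(1, 'Alexa')]
```

Note that this only matches the exact words in the list. It will not flag similar-sounding words such as ‘Alex’, which can still trigger a device when spoken aloud.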
As this technology evolves and its use cases increase and grow in complexity, our research and design methods also need to adapt and evolve to match it. We hope these tips and learnings help you on your own VUI research journey.
Orange Bus is leading the way in the research and design of voice interaction and is publishing an ongoing series of blog posts on the theme, so watch this space for more findings and tips!
Do you want to find out how Voice User Interface and user interaction can impact the way you connect with your customers? Contact us on 0191 241 3703, or email firstname.lastname@example.org to find out more.