Voice Interaction & Moe’s Southwest Grill: Burrito Quest
User research to determine how to design a voice user interface that fits into the Moe’s brand.
For this project, my team worked with Moe’s Southwest Grill, a national fast-casual restaurant chain, to determine how Moe’s could use voice interaction in their brand. This project had no design direction—our prompt was simply, “We are interested in voice interaction, but we don’t really know what that looks like.”
Through research, how can we determine how a brand like Moe’s might integrate a novel technology like voice interaction? What does that design even look like?
August 2018 - December 2018 (in progress)
User researcher, designer
Kelsie Belan, Rachel Feinberg, Suyash Thakare, myself
What I Did
With the team, I helped develop and analyze surveys, distill our findings into Jobs to be Done, create concepts, and develop research methods to get feedback on those concepts. I was also the primary designer of our prototype.
From there, I helped develop prototype evaluation materials, led demo sessions for feedback, and conducted heuristic evaluations.
What I Learned
Research, like design, is iterative. With exploratory projects like this, sometimes we ask the wrong questions, and that’s OK; what matters is how you pick up from there, move forward, and pivot when necessary.
Designing a voice user interface opens up a whole new world of questions. Lean into existing research and methodologies to build on a knowledge base; there is no need to reinvent the wheel.
Process – Initial Research
Our team was entirely naive about voice user interfaces: we took on this project because it was brand new to us, and we felt that naivety could help us hone our research skills. We started by thinking about how the Moe’s brand fits into voice interaction: people go to Moe’s to order food, so a logical first step seemed to be voice ordering.

To explore that, we sent out a survey using the Moe’s mailing list. We wanted to learn as much as we could about Moe’s, ordering food, and voice interaction, but covering all three in one survey would have made it too long. So we split the list: half received a survey about their experiences with Moe’s plus how they order food, and the other half received a survey about their experiences with Moe’s plus their use of voice interaction. While people were responding to our surveys, we did some good ol’ desk research and competitive analysis, and really dove into voice interactions.
In the end, we got back 25 surveys (ask me more about our survey design—it was a lengthy process!). From this, we distilled our findings into some primary implications:
Survey Findings (25 respondents)
Moe’s customers used voice interaction when it was quick and efficient, and they generally preferred to use it in their homes
They liked going to Moe’s because they could customize their order and see all the choices
They loved saving money
No one had ordered food through a VUI, nor did they feel they wanted to!
We used Jobs to be Done to distill this further. Personas weren’t really a good fit here: we cared more about intentions and tasks, so Jobs to be Done felt like the perfect framework. With our survey data, we were able to understand why people go to Moe’s and what they want when they order food:
“When I am hungry, I want to be able to get my food quickly, but still be able to customize what goes in my food, so I can eat exactly what I want without waiting too long.”
But… we realized that we didn’t have the data that we were really looking for: when would people hire voice interaction? We knew when people didn’t hire voice: ordering food. But we were missing the crucial data about why people did use voice interaction. We hit a roadblock. What do we do now?
We were approaching the problem wrong. We needed to change the way we were thinking about our research and design process.
We had been looking at how Moe’s fits into voice interaction, which meant all of our questions were oriented toward food (even our voice interaction questions).
What we needed to do was flip the script and look at how voice interaction fits into Moe’s. We needed to understand what was possible with VUIs and figure out whether any of those possibilities fit the Moe’s brand. This is where the desk research and competitive analysis we had already done came in handy. We were able to re-orient ourselves and start asking the right questions: how do people currently use voice? What do they use voice to accomplish right now? Could interactive storytelling be a part of the Moe’s brand? What about games? What about ambient sounds (new Alexa Skill idea: the sounds of a burrito being made, ASMR-style, looped for 10 hours to help you sleep)?
Process – Re-orienting and Designing
With our new approach, we sent out another survey (again chosen for its speed of deployment and volume of responses), but this time we focused explicitly on people’s use of voice interaction, regardless of context. We were interested in finding exactly when and why people used it.
Survey Findings — Voice Interaction Focused (90 respondents)
People used voice interaction when it was faster than the “traditional” method
They also used it for entertainment
They really did not think voice ordering was a good idea
They used it when they needed to do something hands-free
They preferred to use it in their homes, if it wasn’t an extremely quick interaction
From there, we distilled these findings into empathy maps:
Using this data, we were able to start looking at the issue differently and create three concepts that fit within Moe’s brand identity and business goals:
a multi-modal voice ordering interface with both visual and voice interactions,
an interactive voice trivia game, and
a voice-interactive storytelling experience.
More information to come, but here’s a sneak peek: after four feedback sessions with different participants, we went with the voice-interactive story concept and called it Burrito Quest. We created a prototype with voice recordings, music, mini-games, sound effects, cows mooing, and horses galloping. We tested the prototype in feedback sessions that resembled how users would interact with it in context, and we laughed a lot and had a whole buncha fun doing it. Next, we’ll run usability evaluations with users and expert evaluations with voice heuristics, and iterate on the prototype.
This is a flow of the prototype I built for the testing sessions. It’s built in Apple Keynote, with voice recordings done by us. I added in sound clips, sound effects, and music, and built the interactive mini-game. The tester speaks to a Bluetooth microphone device, and the prototype handler (often myself) plays the appropriate sound.
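For readers curious about the Wizard-of-Oz mechanics, here’s a minimal sketch of the same soundboard idea as a script. To be clear, our actual prototype lived entirely in Keynote; the cue names, file paths, and use of macOS’s afplay below are hypothetical illustrations, not our real build.

    # Hypothetical Wizard-of-Oz soundboard (illustration only; our real
    # prototype was Keynote slides with embedded audio). The handler hears
    # the tester over the Bluetooth mic, then types a short cue to play
    # the matching pre-recorded clip. All file names are made up.
    import subprocess

    CUES = {
        "intro": "clips/burrito_quest_intro.wav",
        "cow": "clips/cow_moo.wav",
        "horse": "clips/horse_gallop.wav",
        "music": "clips/adventure_theme.wav",
        "repeat": "clips/did_not_catch_that.wav",
    }

    def play(path):
        # afplay ships with macOS; any command-line audio player works.
        subprocess.run(["afplay", path], check=False)

    if __name__ == "__main__":
        print("Cues:", ", ".join(CUES))
        while True:
            cue = input("cue> ").strip().lower()
            if cue in ("q", "quit"):
                break
            if cue in CUES:
                play(CUES[cue])
            else:
                print("No clip for " + repr(cue))

Driving the prototype by hand like this is the whole point of the Wizard-of-Oz approach: it let us fake a working VUI and get feedback long before committing to any actual speech technology.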
I also illustrated an accompanying map of the voice-interactive story. This came out of a research finding that people wanted some sort of visualization to go along with the interactive story. It also posed a question: what information should live in the voice system, and what should stay out of it? (For example, people were confused about the Moe’s points in the story; should those be explained in accompanying materials, or in the system itself?)