A video of a participant going through our interactive story prototype.

Voice Interaction & Moe’s Southwest Grill: Burrito Quest

Designing an interactive voice Alexa Skill that fits into the Moe’s brand.


The Problem

Focus Brands (the parent company of Moe’s Southwest Grill) had tried to integrate novel technologies such as augmented reality into the Moe’s brand in the past. The AR project launched, but unfortunately did not perform as well as they had hoped. This time, Moe’s was interested in voice technology and wanted us to follow a user-centered design process to explore how, and in what form, voice interaction could fit into the Moe’s brand and speak to their customers.

“Voice interaction is a novel and increasingly popular technology, and Moe’s Southwest Grill wants to determine whether or not the technology will be a good fit for its brand.” 

— Focus Brands

For this project, my team worked with Moe’s Southwest Grill, a national fast-casual restaurant chain, to determine how Moe’s could use voice interaction in their brand. This project had no design direction—our prompt was simply, “We are interested in voice interaction, but we don’t really know what that looks like.”

Using research, how can we determine how a brand like Moe’s can integrate novel technologies like voice interaction into their brand? What does that design even look like?

Project Duration
August 2018 - December 2018

My Role
User researcher, designer

Team
Kelsie Belan, Rachel Feinberg, Suyash Thakare, and me

Findings & Product

Through extensive user research, we found that customers did not feel voice ordering was mature enough to use, and that the technology is still too young for a company with an ordering structure as complex as Moe’s. Instead, Moe’s can put voice interaction to better use in its marketing.

We created a working prototype of a choose-your-own-adventure, voice-interactive Alexa Skill built around the Moe’s brand identity. This solution is a better fit for Moe’s than a voice ordering system.


What I Did

With the team, I helped develop and analyze surveys, distill our findings into Jobs to be Done, create concepts, and develop research methods to get feedback on those concepts. I was also the primary designer and builder of our prototype.

From there, I helped develop prototype evaluation materials and led session demos for feedback as well as heuristic evaluations.

What I Learned

Research, like design, is iterative. With exploratory projects like this, sometimes we ask the wrong questions, and that’s OK: what matters is how you pick up from there, move forward, and pivot when necessary.

Designing a voice user interface opens up a whole new world of questions. Lean on existing research and methodologies to build a knowledge base; there is no need to reinvent the wheel.


Process – Initial Research

Our team was entirely naive about voice user interfaces: we took on this project because it was brand new to us, and we felt that naivety could help us hone our research skills. We started by thinking about how the Moe’s brand fits into voice interaction: people go to Moe’s to order food, so the logical next step seemed to be looking into voice ordering.

To do that, we sent out a survey using the Moe’s mailing list. We wanted to learn as much as we could about Moe’s, ordering food, and voice interaction, but covering all three would have made the survey too long. In the end, we sent half the list a survey about their experiences with Moe’s and how they order food, and the other half a survey about their experiences with Moe’s and their use of voice interaction.

While people were responding to our surveys, we did some good ol’ background research and competitive analysis and really dove into voice interaction. We looked at different voice ordering systems and broke them down using task analyses, which helped us understand the steps an interactive voice technology requires.

In the end, we got back 25 surveys (ask me more about our survey design—it was a lengthy process!). From this, we distilled our findings into some primary implications:

Survey Findings (25 respondents)

  • Moe’s customers used voice interaction when it was efficient and quick, and generally preferred to use it in their homes

  • They liked going to Moe’s because they could customize their experience and see all the choices

  • They loved saving money

  • No one had ordered food through a VUI, nor did they feel they wanted to!

We used Jobs to be Done to distill this further. Personas weren’t a good fit: we were looking more at intentions and tasks, and we didn’t have the right information to build personas. With our survey data, we were able to understand why people go to Moe’s and what they want when they order food:

“When I am hungry, I want to be able to get my food quickly, but still be able to customize what goes in my food, so I can eat exactly what I want without waiting too long.”

But… we realized that we didn’t have the data that we were really looking for: when would people hire voice interaction? We knew when people didn’t hire voice: ordering food. But we were missing the crucial data about why people did use voice interaction.

We hit a roadblock. What do we do now?

We were approaching the problem wrong. We needed to change the way we were thinking about our research and design process.


We were approaching it by looking at how Moe’s fits into voice interaction, which meant that we were only asking our questions oriented towards food (even our voice interaction questions).

What we needed to do was flip the script: we needed to look at how voice interaction fit into Moe’s. We needed to understand what was possible with VUIs and figure out if any of those fit into Moe’s. This is where our background research and competitive analysis that we had done came in handy. We were able to re-orient ourselves and start asking the right questions: how do people currently use voice? What do they use voice to accomplish right now? Could interactive storytelling be a part of the Moe’s brand? What about games? What about ambient sounds (new Alexa Skill idea: it’s the sounds of a burrito being made in ASMR-style looped for 10 hours to help you sleep)?


Process – Re-orienting and Designing

With our new approach, we sent out another survey, chosen again for its speed of deployment and the number of responses it could gather. This time, we focused explicitly on people’s use of voice interaction, whatever the task. We wanted to find out exactly when and why people used it.

Survey Findings — Voice Interaction Focused (90 Respondents)

  • People used voice interaction when it was faster than the “traditional method”

  • They also used it for entertainment

  • They really did not think voice ordering was a good idea

  • They used it when they needed to do something hands-free

  • They preferred to use it in their homes, unless it was an extremely quick interaction

From there, we were able to distill it into empathy maps:

Empathy map of Moe’s customers and their experiences

Empathy map of specifically those who used voice interaction

Using this data, we were able to start looking at the issue differently. Together, my team and I created three concepts that fit within Moe’s brand identity and business goals.


Moe’s Voice Trivia

A voice trivia game Alexa Skill based on Moe’s branding as Musicians, Outlaws, and Entertainers.

Moe’s Interactive Story

An interactive choose-your-own-adventure story Alexa Skill that the user controls with their voice, developed with the Moe’s brand as a theme.

Multi-Modal Voice Ordering

An ordering system that has visual guides, but can also be entirely done with voice.

LO FI SKETCH: A sketch of the initial multi-modal concept used to get feedback

LO FI SKETCH: Because the trivia concept was voice-only, we shared the initial concept through a script that highlighted its features. We prototyped voice-only interactions in lo-fi by writing scripts of the interactions.

One of our concept review sessions

Concept Feedback

With our concepts developed, we held four concept feedback sessions with potential users. Overwhelmingly, users did not find the multi-modal ordering system useful. They ranked the interactive story highest, so we moved forward with that concept.


We then fleshed out our interactive story, now called Burrito Quest, and worked together to figure out how to build the prototype.

To save time, our prototype gives the illusion of choice in the storyline. The user still has to make decisions that affect the story, but their choices generally lead to the same conclusion. This allowed us to test the voice interactions without building out an entire storyline.
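As a rough illustration of this converging structure (our actual prototype was assembled from slides and audio clips, not code), the story can be thought of as a small graph in which different choices funnel into one shared ending. The scene names, choice words, and the helper functions `play` and `endings` below are all hypothetical and not taken from the Burrito Quest script.

```python
# Hypothetical story graph: scene names and choices are illustrative only,
# not taken from the actual Burrito Quest script.
STORY = {
    "intro": {"choices": {"left": "cantina", "right": "desert"}},
    "cantina": {"choices": {"talk": "showdown", "sneak": "showdown"}},
    "desert": {"choices": {"ride": "showdown", "walk": "showdown"}},
    "showdown": {"choices": {}},  # the single shared ending
}


def play(story, start, spoken_choices):
    """Follow a sequence of spoken choices and return the scenes visited."""
    scene = start
    path = [scene]
    for choice in spoken_choices:
        options = story[scene]["choices"]
        if choice not in options:
            raise ValueError(f"{choice!r} is not an option in scene {scene!r}")
        scene = options[choice]
        path.append(scene)
    return path


def endings(story, start):
    """Return every reachable scene that has no outgoing choices."""
    seen, stack, found = set(), [start], set()
    while stack:
        scene = stack.pop()
        if scene in seen:
            continue
        seen.add(scene)
        options = story[scene]["choices"]
        if not options:
            found.add(scene)
        stack.extend(options.values())
    return found
```

In this sketch, two listeners who make entirely different choices still hear different middle scenes, but `endings(STORY, "intro")` contains only one scene, which is what keeps the amount of recorded audio small.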

As the primary designer of the prototype, I decided that Keynote was the best way to create it: we could embed audio files, and the prototype handler could move to different slides as needed.

I then recorded everyone’s voice acting lines, downloaded free sound clips and music, and mixed everything together to create a believable, high-quality story for our prototype.


This is the flow of the prototype I built for the evaluation sessions. The evaluator speaks into a Bluetooth microphone device, and the prototype handler (often me) plays the appropriate sound.

We held two rounds of evaluation sessions, and iterated on the prototype and the evaluation materials between them.

Evaluation — Usability Testing (4 participants)

For the first evaluation sessions, we wanted to understand whether users knew what to do, what general usability issues existed, and whether they felt the prototype was something they would use.

We got very positive feedback on the storyline, but also came out of the sessions with more questions: how do we make the integration with Moe’s Rewards points clearer? Is the length of the story a usability issue with voice interaction, or is it a matter of personal preference? What about some sort of visual component?

People enjoyed the story a lot, and there was sometimes lots of laughter. This was a purposeful design choice for us — it had to be irreverent and fun to fit in with the Moe’s brand!

Evaluation — Follow-up Usability Testing (5 participants)

For our next set of evaluation sessions, we iterated on the prototype by changing the narrative structure and introducing a visual component. We used these sessions primarily to answer the questions raised by the previous round.

For these sessions, I illustrated an accompanying visual as a direct response to feedback in our previous evaluation sessions.

Image: the Burrito Quest map illustration that accompanied the story

Evaluation — Heuristic Evaluation (3 participants)

We also recruited people experienced in designing voice interaction technologies to evaluate our prototype against a set of heuristics derived from Amazon’s Alexa voice design guidelines.

This was an interesting form of evaluation for us, because we designed a system that was meant for entertainment. The evaluators found that the system followed some heuristics very well, but that other heuristics (e.g. “Have the system be brief when speaking”) were violated often. However, this led to interesting discussions: those violations may not necessarily be a bad thing, because they could actually enhance the storytelling.

From these evaluation sessions, we were able to answer the questions raised in the previous sessions. The length of the narrative was a matter of personal preference, not a voice usability issue. A visual component was extremely helpful: it built the brand identity and got users to engage more with the story. Rewards points were still confusing. Overall, our participants were extremely receptive to our prototype and our design.


Towards the end of the semester, we compiled our results and process, and a teammate and I created a slide deck and presented it to Moe’s Head of Marketing. Burrito Quest was extremely well-received by Moe’s, and they were very excited to see that we took a different direction with the voice-interaction question.

In the end, this was one of my favorite projects to work on. It was challenging, but we were able to design something really novel while staying within the parameters of the business’ needs as well as the users’ preferences.