Conversational AI for Consumer Voice Assistants

Native Voice

The Project

Native Voice was developing a suite of voice assistants to perform brand-specific tasks. One of those was a voice assistant to search and play National Public Radio (NPR) audio content. I was the lead conversational designer for Native Voice, responsible for conversational UX design for all of its products. I was also heavily involved in technical design, analysis, and performance testing of the voice assistants.

The Work

This page illustrates the work I did on the NPR voice assistant. The work included:

  • Identifying requirements for a National Public Radio voice assistant (VA).

  • Defining use cases for the VA.

  • Overseeing user research on delivery of audio content.

  • Learning the NPR APIs to understand what audio content was available and how best to search and retrieve it.

  • Defining the full conversational UX for the VA.

  • Overseeing the development team responsible for implementing the VA.

  • Working with test engineers to analyze and improve the performance of the VA.

  • Tools included Amazon Lex, Dialogflow, Rasa, Voiceflow, Miro, NLU/NLP machine learning, and LLM prompt engineering.

The Process

Create Product Vision

The first step in the project was to create the product vision through a “one-pager”: a one-page mockup of the press release that might be issued when the product launched.

Conduct User Research

To understand user needs for the product, we conducted a user study of existing NPR listeners to learn about their listening habits.

Competitive Analysis

To understand the competitive market, we analyzed the capabilities of other voice assistants that provide access to NPR programming.

Technical Analysis

Before we could begin defining the capabilities of the voice assistant, we needed to create a high-level technical design. In addition, it was essential to understand what audio content was available and what methods could be used to search and retrieve specific content. This required a deep dive into the details of the NPR APIs.
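One outcome of that deep dive was a clear picture of how a search-and-retrieve request would flow through the system. The sketch below illustrates the general shape of such a call; the endpoint, parameters, and response fields are invented for illustration and do not reflect the actual NPR API contract.

```python
from urllib.parse import urlencode

# Placeholder base URL -- a stand-in, not a real NPR endpoint.
BASE_URL = "https://api.example.org/v2/search"

def build_search_url(query, content_type="episode", limit=5):
    """Build a search URL for audio content matching the query."""
    params = {"q": query, "type": content_type, "limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"

def pick_playable(results):
    """Return the first search result that exposes a streamable audio URL."""
    for item in results:
        if item.get("audio", {}).get("stream_url"):
            return item
    return None

# Example with canned response data (no network call is made):
canned = [
    {"title": "Morning News", "audio": {}},  # no stream available
    {"title": "Planet Money", "audio": {"stream_url": "https://…/ep.mp3"}},
]
print(build_search_url("planet money"))
print(pick_playable(canned)["title"])  # Planet Money
```

The key design question this surfaced was how to map a spoken request ("play the latest Planet Money") onto the right combination of search parameters and content types.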

Develop Use Cases

Based on the user data, competitive analysis and technical analysis, we created a set of use cases for the NPR voice assistant.

Define Product Capabilities

From the use cases we developed several different versions of potential product capabilities, from simple to complex.

Create Sample Dialogs

To illustrate the user experience, we wrote a set of sample dialogs. Each dialog is an example of a conversation a user might have with the voice assistant. Sample dialogs provide a quick way to share the proposed product design with others.

Define MVP Requirements

The sets of product capabilities were refined into a final set of requirements for the MVP version of the product, along with a set of stretch (MVP+) requirements.

Building the Voice Assistant

Once the customer experience and product requirements were defined, we began the more technical process of building the voice assistant. Initial versions of the VA were built in Amazon Lex, but we eventually found Lex too limiting for our needs and switched to Rasa. We followed an agile methodology, so we were able to get a basic version of the VA up and running quickly, then add capabilities with each sprint.

Tagging and Training

The NPR voice assistant was NLU based. We used Google for ASR, then passed the recognized utterance into Rasa. To train the NLU in Rasa we needed to create and tag sample utterances and to define slots and slot values. All of the training data was tracked and summarized in a tagging guide.
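The shape of that tagged data, and the kind of summary a tagging guide tracks, can be sketched as follows. The intent names, slot names, and utterances below are hypothetical examples, not the production schema.

```python
from collections import Counter

# Illustrative tagged utterances: each has an intent label and any slot
# values extracted from the text. These examples are invented.
TRAINING_DATA = [
    {"text": "play the latest episode of fresh air",
     "intent": "play_program", "slots": {"program": "fresh air"}},
    {"text": "play up first",
     "intent": "play_program", "slots": {"program": "up first"}},
    {"text": "pause", "intent": "pause_playback", "slots": {}},
    {"text": "what's playing right now", "intent": "now_playing", "slots": {}},
]

def summarize(data):
    """Tally utterances per intent and observed slot values, as a tagging
    guide would summarize the training set."""
    intents = Counter(ex["intent"] for ex in data)
    slot_values = Counter(
        (slot, value) for ex in data for slot, value in ex["slots"].items()
    )
    return intents, slot_values

intents, slot_values = summarize(TRAINING_DATA)
print(intents)       # utterance count per intent
print(slot_values)   # observed values per slot
```

Summaries like this made it easy to spot intents that were under-represented in the training data before they showed up as accuracy problems.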

Automated Testing

To assess the accuracy of the NLU we created an automated test tool that could feed thousands of test utterances into the voice assistant and evaluate the accuracy of the result. Based on the results of this testing we updated the NLU training data, slot definitions, and extraction pipelines until the NLU met our standards, with a goal of 80-90% accuracy.
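The core of such a harness is simple: run every labeled test utterance through the pipeline and score intent matches. This is a minimal sketch, with a keyword stub standing in for the real ASR-to-Rasa pipeline; the test cases and stub are invented for illustration.

```python
def classify(utterance):
    """Stub NLU: keyword rules standing in for the trained Rasa model."""
    if "play" in utterance:
        return "play_program"
    if "pause" in utterance:
        return "pause_playback"
    return "out_of_scope"

# Labeled test utterances (hypothetical): (text, expected intent).
TEST_SET = [
    ("play morning edition", "play_program"),
    ("pause the audio", "pause_playback"),
    ("play fresh air", "play_program"),
    ("tell me a joke", "out_of_scope"),
]

def accuracy(test_set, model):
    """Run every utterance through the model and score intent accuracy."""
    correct = sum(1 for text, expected in test_set if model(text) == expected)
    return correct / len(test_set)

score = accuracy(TEST_SET, classify)
print(f"intent accuracy: {score:.0%}")
```

In practice the real tool also logged each mismatch, since the failing utterances pointed directly at which intents or slot definitions needed more training data.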

User Testing

Once we were happy with the performance of the voice assistant, we began two types of user testing:

  • Internal “dogfood” testing. In weekly dogfood sessions, company employees from all departments used the voice assistant for realistic tasks for about 30 minutes at a time. After each session we triaged and fixed the bugs that were reported.

  • External “diary study” testing. A cohort of approximately 10 users was recruited to use the NPR VA on a regular basis over the course of several weeks. These users completed periodic feedback surveys and also recorded short videos giving their impressions of their experience with the voice assistant. The data from this study was compiled, analyzed, and used to improve the customer experience.

App Demo

This video gives a short demo of the NPR voice assistant.