Klara: Voicemail AI

Project Overview

Klara enlisted help from a specialist from Alphabet for a year-long project to help us develop tools using artificial intelligence. After working with the team to ideate and establish possible applications for the technology, we decided that using AI to improve our voicemail tools would have the greatest impact.

  • Researched the characteristics of voicemails coming into Klara and how staff handle them
  • Ideated with the team on potential AI applications that match our users' problems
  • Designed ways to display text extracted via a Natural Language Processing model and ways we could leverage that text to give our users better workflows

Existing experience

The voicemail playback appears in the conversation with the patient as a purple block with audio playback functionality. The only actions available are to listen to or download the audio file of the message.

Above is the state of this element before we introduced our changes.

Defining the problem

Handling voicemails can be an efficiency bottleneck for front-desk staff. Delays when loading the voicemail audio in the Klara app, the time it takes to listen to a message versus reading a text message, the potential need to listen to the message more than once... our research showed a number of existing problems that, through ideation and research, we thought AI could help solve.


Our Approach

After balancing impact for users, business priorities, and engineering capabilities and bandwidth, our priorities for phase one were:

  • Use Natural Language Processing models to convert voicemail audio into text with a certainty score above 80%
  • Add our default message type tags based on the transcription's content to allow us to apply automation like inbox routing directly to voicemails
  • Provide ability for staff to listen back to specific sections of the voicemail to clarify information, rather than returning to the start or using the playback timeline scrubber
  • Include mechanisms for staff to provide feedback on the accuracy of a transcription and ultimately help improve our model's accuracy
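
As a rough illustration of the first two priorities above, the sketch below gates a transcription on its certainty score and derives tags from its content. The threshold constant, the tag names, and the keyword lists are all hypothetical placeholders rather than Klara's actual configuration, and simple keyword matching stands in for whatever the real model does.

```python
from dataclasses import dataclass

# Hypothetical threshold, taken from the "above 80%" goal in the list above.
MIN_CERTAINTY = 0.80

# Hypothetical tag names and keywords; real tags would be Klara's defaults.
TAG_KEYWORDS = {
    "refill": ("refill", "prescription", "medication"),
    "appointment": ("appointment", "reschedule", "cancel"),
}


@dataclass
class Transcription:
    text: str
    certainty: float  # 0.0 to 1.0, assumed to be reported by the model


def is_usable(t: Transcription) -> bool:
    """Only accept transcriptions whose certainty is above the threshold."""
    return t.certainty > MIN_CERTAINTY


def tag_message(t: Transcription) -> list[str]:
    """Return every tag whose keywords appear in the transcription text."""
    lowered = t.text.lower()
    return [
        tag
        for tag, words in TAG_KEYWORDS.items()
        if any(word in lowered for word in words)
    ]
```

Once a voicemail carries tags like these, it could be passed to the same routing rules that already apply to tagged text messages.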

Design

The actual transcription and tagging of voicemails were technical solutions with little input needed from a design point of view. So it was the job of design to focus on the second half of the equation: an improved experience for staff needing to double-check voicemail content, and a way for them to provide feedback.

Selecting Playback Point

The NLP model breaks the content of a voicemail down into pieces by identifying individual words. From there, words can be assembled into sentences with more meaning.

For the first version, sentences were hard for the model to define accurately, and our transcription pilot data told us that the majority of our messages were two sentences or fewer in length. So a working assumption we formed was that users would find little

So we decided to use individual words as playback triggers to start.
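
A minimal sketch of how a word could act as a playback trigger, assuming the model returns a start and end time for each word. The `Word` type, the `playback_start` helper, and the quarter-second lead-in are illustrative assumptions, not the shipped implementation.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the audio (assumed model output)
    end: float


def playback_start(words: list[Word], clicked_index: int, lead_in: float = 0.25) -> float:
    """Return the audio offset to seek to when a word is clicked.

    The small lead-in (a hypothetical default) starts playback slightly
    early so the clicked word is not clipped.
    """
    if not 0 <= clicked_index < len(words):
        raise IndexError("clicked word is outside the transcription")
    return max(0.0, words[clicked_index].start - lead_in)
```

The audio player would then seek to the returned offset and play from there, which avoids restarting the message or hunting with the timeline scrubber.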

Highlighting & extracting content

Next we wanted to enable the system to highlight and utilise the most important parts of a conversation. So for a first version we highlighted the content we knew from previous research was impactful for staff when making the majority of their decisions, for example patient name, date of birth, and medication. From there we came up with a path to using this content in progressively more powerful ways.

1. The first level was automatically highlighting the useful information in the body of the original transcription

2. The next level was pulling out snippets of information into a standalone element which shows the most important data alongside a label for quicker scanning

3. Finally, we wanted to test if we could accurately rewrite a message to provide a short summary which would keep more context than just displaying data but would still cut reading time for staff
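
The first two levels above can be sketched as follows, assuming the model returns each piece of useful information as a labelled, non-overlapping character span. The `Entity` type and the `[[...]]` markers are hypothetical stand-ins for the real data shape and visual highlight. Level three, the rewritten summary, is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    label: str  # e.g. "Date of birth"
    start: int  # character offset where the entity begins
    end: int    # character offset just past the entity


def highlight(text: str, entities: list[Entity]) -> str:
    """Level 1: wrap each entity in [[...]] markers within the original text."""
    out, cursor = [], 0
    for e in sorted(entities, key=lambda e: e.start):
        out.append(text[cursor:e.start])
        out.append(f"[[{text[e.start:e.end]}]]")
        cursor = e.end
    out.append(text[cursor:])
    return "".join(out)


def snippets(text: str, entities: list[Entity]) -> list[tuple[str, str]]:
    """Level 2: pull each entity out as a (label, value) pair for scanning."""
    ordered = sorted(entities, key=lambda e: e.start)
    return [(e.label, text[e.start:e.end]) for e in ordered]
```

Keeping the spans as offsets into the original transcription means both views are derived from the same data, so the highlighted text and the standalone snippets cannot drift apart.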

  • 23% reduction in the number of full voicemail plays by staff in Klara
  • 16% faster triage of voice messages of comparable length when using "Play from here" feature vs. basic transcription
  • 82% satisfaction amongst pilot customers when asked to rate effectiveness of the summary provided alongside the original voicemail transcription
- Measurements taken between 1st of January and May 2024