A voice-user interface (VUI) makes spoken human interaction with computers possible, using a voice/speech platform to initiate an automated service or process. A VUI is the interface to any speech application.
Here and now
Controlling a machine by simply talking to it was science fiction only a short time ago. Until recently, this area was considered artificial intelligence (A.I.). However, with advances in technology, VUIs have become more commonplace, and people are taking advantage of the value that these hands-free, eyes-free interfaces provide in many situations.
VUIs are not without their challenges, however. People have very little patience for a "machine that doesn't understand", so there is little room for error: VUIs need to respond to input reliably, or they will be rejected and often ridiculed by their users.
Designing a good VUI requires interdisciplinary talent in computer science, linguistics and human factors psychology, all of which are skills that are expensive and hard to come by. Even with advanced development tools, constructing an effective VUI requires an in-depth understanding of both the tasks to be performed and the target audience that will use the final system. The more closely the VUI matches the user's mental model of the task, the easier it will be to use with little or no training, resulting in both higher efficiency and higher user satisfaction. In a nutshell, speech applications have to be carefully crafted for the specific business process that is being automated.
Pocket-size devices, such as mobile phones, currently still rely on small buttons for user input. Extensive button-pressing on such devices can be tedious and inaccurate, so an easy-to-use, accurate, and reliable VUI could be a major breakthrough in their ease of use.
In the future
Such a VUI would also benefit users of laptop- and desktop-sized computers, as it would solve numerous problems currently associated with keyboard and mouse use. Keyboard use typically entails sitting or standing stationary in front of the connected display; by contrast, a VUI would free the user to be far more mobile, as speech input eliminates the need to look at a keyboard. Such developments could change the face of current machines and have far-reaching implications for how users interact with them. Hand-held devices would be designed with larger, easier-to-view screens, as no keyboard would be required. Touch-screen devices would no longer need to split the display between content and an on-screen keyboard, thus providing full-screen viewing of the content. Laptop computers could essentially be cut in half in terms of size, as the keyboard half would be eliminated and all internal components would be integrated behind the display, effectively resulting in a simple tablet computer. Desktop computers would consist of a CPU and screen, saving desktop space otherwise occupied by the keyboard and eliminating sliding keyboard rests built under the desk's surface. Television remote controls and keypads on dozens of other devices, from microwave ovens to photocopiers, could also be eliminated.
Challenges to tackle
Numerous challenges would have to be overcome, however, for such developments to occur. First, the VUI would have to be sophisticated enough to distinguish between input, such as commands, and background conversation; otherwise, false input would be registered and the connected device would behave erratically. A standard prompt, such as the famous "Computer!" call by characters in science fiction TV shows and films such as Star Trek, could activate the VUI and prepare it to receive further input from the same speaker. Conceivably, the VUI could also include a human-like representation: a voice or even an on-screen character, for instance, that responds and continues to communicate with the user in order to clarify the input received and ensure accuracy.
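The activation prompt described above can be illustrated with a minimal sketch. It assumes that speech has already been transcribed to text and attributed to a speaker by some upstream recognizer, which is the hard part and is not shown. The class name WakeWordGate, the method hear, and the speaker labels are hypothetical names chosen for illustration, not part of any real speech platform.

```python
import string
from typing import Optional


class WakeWordGate:
    """Illustrative filter: passes on only speech addressed to the device.

    Assumption: utterances arrive already transcribed and labeled with a
    speaker identifier; a real VUI would obtain both from its recognizer.
    """

    def __init__(self, wake_word: str = "computer") -> None:
        self.wake_word = wake_word.lower()
        self.armed_speaker: Optional[str] = None  # who said the wake word last

    def hear(self, speaker: str, utterance: str) -> Optional[str]:
        """Return the command text, or None if the utterance is ignored."""
        # Normalize: lowercase, strip surrounding punctuation from each word.
        words = [w.strip(string.punctuation) for w in utterance.lower().split()]
        words = [w for w in words if w]

        if words and words[0] == self.wake_word:
            rest = words[1:]
            if rest:
                # Wake word and command in one utterance.
                self.armed_speaker = None
                return " ".join(rest)
            # Wake word alone: wait for the same speaker's next utterance.
            self.armed_speaker = speaker
            return None

        if words and speaker == self.armed_speaker:
            # Follow-up from the speaker who activated the interface.
            self.armed_speaker = None
            return " ".join(words)

        # Anything else is treated as background conversation.
        return None


gate = WakeWordGate()
heard = [
    gate.hear("alice", "Did you see the game last night?"),
    gate.hear("alice", "Computer!"),
    gate.hear("bob", "Turn it up"),
    gate.hear("alice", "Lights on, please."),
    gate.hear("bob", "Computer, play music"),
]
print(heard)
```

In this sketch only the last two of Alice's and Bob's addressed utterances are passed on as commands; ordinary conversation, and speech from anyone other than the person who activated the interface, is dropped. A clarifying dialogue of the kind mentioned above would sit downstream of such a gate.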
Second, the VUI would have to work in concert with highly sophisticated software to accurately process and retrieve information or carry out an action according to the particular user's preferences. The machine associated with the VUI therefore needs both accurate speech-recognition software and artificial intelligence.
Hands-free search, for instance, is about to become the new frontier for search engine optimization (SEO). With the launch of Google Home to compete with Amazon’s Echo, connected home devices are now becoming mainstream. Unlike the type-and-read searches of the past couple of decades, new connected home devices bring an audio component into the mix. To date, Google voice searches are executed in a black box, with no data or analytics available for marketers to assess. In the upcoming year, marketers will need to begin tackling the idea of voice-driven search results, and take advantage of early analytics as they are released by platform owners.