A couple of weeks ago, we received a request to convert our blogs to voice. The requestor pointed us at some free digital services that allow you to input text which then gets converted to voice.
In testing these, we discovered that we all have differing views of what counts as impressive artificial intelligence (AI).
A couple of days later we discussed a direct mail piece, received from a competitor, advertising a seminar. We each identified three good and three bad things about the document and found our individual views were vastly different.
In this blog, I address how these two events set me thinking about information and the role of AI in how we consume it.
Text to voice
The audiobook, sometimes read by the author, is not new. On the other hand, having arbitrary text read aloud, with meaning, by a bot performing at a near human-equivalent level is new.
We tested text-to-voice services using the first couple of paragraphs of our most recent blog. Initially, the majority of us were unimpressed.
However, this changed when we compared the service to our own performance reading the paragraphs. This provided a more objective view of human equivalency.
Try this test yourself by comparing a free service to your own performance – it’s much harder than expected!
Our conclusion was that although the delivery may have been a little too robotic, the accuracy in reading the words was at human equivalence.
Documents: substance or form?
Members of our team had very different approaches when considering the direct mail document.
They focused either on the content (the information and messaging) or on the format and layout (the look and feel) of the document.
Natural language processing
We use data capture software provided by ABBYY as part of our software solutions. This extracts data from documents, significantly reducing the data entry required in business processes.
Data capture technology reduces data entry, delivering human-equivalent accuracy at considerably greater speed.
For those who prefer to consume information via voice rather than text, the content could be copied and pasted into a text-to-voice service as above.
Capture technologies like ABBYY provide the digital text automatically from all forms of correspondence (post, email and other digital delivery). Building this into a mailroom workflow could give recipients a choice to consume content as text or as voice (following conversion).
Points of resistance
Some of the resistance to going paperless, from the early 2000s through to today, has been based on the look and feel and the way the data is presented on paper versus digitally.
I still think the look and feel of a physical newspaper delivers a better experience than its digital equivalent. Even so, I generally consume the news digitally, which is far more efficient than reading a conventional newspaper. I now regard reading a hard copy newspaper as a treat.
This resistance is continuing in the current period of digital transformation. We often see it countered by presenting data digitally as though it were laid out on paper.
Judgements
Processes that strip out the data and deliver it as text, or text to voice, necessarily remove the look and feel of the document. This leaves the information presented to be evaluated based solely on its content.
In our work on voice above, we considered whether the content had to be customised for voice to get the message across. For example, a human reading the text of a blog to someone else could choose to omit subheadings and asides (items in brackets) that don’t really work when read out loud.
Natural language processing has come a long way. But I suspect the capability to make these judgements will be a future development rather than something available today.
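As a rough illustration, the two omissions mentioned above (subheadings and bracketed asides) can be approximated with simple rules before text is passed to a text-to-voice service. The Python sketch below is only that, a sketch: it is not a feature of any service we tested, and the function names `prepare_for_voice` and `looks_like_subheading`, along with the six-word threshold, are assumptions made for the example. It falls well short of the judgement a human reader applies.

```python
import re

# Matches an aside in round brackets, plus any whitespace before it.
ASIDE = re.compile(r"\s*\([^()]*\)")


def looks_like_subheading(line: str, max_words: int = 6) -> bool:
    """Heuristic (an assumption): a short line with no closing punctuation."""
    stripped = line.strip()
    return (
        bool(stripped)
        and len(stripped.split()) <= max_words
        and stripped[-1] not in ".!?"
    )


def prepare_for_voice(text: str) -> str:
    """Drop likely subheadings and bracketed asides before text-to-voice."""
    spoken = []
    for line in text.splitlines():
        # Skip blank lines and anything that looks like a subheading.
        if not line.strip() or looks_like_subheading(line):
            continue
        # Remove bracketed asides from the lines that remain.
        spoken.append(ASIDE.sub("", line).strip())
    return "\n".join(spoken)
```

A subheading that ends in a question mark, such as "Documents: substance or form?", slips through this heuristic, and a short sentence with no full stop would be dropped by mistake. That gap is a fair picture of the difference between a pair of rules and real judgement.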
Conclusion
AI offers us more diverse ways to consume and store information. However, there are compromises to be made when it comes to how the information is presented. Those early adopters of text-to-voice who are prepared to accept substance over form are likely to benefit the most.