From 3M Health Information Systems
AI Talk: Dictation, deepfakes and tennis avatars
Speech to Text
The Economist had a recent article on how dictation can be used to write text. The article makes some good points. Anyone who has actually tried this knows that it's pretty difficult to dictate an article. The only people I know who do it successfully are physicians, for whom our company provides dictation software. Dictation works for them because they are trained to use their voice to dictate clinical reports. These clinical reports have significant structure, and physicians rely on catch phrases and terms they use frequently, which makes the process more efficient. After specialized training, physicians can dictate fairly fluently, but for the average person trying to dictate a news article, it's not that easy.
In fact, I attempted to write this piece about speech to text using Microsoft Word's speech recognition software! Of course, it had lots of errors and I had to go back and correct them. The errors were not necessarily the fault of the recognition software, but of my own disfluency in dictating. The truth of the matter is, I cannot think faster than I can type! Maybe my thinking has slowed down to the speed at which I can type. Whatever the reason, as the author of the article comments, I am in no hurry to adopt speech to text. Asking Alexa for something is a different story!
Deepfakes: The coming infocalypse
I came across a book on this subject by Nina Schick over Labor Day weekend. Since the topic has been in the news for the past few years, I decided to read the book. It turned out to be an interesting read (or listen, as in my case). I didn't learn anything particularly new from a technology standpoint; however, all the points the author makes are relevant. Deepfakes are essentially media (video and audio) created using AI that are practically indistinguishable from real content. She distinguishes between deepfakes and synthetic media: both are realistic renderings of virtual content, but she reserves the word deepfakes for content generated with malicious intent.
It is also not surprising that the majority of deepfakes are concentrated in the porn industry, but their potential to propagate disinformation and misleading content is of grave concern, particularly during a contentious election. She dubs the threat the coming "infocalypse," perhaps a bit melodramatic, but probably not to be trifled with either. Her advice at the end of the book is good: "Be careful about what information you share, verify your sources, correct yourself when you get something wrong, be wary of your own political biases, be skeptical but not cynical." She also points to a few fact-checking sites: AFP Fact Check, Full Fact, PolitiFact, and Snopes.
Tennis avatars
Last weekend was the conclusion of the U.S. Open, a major tennis tournament. In spite of the ongoing pandemic, the organizers managed to pull together a spectacular two weeks of scintillating tennis and its attendant controversies. This one saw the leading contender for the throne, Novak Djokovic, disqualified for accidentally hitting a line judge with a ball. Nevertheless, the final between Thiem and Zverev was a marathon five-set affair, with Thiem surviving to win.
This was the first time in a number of years that none of the triumvirate of Federer, Djokovic and Nadal was in the final! Federer and Nadal skipped this event for various reasons, but they can take heart now. Even if they are not able to play physically, they can deputize their avatars to play in a virtual tournament! Stanford University has come up with a way to create synthesized media (see deepfakes above) that strings together video clips of tennis shots to simulate real match play. You can see the realistic clips they generate. Their ingenious solution shows how Federer can play against himself, or against Djokovic, Nadal or even Serena Williams! Perhaps, in these strange times, it is quite appropriate to consider holding a virtual tournament with tennis avatars!
The Economist article was recommended to me by my colleague and friend, Philippe Truche. The tennis avatar article was forwarded to me by my friend Chris Scott of Google.
I am always looking for feedback and if you would like me to cover a story, please let me know. “See something, say something!” Leave me a comment below or ask a question on my blogger profile page.
V. “Juggy” Jagannathan, PhD, is Director of Research for 3M M*Modal and is an AI Evangelist with four decades of experience in AI and Computer Science research.