AI Talk: AlphaFold, meeting monitor and emotion recognition

December 4th, 2020 / By V. “Juggy” Jagannathan, PhD

This week’s AI Talk…

AlphaFold – Solution to a 50-year-old grand challenge

DeepMind, a subsidiary of Google’s parent company Alphabet, has released some groundbreaking research! University of Maryland Professor John Moult, who co-founded the biennial Critical Assessment of protein Structure Prediction (CASP) challenge in 1994, had this to say: “We have been stuck on this one problem – how do proteins fold up – for nearly 50 years. To see DeepMind produce a solution for this…, is a very special moment.” High praise indeed.

What is the protein folding problem? Well, proteins are chains built from 20 different amino acids, and they are literally the building blocks of life. How a chain of amino acids folds up into the three-dimensional shape of the little engines that power us is the protein folding problem. Understanding a protein’s structure is the key to understanding what it does and how it does it.

Only a small fraction of the 180 million known protein sequences have had their structures decoded, because decoding is a painstaking experimental process. Some 170,000 proteins with decoded structures were used to train a deep neural model created by DeepMind. Its predictions were blind-reviewed by the CASP organizers for correctness. DeepMind also correctly predicted the structures of several SARS-CoV-2 proteins, predictions that were later confirmed by experiment. This effort is being hailed as a once-in-a-generation advancement in biology! Its implications and potential are sky high. Perhaps 2020 will go down as the year when science triumphed?

AI Meeting Monitor

As we spend more and more time on video conferences, a spate of companies are trying to figure out how to make these interactions more productive. I saw this article in Wired, which highlights the different efforts on this front. A startup called Headroom aspires to take notes and create meeting minutes. It also keeps tabs on how long each participant talks. On top of that, it uses AI to gauge the emotions of the participants (Are they engaged or bored?) and provides real-time feedback to the speaker. Headroom is possibly swimming against the current when it comes to emotion recognition, an area fraught with biases and miscues (see the next story in this blog).
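
To make the talk-time idea concrete, here is a minimal Python sketch of how speaking time per participant could be tallied. It is not Headroom's method, which the Wired article does not detail. The `(speaker, start_sec, end_sec)` segment format, the function names and the sample names are all assumptions made for illustration, standing in for whatever a speech-processing step would actually produce.

```python
from collections import defaultdict


def talk_time_seconds(segments):
    """Sum speaking time per participant.

    `segments` is a list of (speaker, start_sec, end_sec) tuples.
    This format is hypothetical, chosen only for the illustration.
    """
    totals = defaultdict(float)
    for speaker, start, end in segments:
        if end < start:
            raise ValueError(f"segment for {speaker} ends before it starts")
        totals[speaker] += end - start
    return dict(totals)


def talk_share(segments):
    """Return each participant's fraction of the total speaking time."""
    totals = talk_time_seconds(segments)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {speaker: 0.0 for speaker in totals}
    return {speaker: t / grand_total for speaker, t in totals.items()}


# Made-up four-minute meeting with three participants.
segments = [
    ("Ana", 0, 90),
    ("Ben", 90, 120),
    ("Ana", 120, 180),
    ("Chen", 180, 240),
]

shares = talk_share(segments)
for speaker in sorted(shares):
    print(f"{speaker}: {shares[speaker]:.1%}")
```

In this made-up meeting, Ana accounts for well over half of the talking, which is the kind of signal such a tool could surface to the group.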

Another startup, Clockwise, is trying to ensure that you have enough time to do your actual work instead of being constantly pulled into meetings. How does it do that? It automatically blocks out time on your calendar so that you have uninterrupted stretches to focus on real work, and it optimizes when your meetings are scheduled.
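
As a rough illustration of what finding focus time involves, the Python sketch below scans one day's meetings and returns the gaps long enough to be worth protecting. It is not Clockwise's algorithm, which the article does not describe. The function name `free_blocks`, the two-hour minimum and the sample calendar are assumptions made for the example.

```python
from datetime import datetime, timedelta


def free_blocks(meetings, day_start, day_end, min_length=timedelta(hours=2)):
    """Return (start, end) gaps between meetings of at least `min_length`.

    `meetings` is a list of (start, end) datetime pairs. The two-hour
    default is an arbitrary choice for this illustration.
    """
    blocks = []
    cursor = day_start  # earliest moment not yet covered by a meeting
    for start, end in sorted(meetings):
        gap_end = min(start, day_end)  # ignore time past the working day
        if gap_end - cursor >= min_length:
            blocks.append((cursor, gap_end))
        cursor = max(cursor, end)  # also handles overlapping meetings
    if day_end - cursor >= min_length:
        blocks.append((cursor, day_end))
    return blocks


def at(hour, minute=0):
    """Helper: a time on the (arbitrarily chosen) day of this post."""
    return datetime(2020, 12, 4, hour, minute)


# Hypothetical calendar: three meetings in a 9-to-5 day.
meetings = [
    (at(10), at(10, 30)),
    (at(11), at(12)),
    (at(15), at(15, 30)),
]

focus = free_blocks(meetings, day_start=at(9), day_end=at(17))
for start, end in focus:
    print(f"Focus block: {start:%H:%M} to {end:%H:%M}")
```

With this calendar, only the stretch from noon to 3 p.m. qualifies. A real scheduler would also need to move meetings around to create such gaps, which is beyond this sketch.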

The Wired article lists a slew of other startups that provide various tools, from creating interactive backgrounds to creating transcripts to providing a separate screen to view conference participants. Video conferences have become ubiquitous at work and it is likely they will stick around even when life returns to “normal.” Tools and technology that assist in making them productive will certainly have a role to play.

Emotion Recognition

While doing some research for the article above, I came across an article focused on emotion recognition software. It’s interesting to note the various applications that classifying emotional expressions can address. Are you bored? Are you attentive? Just being able to identify these states can be useful, not just in meetings as discussed above, but also when you are driving. Autism therapy is another application area, and deepfake detection is yet another. Sentiment analysis for market research can also benefit from this technology. The one application that took me by surprise is the concept of “pay-per-laugh.” This is an experiment happening at a comedy club in Spain, where a camera monitors your face and records how many times you laugh. The amount you pay for the show is tied to the number of laughs detected! What will they think of next?

A related acronym here is FER (facial expression recognition). I was surprised to see that the market for this type of software currently stands at $25 billion and is slated to double in just three years! All the above applications notwithstanding, this field is beset with a number of challenges. Accurate detection of any expression is complicated by differences in race, age and gender. For instance, age-related wrinkles can be mistaken for a smile! Privacy concerns are another consideration. In spite of all of these, emotion recognition will continue to evolve and play an important role in our lives. I, for one, would like to be woken up if I happen to snooze or get distracted by my ever-present smartphone while driving. Better still if I can sit back and let my car drive itself!

I am always looking for feedback and if you would like me to cover a story, please let me know. “See something, say something!” Leave me a comment below or ask a question on my blogger profile page.

V. “Juggy” Jagannathan, PhD, is Director of Research for 3M M*Modal and is an AI Evangelist with four decades of experience in AI and Computer Science research.