AI Talk: Monster models, learning machines, mining EHR data

October 18th, 2019 / By V. “Juggy” Jagannathan, PhD

This week’s AI Talk…

Bigger is better?

The trend line is clear. First we had a language representation model named after a beloved Muppet character: ELMo. This language model had 94 million parameters and did very well on a range of natural language understanding tasks. Then OpenAI released a model named GPT with 110 million parameters. Then came another Muppet, BERT, which sprang out of Google’s AI division at a monstrous size of 340 million trainable weights. Then GPT-2 came along at a mere 1.5 billion, and Nvidia decided to go all out with an 8.3 billion parameter model dubbed MegatronLM. The models did progressively better on various benchmark tasks, but they were also practically unusable and essentially not deployable. To try and restart a trend toward smaller, more useful models, Hugging Face, an open NLP company, has released a new model called DistilBERT. At 66 million parameters, it achieves 97 percent of the performance of the original, larger BERT model. The secret to their approach is to use the large BERT model to train the smaller model: a teacher training a student! Interesting approach. They have released a paper and open sourced their models as well. The hope is that these types of models will improve the natural language understanding skills of virtual assistants and other smartphone-based apps!
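For readers curious what "a teacher training a student" can look like in practice, here is a minimal sketch of the soft-target idea behind knowledge distillation. It is not the DistilBERT training code. The function names, the temperature value and the toy logits are all my own assumptions, and the actual training recipe in the paper combines more than this one term.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The temperature of 2.0 is an illustrative assumption, not a published setting.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)

# Hypothetical logits over three candidate words for one masked position.
teacher_logits = [4.0, 1.5, 0.2]
close_student = [3.8, 1.6, 0.3]
far_student = [0.2, 1.5, 4.0]

print(distillation_loss(teacher_logits, close_student))
print(distillation_loss(teacher_logits, far_student))
```

The student is rewarded for matching the teacher's whole probability distribution, not just its top answer, which is how a small model can pick up some of what the big one has learned.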

The Ultimate Learning Machines

I saw a fascinating essay this week in the Wall Street Journal written by Alison Gopnik, a Professor of Psychology at the University of California, Berkeley. Last week I reviewed a book on rebooting AI, which described current approaches to AI and how they do not capture common sense. This is a good follow-up article to review! Professor Gopnik has made it her life’s mission to figure out how babies think and learn. Take a look at this article she wrote almost a decade ago in Scientific American. Professor Gopnik has been tapped by the famed Defense Advanced Research Projects Agency (DARPA) to work with AI researchers to try and build computers that can learn and think like babies! The program, appropriately enough, is called MESS: Model-Building, Exploratory, Social Learning System. This is definitely the next frontier for AI!

Mining EHR data

Health Data Management carried an article this week about research conducted at my alma mater, Vanderbilt University. This one is about mining EHR data. There have been a lot of studies that use EHR data to predict one thing or another. Google crunched 40 billion data elements last year to predict readmission and mortality. But this study is different. It was not trying to predict anything. It was an effort to discover new knowledge! The goal of the study was to determine which factors influence the progression of cardiovascular disease (CVD) and myocardial infarction (MI). They tried a big data approach using unsupervised techniques. The cohort they used was 10 years of de-identified data from 12,380 adult CVD patients. For this patient population they assembled all the billing codes over the 10-year period and identified 1,068 unique ICD-9 codes (phenotypes). They then constructed a cube of 12,380 patients x 1,068 codes (presence/absence of each ICD-9 code) x 10 years. Then they used non-negative tensor factorization (an unsupervised machine learning technique, loosely analogous to clustering of comorbid variables) to decompose this cube of data into three matrices, which essentially gave a window into the progression of factors (sub-phenotypes) over the years and showed correlations not previously known to be associated with CVD progression. Their findings included evidence that vitamin D deficiency, depression and urinary infections are factors associated with CVD, and the current risk prediction tools do not take these into account. As we move towards precision medicine, analyses such as this will undoubtedly play a big role in unearthing the hidden factors buried in all that EHR data!
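To make the "cube into three matrices" idea concrete, here is a toy sketch of one common way to do it: a non-negative CP-style factorization with multiplicative updates. I do not know which algorithm the Vanderbilt team actually used, so treat the update rule, all the function names and the tiny invented 6 x 4 x 3 cube as assumptions for illustration only.

```python
import random

def mttkrp(X, B, C, rank):
    """M[a][r] = sum over b, c of X[a][b][c] * B[b][r] * C[c][r]."""
    return [[sum(X[a][b][c] * B[b][r] * C[c][r]
                 for b in range(len(B)) for c in range(len(C)))
             for r in range(rank)] for a in range(len(X))]

def gram(M, rank):
    """Gram matrix (rank x rank) of a factor matrix."""
    return [[sum(row[s] * row[r] for row in M) for r in range(rank)]
            for s in range(rank)]

def update_factor(A, X, B, C, rank, eps=1e-9):
    """Multiplicative update of A in place; X must be indexed [a][b][c]."""
    num = mttkrp(X, B, C, rank)
    gb, gc = gram(B, rank), gram(C, rank)
    for a, row in enumerate(A):
        den = [sum(row[s] * gb[s][r] * gc[s][r] for s in range(rank))
               for r in range(rank)]
        A[a] = [row[r] * num[a][r] / (den[r] + eps) for r in range(rank)]

def reconstruction_error(X, patients, codes, years, rank):
    """Sum of squared differences between the cube and its rank-R model."""
    return sum((X[i][j][k] - sum(patients[i][r] * codes[j][r] * years[k][r]
                                 for r in range(rank))) ** 2
               for i in range(len(patients))
               for j in range(len(codes))
               for k in range(len(years)))

def factorize(X, rank, iterations=200, seed=0):
    """Return (patients, codes, years) factor matrices, all non-negative."""
    rng = random.Random(seed)
    I, J, K = len(X), len(X[0]), len(X[0][0])
    patients = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(I)]
    codes = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(J)]
    years = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(K)]
    # Re-indexed views of the cube so each mode can come first in turn.
    by_code = [[[X[i][j][k] for k in range(K)] for i in range(I)] for j in range(J)]
    by_year = [[[X[i][j][k] for j in range(J)] for i in range(I)] for k in range(K)]
    for _ in range(iterations):
        update_factor(patients, X, codes, years, rank)
        update_factor(codes, by_code, patients, years, rank)
        update_factor(years, by_year, patients, codes, rank)
    return patients, codes, years

def toy_cube():
    """Invented presence/absence cube: 6 patients x 4 codes x 3 years."""
    cube = [[[0.0] * 3 for _ in range(4)] for _ in range(6)]
    for p in range(6):
        for y in range(3):
            if p < 3:
                cube[p][0][y] = cube[p][1][y] = 1.0  # codes 0 and 1, every year
            elif y >= 1:
                cube[p][2][y] = cube[p][3][y] = 1.0  # codes 2 and 3, later years
    return cube

X = toy_cube()
patients, codes, years = factorize(X, rank=2)
print(reconstruction_error(X, patients, codes, years, 2))
for row in years:
    print([round(v, 2) for v in row])
```

Each column of the three matrices describes one sub-phenotype: which patients carry it, which codes define it, and how strongly it shows up in each year. That last matrix is what gives the window into progression over time.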

I am always looking for feedback and if you would like me to cover a story, please let me know. “See something, say something”! Leave me a comment below or ask a question on my blogger profile page.

V. “Juggy” Jagannathan, PhD, is Director of Research for 3M M*Modal and is an AI Evangelist with four decades of experience in AI and Computer Science research.