Deep learning and clinical natural language processing (NLP)

September 8th, 2017 / By Richard Wolniewicz

The first week of August saw the 55th annual meeting of the Association for Computational Linguistics (ACL) in Vancouver, Canada. ACL is the premier global NLP conference, showcasing the state of the art in NLP. For clinical NLP, I think of ACL as the Computer Science counterpart to the AMIA Annual Symposium, where similar topics are addressed from a medical informatics perspective.

Healthcare applications are growing as an area of interest (though alas, medicine is well behind social media for overall focus). The BioNLP workshop specifically targets biomedical applications of NLP, and this year researchers from Microsoft and Google provided a tutorial on using NLP for precision medicine.

The overall message was clear: Deep learning is coming to dominate NLP as it has other areas of artificial intelligence. Deep learning is the modern application of earlier neural network technology, which was prominent in the 1980s-90s, with “deep” layers (typically six or more) made possible with modern computing technology. Neural networks are machine learning algorithms inspired by the structure of the brain – though they are not in any way an attempt to model the brain directly.

Almost every piece of research presented at ACL touched on neural NLP, the phrase you’ll hear as rough shorthand for “NLP using deep learning.” There were many exciting advances in NLP science, and here I will focus on two that I believe are particularly relevant to health care.

First, many discoveries enable incorporating domain expertise into neural NLP models. Historically, statistical machine learning (ML) models have often been seen as conflicting with expert-driven, rule-based approaches to AI, and for a long time deep learning was like other ML in this regard. Health care has vast amounts of expert content (ontologies, coding systems, dictionaries, etc.), and because ML could not easily make use of that content, it has been harder to apply than expert systems. New techniques are succeeding at encoding expert knowledge into neural NLP models, such as (a) imposing ontology constraints on word embeddings, (b) generating training data from healthcare knowledge resources, and (c) structuring neural network architectures to capture clinical knowledge.
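To make item (a) a little more concrete, here is a minimal sketch of one way an ontology can be pushed into word embeddings, in the spirit of the approach often called retrofitting. It is not necessarily what any ACL presenter did: the function name `retrofit`, the toy vectors, and the synonym links are all made up for illustration. Each vector is repeatedly averaged with its ontology neighbors while being anchored to its original value, so linked terms drift together.

```python
def retrofit(vectors, neighbors, iterations=10, alpha=1.0):
    """Return copies of `vectors` nudged toward their ontology neighbors.

    vectors:   dict mapping term -> list of floats (original embeddings)
    neighbors: dict mapping term -> list of related terms (toy "ontology")
    alpha:     weight that keeps a term near its original vector
    """
    new = {term: list(vec) for term, vec in vectors.items()}
    for _ in range(iterations):
        for term, linked in neighbors.items():
            linked = [n for n in linked if n in new]
            if term not in new or not linked:
                continue
            beta = 1.0 / len(linked)  # neighbor weights sum to 1
            updated = []
            for d in range(len(new[term])):
                neighbor_part = sum(beta * new[n][d] for n in linked)
                updated.append((alpha * vectors[term][d] + neighbor_part) / (alpha + 1.0))
            new[term] = updated
    return new


# Hypothetical 2-d embeddings and a single synonym link, for illustration only.
toy_vectors = {
    "heart attack": [1.0, 0.0],
    "myocardial infarction": [0.0, 1.0],
    "fracture": [-1.0, 0.0],
}
toy_ontology = {
    "heart attack": ["myocardial infarction"],
    "myocardial infarction": ["heart attack"],
}

after = retrofit(toy_vectors, toy_ontology)
```

In this toy setup the two synonyms end up much closer together than they started, while "fracture", which has no links, is left where it was. A real system would start from embeddings trained on clinical text and take its links from an actual terminology.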

Second, the ability to transfer knowledge from one problem to another is accelerating rapidly. Historically, ML systems were trained for a specific task and had to be retrained from scratch for a different task. Now there are more ways to reuse parts of a trained neural NLP model in another model doing a different task. For instance, pieces of a neural NLP model trained to code English clinical documents with ICD-10-CM could be reused in another model trained to code German clinical documents with ICD-10-GM. Even better, the German model would benefit from the English data those shared pieces were trained on, and vice versa, meaning both systems end up better than either would be alone.
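One common way to get this kind of reuse is parameter sharing: each model keeps its own input and output layers but routes through a middle layer that is literally the same object. The sketch below shows only that wiring, with no training loop, and everything in it (`Linear`, `CodingModel`, the layer sizes, the vocabulary and code counts) is a hypothetical stand-in rather than a description of any real coding system.

```python
import random

HIDDEN = 4  # size of the shared representation (arbitrary for this sketch)


class Linear:
    """A bias-free linear layer stored as a list of weight rows."""

    def __init__(self, n_in, n_out, rng):
        self.weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                        for _ in range(n_out)]

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.weights]


class CodingModel:
    """Language-specific input layer -> shared layer -> code-system-specific head."""

    def __init__(self, vocab_size, n_codes, shared, rng):
        self.embed = Linear(vocab_size, HIDDEN, rng)  # e.g. English or German words
        self.shared = shared                          # same object in every model
        self.head = Linear(HIDDEN, n_codes, rng)      # e.g. ICD-10-CM or ICD-10-GM codes

    def scores(self, bag_of_words):
        hidden = self.embed.forward(bag_of_words)
        hidden = [max(0.0, h) for h in self.shared.forward(hidden)]  # ReLU
        return self.head.forward(hidden)


rng = random.Random(0)
shared = Linear(HIDDEN, HIDDEN, rng)

# Toy sizes: 5 English words and 3 codes, 6 German words and 2 codes.
english = CodingModel(vocab_size=5, n_codes=3, shared=shared, rng=rng)
german = CodingModel(vocab_size=6, n_codes=2, shared=shared, rng=rng)

english_scores = english.scores([1, 0, 1, 0, 0])
german_scores = german.scores([0, 1, 0, 0, 1, 1])
```

Because `english.shared` and `german.shared` are the same layer, any update made to it while training on English documents is immediately part of the German model too, and the reverse holds as well. That is the mechanism behind both systems benefiting from each other's data.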

The application of deep learning technology to clinical NLP is still in its infancy, and I expect we will see many exciting new discoveries here in the next few years. It will definitely be interesting to see where deep learning shows up in the research presented at AMIA this coming November.

Richard Wolniewicz is division scientist, NLP Advanced Technology for 3M Health Information Systems.
