ICD-10-CM/PCS MS-DRG Grouper Part 1

May 23rd, 2011 / By Ron Mills, PhD

During the first week of May, Anne Boucher and I gave a presentation at the WEDI conference in Seattle, covering the construction and testing of the MS-DRG grouper and the financial impact of the switch from ICD-9 to ICD-10 on MS-DRG-mediated hospital reimbursement. Liz McCullough and I had given roughly the same presentation at the CMS C&M meeting in September 2010; Liz repeated it at the AHIMA ICD-10 Summit in April 2011, and I’m giving it again (it gets better every time) at the AHIMA Convention in Salt Lake City this coming October.

We obviously think this is important stuff. The full text of the original is buried in the middle of the September 2010 C&M meeting handout, which you can find at http://www.cms.gov/ICD10/11b_2011_ICD10PCS.asp. For those of you interested only in the bottom line, it is this: on average, our modeling predicts that the switch to ICD-10 will cost IPPS about a nickel more for every $100 of inpatient reimbursement (roughly 0.05%) – practically speaking, revenue neutral.

Of course, you-know-who is in the details. I’m going to use this and my next few turns in this space to talk about some of them, especially those we didn’t have time to cover in the presentation. The MS-DRG grouper is not only an important cog in the U.S. health care reimbursement machinery – it is also representative of any complex ICD-9-based application that has to be working with ICD-10 data by October 2013. Today, however, we’ll look at something unique to the grouper: its architecture.

First, DRG Groupers 101, for those of you who are new to the subject. A “grouper” is a piece of software which takes as input a patient’s diagnoses and procedures as coded by medical record coders for an inpatient stay, along with the patient’s sex, age, discharge status and sometimes other data like birth weight for babies. Its principal output is a number from 001 through 999 called the DRG. (In 1969 when Professor John Thompson took the ubiquitous unlit cigar from his mouth and said “We’re going to call them Diagnosis Related Groups”, I said “What a dumb name – they’re related to much more than diagnoses.”) DRGs are a classification of inpatient stays which are, as much as possible, medically meaningful and statistically predictive of resource utilization. Since that first one (the “lost” version 1 – built with 100,000 records coded in ICD-7 from Yale-New Haven Hospital), groupers have evolved considerably. The most sophisticated (for example, 3M™ APR-DRG™) now provide many additional outputs like severity of illness, risk of mortality, and tons of flags telling you how the various inputs were used to get your results.

After ICD-9-CM came out in 1979, I had the pleasure of working with Rich Averill and Enes Elia to architect what was then the HCFA grouper (version 2), later CMS, and now MS-DRG. MS-DRG v29 (probably the last in ICD-9) will be released in October of this year – a run of nearly 30 years – “old age” indeed for software. Back then, most computers were room-sized “mainframes” with (literally) one-millionth the processing power of your cell phone. The architecture we came up with had to exploit the special properties of both mainframes and ICD-9-CM to achieve both speed and a small “footprint” – memory usage measured in just a few “K”.

The high-level structure of the MS-DRG grouper was then (and still is) composed of three steps:

  1. Look up each of the patient’s diagnosis and procedure codes in a table which gives a list of “attributes” for each code. For example, a diagnosis may be a Major Complication/Comorbidity (MCC), or a Complication/Comorbidity (CC) or neither. A procedure could be O.R. or non-O.R. MS-DRGs have 472 such attributes. APR-DRG has nearly 4,000.
  2. Combine the code-level attributes to create the attributes for the entire inpatient stay.
  3. Apply logic rules to the attributes to determine the DRG.
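
For readers who think in code, here is a minimal sketch of those three steps in Python. It is an illustration only, not the grouper's actual source: the code names (DX_A, PX_A and so on), the three attributes, and the output labels are all made up for the example, and a real grouper returns a DRG number rather than a label.

```python
from enum import IntFlag
from functools import reduce
from operator import or_


class Attr(IntFlag):
    """Three illustrative attributes; the real MS-DRG table has 472."""
    NONE = 0
    CC = 1        # Complication/Comorbidity
    MCC = 2       # Major Complication/Comorbidity
    OR_PROC = 4   # O.R. procedure


# Step 1 table: code -> attributes. Codes and assignments are hypothetical.
CODE_ATTRS = {
    "DX_A": Attr.MCC,
    "DX_B": Attr.CC,
    "DX_C": Attr.NONE,
    "PX_A": Attr.OR_PROC,
    "PX_B": Attr.NONE,
}


def lookup(code):
    """Step 1: fetch the attributes of a single code."""
    return CODE_ATTRS.get(code, Attr.NONE)


def stay_attributes(codes):
    """Step 2: OR the code-level attributes into stay-level attributes."""
    return reduce(or_, (lookup(c) for c in codes), Attr.NONE)


def assign_group(attrs):
    """Step 3: apply logic rules to the stay-level attributes."""
    kind = "surgical" if attrs & Attr.OR_PROC else "medical"
    if attrs & Attr.MCC:
        return kind + " with MCC"
    if attrs & Attr.CC:
        return kind + " with CC"
    return kind + " without CC/MCC"


print(assign_group(stay_attributes(["DX_B", "DX_C", "PX_A"])))
```

The point of the split is that step 1 is the only part that knows about individual codes; steps 2 and 3 see nothing but attributes.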

With ICD-9, the first step exploited the special hierarchical and numerical structure of ICD-9-CM codes to find each code and load its attributes, expressed as a sequence of individual bits. ICD-10 codes are no longer merely numeric, nor are the procedure codes hierarchical, so we must use more conventional techniques – what computer scientists call B+ trees. However, we can take advantage of the fact that there are only about 1,600 different patterns of attributes spread across 140,000 codes, and that some decisions are more efficiently based on code “clusters” than on individual codes.
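
The space saving from shared attribute patterns can be sketched as follows. This is an assumption-laden toy in Python: the codes and bit values are invented, and a binary search over a sorted list stands in for the B+ tree (a swapped-in technique chosen for brevity, not what the grouper itself uses). The idea is that each code stores only a small pattern number, and each distinct pattern of attribute bits is stored once.

```python
from bisect import bisect_left


def build_tables(code_attrs):
    """Split {code: attribute bits} into three compact tables.

    Returns (sorted_codes, pattern_ids, patterns), where patterns holds
    each distinct bit pattern once and pattern_ids[i] is the index of
    the pattern belonging to sorted_codes[i].
    """
    patterns = []
    pattern_index = {}
    sorted_codes = sorted(code_attrs)
    pattern_ids = []
    for code in sorted_codes:
        bits = code_attrs[code]
        if bits not in pattern_index:
            pattern_index[bits] = len(patterns)
            patterns.append(bits)
        pattern_ids.append(pattern_index[bits])
    return sorted_codes, pattern_ids, patterns


def find_attrs(code, sorted_codes, pattern_ids, patterns):
    """Binary-search the sorted codes; return attribute bits or None."""
    i = bisect_left(sorted_codes, code)
    if i < len(sorted_codes) and sorted_codes[i] == code:
        return patterns[pattern_ids[i]]
    return None


# Six hypothetical codes that share only three attribute patterns.
demo = {
    "C001": 0b0101, "C002": 0b0101, "C003": 0b0010,
    "C004": 0b0000, "C005": 0b0010, "C006": 0b0101,
}
codes, ids, pats = build_tables(demo)
print(len(codes), "codes,", len(pats), "patterns")
```

Scaled up, the same layout means roughly 140,000 small pattern numbers plus about 1,600 full attribute patterns, rather than 140,000 full patterns.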

The second step, combining the attributes, can be done, as before, with the Boolean operators computers love so well.

The logic rules were expressed in the ICD-9 groupers as a table of attribute “masks” – variously called the DRG table, the logic table or the hierarchy table. While this approach was super space-efficient on the old mainframes, it sacrificed speed, readability and the option of embedding other useful information (like how best to represent the logic in the DRG Definitions Manual). In the new ICD-10 MS-DRG grouper, the logic is expressed as a set of IF-THEN rules based on the attributes. CMS and its contractors enter the grouper specifications as they always have, but now (insert Twilight Zone theme music here) a computer program writes a computer program to do the decision making.
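
To make the contrast concrete, here is a toy Python sketch of both styles, built from one hypothetical specification. The attribute names, bit values, rule list and labels are invented for the illustration, and the generator is far simpler than anything a production grouper would need. The first function walks a table of masks at run time; the second is produced by a program that writes out IF-THEN source code, which is then compiled and called.

```python
# Hypothetical attribute bits and rule specification (first match wins).
ATTR_BITS = {"OR_PROC": 1, "MCC": 2, "CC": 4}

RULES = [
    (["OR_PROC", "MCC"], "surgical with MCC"),
    (["OR_PROC", "CC"], "surgical with CC"),
    (["OR_PROC"], "surgical without CC/MCC"),
    (["MCC"], "medical with MCC"),
    ([], "medical, other"),
]


def mask_of(names):
    """Combine named attributes into a single bit mask."""
    mask = 0
    for name in names:
        mask |= ATTR_BITS[name]
    return mask


def group_by_mask_table(attrs):
    """Table style: scan the rows until every bit of a mask is present."""
    for names, label in RULES:
        mask = mask_of(names)
        if (attrs & mask) == mask:
            return label
    return None


def generate_source(rules, func_name="group_generated"):
    """Generator style: emit Python source with one IF per rule."""
    lines = [f"def {func_name}(attrs):"]
    for names, label in rules:
        cond = " and ".join(f"attrs & {ATTR_BITS[n]}" for n in names) or "True"
        lines.append(f"    if {cond}:  # {' + '.join(names) or 'default'}")
        lines.append(f"        return {label!r}")
    lines.append("    return None")
    return "\n".join(lines) + "\n"


source = generate_source(RULES)
print(source)

# Compile the generated text and pull out the resulting function.
namespace = {}
exec(source, namespace)
group_generated = namespace["group_generated"]
```

Because the generated text is ordinary source code, it can carry comments and other annotations alongside the logic, which is the kind of extra information a bare mask table has no room for.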

Though it has to handle eight times as many codes as the ICD-9 version, the ICD-10 MS-DRGv28 grouper used in the study cited above has tables only three times larger, and runs twice as fast. A PC version is available to the public through the National Technical Information Service. I wouldn’t be surprised to see one on your cell phone before long.

Ron Mills is a Software Architect for the Clinical & Economic Research department of 3M Health Information Systems.