ICD-10: False Friends

December 7th, 2010 / By Ron Mills, PhD

We were asked: is there an easy way for the computer to tell apart the codes in the four different code sets (ICD-9-CM diagnoses and procedures, ICD-10-CM diagnoses, ICD-10-PCS procedures)? Presumably the asker wanted to store codes in common fields (perhaps a data warehouse record) but wanted a reliable way to figure out what she had after the fact.

The answer depends on whether your codes have the decimal point in them that coders commonly provide, but that computer systems usually omit. The conventions are:

ICD-9-CM diagnoses start with a digit or E or V, and are three to five characters long, not counting the dot. Except for codes beginning with E, a dot may go after the third character if the code has more than three characters. For ICD-9-CM E codes, the dot may go after the fourth character if the code has more than four characters.

ICD-9-CM procedures start with a digit and are two to four characters long. A dot may go after the second digit if the code has more than two digits.

ICD-10-CM diagnoses start with a letter and are three to seven characters long. A dot may go after the third character if the code has more than three characters.

ICD-10-PCS procedures are always seven characters long and allow no dots.

As you can see, it is easy to distinguish between the two coding systems for procedures, with or without decimal points. For diagnoses the situation is trickier: what about the E and V codes? We ran a little analysis and found no overlap among the V codes. All ICD-10-CM codes beginning with the letter V currently have a seventh-character extension, specifying encounter, so they are distinguishable from ICD-9-CM V codes, which never exceed five characters in length.

But we found forty ICD-9-CM E codes that, when decimal points are not used, are identical to ICD-10-CM codes. For example:

ICD-9-CM E896 means Accident caused by controlled fire in other and unspecified building or structure

ICD-10-CM E896 means Postprocedural adrenocortical (-medullary) hypofunction

Note that if dots were being used, the ICD-9-CM code would remain E896, but the ICD-10-CM would become E89.6.
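The dotted conventions listed earlier can be sketched as format-only patterns. This is a minimal illustration, not a validator: the name `candidate_code_sets` and the exact regular expressions are our assumptions, codes are assumed to be upper-case with no surrounding whitespace, and nothing here consults the actual code tables. A result with more than one entry means that format alone cannot decide.

```python
import re

# Format-only patterns for codes written WITH decimal points.
# Assumption: these regexes are one reading of the conventions above;
# they check shape only, not whether a code really exists.
DOTTED_PATTERNS = {
    # digit or V: dot after the third character; E: dot after the fourth
    "ICD-9-CM diagnosis": re.compile(
        r"(?:\d{3}|V\d{2})(?:\.\d{1,2})?|E\d{3}(?:\.\d)?"
    ),
    # two to four digits, dot after the second
    "ICD-9-CM procedure": re.compile(r"\d{2}(?:\.\d{1,2})?"),
    # leading letter, three to seven characters, dot after the third
    "ICD-10-CM diagnosis": re.compile(r"[A-Z][0-9A-Z]{2}(?:\.[0-9A-Z]{1,4})?"),
    # always seven characters, no dot, never the letters I or O
    "ICD-10-PCS procedure": re.compile(r"[0-9A-HJ-NP-Z]{7}"),
}


def candidate_code_sets(code):
    """Return the names of every code set whose dotted format fits `code`."""
    return [
        name
        for name, pattern in DOTTED_PATTERNS.items()
        if pattern.fullmatch(code)
    ]


print(candidate_code_sets("E896"))   # ['ICD-9-CM diagnosis']
print(candidate_code_sets("E89.6"))  # ['ICD-10-CM diagnosis']
```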

Warning: this analysis was performed only on the set of about 14,000 ICD-9-CM diagnoses and 69,000 ICD-10-CM diagnoses that may be coded on patient records – the shorter “categories” and “sub-categories” in the diagnosis Tabular hierarchies (which often appear in documents and/or older data) were not included. Had they been, the overlap counts would have been higher.
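An overlap check of this kind amounts to a set intersection over the codes with their dots removed. The sketch below is not the analysis we actually ran; the helper names are hypothetical and the two short lists are small illustrative samples, where a real run would load the full code tables.

```python
def undotted(code):
    """Normalize a code to the dot-free form most computer systems store."""
    return code.replace(".", "").strip().upper()


def undotted_overlap(codes_a, codes_b):
    """Return the codes that become identical once the dots are removed."""
    return {undotted(c) for c in codes_a} & {undotted(c) for c in codes_b}


# Tiny illustrative samples (assumption: stand-ins for the full code tables).
icd9_sample = ["E896", "V70.0", "250.00"]
icd10_sample = ["E89.6", "V43.52XA", "E11.9"]

print(sorted(undotted_overlap(icd9_sample, icd10_sample)))  # ['E896']
```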

ICD-9-CM diagnoses and procedures could never be distinguished from each other without decimal points. For ICD-10, there are currently no overlaps between diagnoses (ICD-10-CM) and procedures (ICD-10-PCS), since there are so far no seven-character extensions for diagnoses before those beginning with M, and no PCS codes beginning with a character higher than H. However, the rules for code formation do allow such an overlap to occur.

Bottom line: if decimal points are always used, you can tell the four sets apart. Otherwise, no. In French class, they used to warn us about faux amis (false friends) – French words that look like English words (attention!) but mean something different.

Rather than rely on the dots (and waste an extra space for every code), we came up with the following scheme when we wanted to store both ICD-9 and ICD-10 in the same places (for example, in the tables that drive our dual-code MS-DRG grouper). Since ICD-10-CM diagnoses always start with a letter, you can put a “9” in front of an ICD-9-CM diagnosis and it will never overlap with an ICD-10-CM diagnosis. Since ICD-10-PCS never uses the letters “I” or “O”, you can put an “I” in front of an ICD-9-CM procedure and it will never overlap with an ICD-10-PCS procedure. Both of these schemes leave an extra space (since you have seven characters to work with, and ICD-9 never exceeds five). So for readability, we used that extra character too, took out all dots, and adopted the following scheme:

  • If a diagnosis code begins with “9$” then the remaining five characters are an ICD-9-CM diagnosis. Otherwise the entire field is an ICD-10-CM diagnosis.
  • If a procedure code begins with “I$” then the next four characters are an ICD-9-CM procedure. Otherwise the entire field is an ICD-10-PCS procedure.
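The two rules above can be sketched as a pair of store/read helpers. This is an illustration only, not the code in our grouper: the function names are hypothetical, and padding short codes with trailing spaces to fill the seven-character field is an assumption.

```python
FIELD_WIDTH = 7  # widest code in any of the four sets


def store_diagnosis(code, is_icd9):
    """Remove dots, tag ICD-9-CM diagnoses with '9$', pad to the field width."""
    bare = code.replace(".", "")
    return ("9$" + bare if is_icd9 else bare).ljust(FIELD_WIDTH)


def store_procedure(code, is_icd9):
    """Remove dots, tag ICD-9-CM procedures with 'I$', pad to the field width."""
    bare = code.replace(".", "")
    return ("I$" + bare if is_icd9 else bare).ljust(FIELD_WIDTH)


def read_diagnosis(field):
    """Return (code set, code) for a stored diagnosis field."""
    if field.startswith("9$"):
        return ("ICD-9-CM", field[2:].rstrip())
    return ("ICD-10-CM", field.rstrip())


def read_procedure(field):
    """Return (code set, code) for a stored procedure field."""
    if field.startswith("I$"):
        return ("ICD-9-CM", field[2:6].rstrip())
    return ("ICD-10-PCS", field.rstrip())


# The two false friends from the example no longer collide once stored.
print(read_diagnosis(store_diagnosis("E896", is_icd9=True)))    # ('ICD-9-CM', 'E896')
print(read_diagnosis(store_diagnosis("E89.6", is_icd9=False)))  # ('ICD-10-CM', 'E896')
```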

The ICD-9-CM codes are required to wear a name-tag. No need to worry about false friends.

Ron Mills is a Software Architect for the Clinical & Economic Research department of 3M Health Information Systems.