Tuesday, July 9, 2013

Out of Body Experience - Anatomical Information in Images

Summary: Anatomical information is sometimes hard to come by in images, but it's not as bad as you might expect.

Long Version.

Information about the anatomic region included in a set of images is useful for a number of obvious reasons.

First and foremost, whether the user be an imaging specialist, a clinician who performs their own imaging, a referring practitioner who has requested imaging or is interested in procedures already performed, or a radiographer/technologist about to begin a new procedure, if one is browsing through a patient's record trying to find the "right" image(s) to answer some clinical question, anatomy, together with modality and approximate date, is useful.

A related use case, and one which is largely behind the scenes but impacts the quality of the user experience, is to pre-fetch images for any of the first set of use cases, and as we discussed last time, pre-fetching is back in vogue for one reason or another.

Hanging protocols are another application, particularly for longitudinal comparison of complex procedures that involve multiple parts, e.g. skeletal surveys.

So where does the anatomical information come from, in terms of who populates it and in which data elements?

In an ideal world, the anatomy would be implicit in a standard procedure code that was supplied in the request from the order entry system, which might be refined somewhat during the "protocolling" step in the RIS, then fed to the modality via the modality worklist, amended by the operator if they need to perform something other than what was requested, and then recorded in the images and the performed procedure step, and included in the reports. This procedure code, being standard, would have a standard mapping to its related concepts, i.e., what the general anatomic region was, and what the anatomic focus was.

Though such standard procedure codes do exist, in SNOMED, LOINC and more recently in RadLex (which has recently been extended to include CR/DX and NM), they aren't widely used. Indeed, as far as I can tell, they aren't used at all yet. In over a decade of performing international multi-center cancer clinical trials in my last job at RadPharm/CoreLab Partners/BioClinica, I never saw a standard value in the Procedure Code Sequence data element of any image (with the occasional exception of US CPT-4 codes, which though arguably "standard" are billing not ordering codes). Most often there was nothing there, or sometimes illegal empty values or garbage dummy values. If anything was present, it was a private or local code.

That said, there does seem to be reliable standard anatomic information in the image headers a large proportion of the time.

The history of this begins with the original DICOM standard released in 1993. Prior to that time, there were no data elements defined for describing the anatomy in the ACR-NEMA standards (of 1985 and 1988). DICOM introduced the Body Part Examined data element at the Series level, primarily for use with projection radiography (CR at the time). The original list was relatively short, 19 defined terms: ABDOMEN, ANKLE, BREAST, CHEST, CLAVICLE, COCCYX, CSPINE, ELBOW, EXTREMITY, FOOT, HAND, HIP, KNEE, LSPINE, PELVIS, SHOULDER, SKULL, SSPINE, TSPINE. Being defined terms, vendors (and users) are permitted to extend this list, as long as they don't duplicate the meaning of an existing term; for example, Fuji CR describes in its conformance statements also sending HEAD, NECK, LOW_EXM, UP_EXM and TEST.

How did CR modalities obtain a value to populate this data element? Simple, they asked the operator. In the case of Fuji CR, the image processing and parameters applied to make an interpretable image are body part specific, and so the operator selection serves multiple purposes, applying the right processing and populating the DICOM data element. Over time, more general image processing algorithms have evolved that may not require anatomical information, but as X-Ray generators and tubes have become integrated, the body part specific selection of X-Ray technique factors provides another source of this information.

The Digital X-Ray object, introduced in 1998, both to support digital detectors and to improve upon the CR object in DICOM, went one step further and "coded" the anatomy more formally. I.e., rather than using a single string value, a triplet of coding scheme (e.g., SRT for SNOMED), code value (e.g., T-04000) and code meaning (e.g., "Breast") was used in a data element called Anatomic Region Sequence. A list of SNOMED codes for useful anatomic regions was provided, longer this time, 73 if I have counted those listed in Supplement 32 correctly. Included was a mapping from the "older" Body Part Examined string values to the new SNOMED codes, the list of standard values having grown slightly in the interim.
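To make the triplet idea concrete, here is a minimal Python sketch of a coded concept and a lookup from the old string value to the new code. It is an illustration only: the class and function names are my own invention, and the table holds just the two pairings mentioned in this post, not the full mapping from the standard.

```python
from typing import NamedTuple, Optional


class CodedConcept(NamedTuple):
    """A coded triplet: code value, coding scheme designator, code meaning."""
    code_value: str
    coding_scheme: str
    code_meaning: str


# Illustrative subset only: the two string/code pairings quoted in this post.
# The mapping in the standard is far longer.
BODY_PART_TO_ANATOMIC_REGION = {
    "BREAST": CodedConcept("T-04000", "SRT", "Breast"),
    "CHEST": CodedConcept("T-D3000", "SRT", "Chest"),
}


def anatomic_region_for(body_part_examined: str) -> Optional[CodedConcept]:
    """Return the coded equivalent of a Body Part Examined string, if known."""
    # Surrounding spaces are treated as padding and ignored.
    return BODY_PART_TO_ANATOMIC_REGION.get(body_part_examined.strip())


print(anatomic_region_for("CHEST"))
print(anatomic_region_for("FOREARM"))  # not in this small table
```

A value that is not in the table simply yields None, which is the honest answer for a non-standard string.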

Some of these new codes remained at the same general level of specificity as the historical Body Part Examined values, e.g., (T-D3000, SRT, "Chest") and CHEST. Others were very specific and for particular uses of radiography, such as to support particular views (e.g., (T-61300, SRT, "Submandibular Gland") to describe submandibular sialograms); others were specialty-specific (i.e., support was added for not only general radiography, but also mammography and dentistry). As an aside, a much richer description of the projection or view was also added, including codes for eponymous views (such as (R-102AE, SRT, "Waters"), etc.). The approach used at the time was to go through the classic projection radiography textbooks, enumerate all documented techniques, describe their anatomy and other dimensions, and add data elements and coded values for each, and then iterate with radiologists and applications specialists to assure comprehensive coverage. Some implementers expressed skepticism about burdening the console/QC station/plate reader operator, but with education about the possibility of using integrated generator/gantry information to capture the data, and the need to orient the image correctly and document its orientation, progress was made. I used to preach about this in my RSNA Refresher Course on Digital Radiography.

Over the years, all subsequent new DICOM image objects have been defined to use Anatomic Region Sequence, but Body Part Examined remains popular, and has been retrofitted with standard string values for a broad range of purposes, and the list now contains 112 standard values (including values such as GALLBLADDER and SUBMANDIBULAR). This has been done largely in recognition of the fact that the CR object has not gone away (despite the DX object being superior in every way, though I am not biased at all). Sadly, many PACS and viewers are still too dumb to handle coded triplets for display or switching. To be fair, if a PACS or viewer is going to allow the user or site to customize behavior based on some of these values, it is easier to develop a configuration user interface that allows them to enter plain text strings to match, rather than force them to think about codes or choose from a pre-populated drop-down list of SNOMED codes (that may not be up to date).

The list of body parts and anatomic region codes has been extended to cover the cross-sectional modalities too. In the early days, there was absolutely no indication of body part in CT and MR images. The standard described the use of Body Part Examined in the General Series module, so it was available, but you may recall that there was nowhere in the user interface on the console to enter it. There was no cutesy little homunculus to point and click to select the protocol, in which the anatomy was implicit. Before the days of modality worklist, there was no place to copy it from (not to say that anatomy is explicit in MWL either, but it can be derived from the Requested Procedure Code, or Scheduled Protocol Code Sequence, or nowadays the Protocol Context Sequence). Indeed, there were no standard protocols and one had to select (or type in) all the technique parameters individually every time. The best one could hope for was something meaningful in Study Description (more on that later).

CT and MR operators nowadays have it pretty easy by comparison, and as vendors have made the user interface more automated and graphical and intelligent, more information has become available for re-use. Many contemporary CT and MR modalities are indeed populating Body Part Examined and/or Anatomic Region Sequence, using values derived from operator protocol selection (and in some cases IHE Assisted Acquisition Protocol Setting).

Ultrasound is a tricky modality, being so operator-dependent in terms of positioning, as well as requiring discipline in terms of selecting from the user interface a description of each captured image. After an abortive attempt in the original DICOM standard to define encoding of ultrasound images, which included stuffing body part information into a value of the Image Type data element, a much cleaner Ultrasound IOD was quickly released, in Supplement 5. It was one of the first to use the Anatomic Region Sequence with codes, as described earlier, thanks to the influence of Dean Bidgood. Unfortunately, it seems that very few, if any, ultrasound devices actually provide a means for the user to populate this attribute. Nor is Body Part Examined populated in ultrasound as far as I can tell.

Which brings us back to the question of reality. What does one actually see in real world image objects received from various sites? Are these Body Part Examined and/or Anatomic Region Sequence being populated? Do they contain standard values or non-standard strings or codes? Even if they are populated, are they correct and reliable?

The bottom line seems to be that in this day and age, for many modalities, they are often being populated, and if populated they are much more often using standard rather than non-standard values, and appear to be reliable when populated. This may be contrary to some people's beliefs or observations, but I can only report my own experience in this respect. As I mentioned before, in my former cancer clinical trials life, I had the opportunity to monitor images from literally thousands of sites around the globe, for most modalities (ultrasound being a major exception), from all vendors and vintages of machine. I can't report exact figures, but on several occasions in the past I examined what we were receiving to ascertain the feasibility of various efforts to improve the workflow of comparing successive time points, for both projection radiography and nuclear medicine bone scans as well as cross-sectional modalities.

In general, for projection radiography with CR, Body Part Examined is populated with a standard value about 75% of the time, is empty or absent about 10%, and contains a non-standard value about 15% of the time. Spot checks on individual images showed that the value sent is rarely incorrect.

This is surprisingly good for CR perhaps, which one might expect to be the least reliable, given the ease with which some vendors allow their sites to customize what can be put in there. If one inspects what non-standard customized values are being sent, they fall into several categories:
  • local language equivalents, e.g., BASSIN rather than PELVIS, BRUSTKORB rather than CHEST
  • extensions that include the view too, e.g. CHEST_PA
  • reasonable values that we should probably add to the standard list, e.g., FOREARM
  • incorrectly spelled equivalents, e.g. "L SPINE" with a space or "L_SPINE" with an underscore, instead of the standard "LSPINE"
  • incorrectly capitalized equivalents, e.g., "Chest" instead of "CHEST"
  • literal copies (sometimes capitalized) of some procedure or billing code, e.g., "CHEST 1 VIEW" or "XR ACUTE ABDOMEN W/PA CXR"
Not infrequently, non-standard values are not only non-standard, they are illegal. The CS (Code String) value representation does not permit lowercase or most special characters or accents, for example, and is limited in length to 16 bytes.
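The categories above suggest a simple triage one could run over received values. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the standard list is just the original 1993 set quoted earlier (not the current list), and the legality check assumes a CS value may contain only uppercase letters, digits, space and underscore, up to 16 bytes.

```python
import re

# Assumption: only the original 1993 terms quoted earlier in this post;
# a real check would use the current, much longer standard list.
STANDARD_BODY_PARTS = {
    "ABDOMEN", "ANKLE", "BREAST", "CHEST", "CLAVICLE", "COCCYX", "CSPINE",
    "ELBOW", "EXTREMITY", "FOOT", "HAND", "HIP", "KNEE", "LSPINE", "PELVIS",
    "SHOULDER", "SKULL", "SSPINE", "TSPINE",
}

# Assumed CS rules: uppercase letters, digits, space, underscore; 1 to 16 bytes.
CS_PATTERN = re.compile(r"[A-Z0-9_ ]{1,16}")


def classify_body_part(value):
    """Return 'absent', 'standard', 'non-standard' or 'illegal'."""
    if value is None or not value.strip():
        return "absent"
    stripped = value.strip()  # padding spaces are not significant
    if stripped in STANDARD_BODY_PARTS:
        return "standard"
    if CS_PATTERN.fullmatch(stripped) is None:
        return "illegal"
    return "non-standard"


def normalize_body_part(value):
    """Repair case and stray space/underscore; None if still unrecognized."""
    candidate = re.sub(r"[ _]", "", value.strip().upper())
    return candidate if candidate in STANDARD_BODY_PARTS else None


for raw in ["CHEST", "Chest", "L SPINE", "L_SPINE", "CHEST_PA", "BASSIN",
            "XR ACUTE ABDOMEN W/PA CXR", ""]:
    print(repr(raw), classify_body_part(raw), normalize_body_part(raw))
```

This cheap normalization recovers only the misspelled and miscapitalized cases; local language equivalents, view suffixes and copied procedure descriptions need a lookup table or parsing instead.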

I can see why non-English-speaking sites are tempted to replace all the codes with local language equivalents, since the literally encoded value may be displayed in some modality and PACS user interfaces, or at least in some configuration screens, such as for hanging protocols. But they really shouldn't, since the standard values are supposed to be used regardless of the locale, and the user interface should perform the translation. This is just a bad, though understandable, practice.

One of the strengths of using Anatomic Region Sequence instead of Body Part Examined is that it is local language independent and one can send, and recognize, the same code value, regardless of the code meaning. I.e., one can send (T-D3000, SRT, "Chest") or (T-D3000, SRT, "Thorax") or (T-D3000, SRT, "Tórax") or (T-D3000, SRT, "Brystet") or (T-D3000, SRT, "胸郭") and they all mean the same thing. The idea is that hanging protocols, routers, pre-fetchers or just ordinary human readable browsers should recognize the code (T-D3000, SRT) and render to the user whatever is the locale appropriate string. The code meaning encoded in the message is only there as a fall back in case the code is unrecognized (and indeed it used to be optional in DICOM when coded tuples were first introduced). That is the theory; unfortunately, the lowest common denominator in localization of PACS and viewing applications is probably not up to substituting code meanings yet, probably as a result of users having higher priorities than localization (or their requirements not being taken seriously by the vendors).
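As a rough illustration of what "recognize the code and render the locale appropriate string" could look like, here is a small Python sketch. The table, the function name and the locale keys are assumptions of mine (the post does not say which language each example string belongs to); the only behavior taken from the text is that the received code meaning is used as the fall back.

```python
# Hypothetical translation table keyed by (code value, coding scheme).
# The locale tags attached to each string are my own guesses.
LOCALIZED_MEANINGS = {
    ("T-D3000", "SRT"): {
        "en": "Chest",
        "es": "Tórax",
        "da": "Brystet",
        "ja": "胸郭",
    },
}


def display_name(code_value, coding_scheme, code_meaning, locale):
    """Prefer the local translation; fall back to the received code meaning."""
    translations = LOCALIZED_MEANINGS.get((code_value, coding_scheme), {})
    return translations.get(locale, code_meaning)


# Whatever meaning the sender chose, a Spanish-locale viewer shows the same string.
print(display_name("T-D3000", "SRT", "Thorax", "es"))
print(display_name("T-D3000", "SRT", "Brystet", "es"))
# An unrecognized code falls back to the meaning carried in the message.
print(display_name("X-0000", "99LOCAL", "Something local", "es"))
```

Matching and routing logic would key on the (code value, coding scheme) pair in the same way and never look at the meaning string at all.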

For cross-sectional modalities, given their history, I was expecting a lot worse than I actually observed. For CT, for example, about 60% of the time there is no value sent. No surprise there, but it could be much worse, and this is a sign of improvement. About 35% of the time there is a standard value, and about 5% of the time there is a non-standard value. For MR one sees values much less frequently; roughly 85% of the time there is no value, 10% a standard value, and 5% a non-standard value. For PET though, neither Body Part Examined nor Anatomic Region Sequence is ever sent, which is pretty lame (how hard is it to send the code for "whole body" anyway?).

Nuclear medicine is a mess. Like the ultrasound objects, the NM objects were revised early and redefined to include Anatomic Region Sequence. One standard value one sees fairly often is ("T-11000", "SRT", "Skeletal") for whole body bone scans, not surprising in an oncology practice. For historical reasons, the coding scheme may be "99SDM" or "SNM3" rather than "SRT", the price NM pays for being an early adopter of coded tuples. That said, one also sees a lot of private codes from one particular vendor, who sends "99NMG" for the coding scheme, and then sends codes that include not only the anatomy but also the view, which is the wrong thing to do since there is a separate coded data element for that.
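A receiver that wants to match on these codes has to cope with the historical designators. The sketch below assumes that the same code value sent with "99SDM", "SNM3" or "SRT" denotes the same concept, which is how I read the paragraph above; the function names are hypothetical, and private schemes such as "99NMG" are deliberately left alone.

```python
# Assumption: these three designators carry the same code values,
# so they can be folded onto one canonical designator for comparison.
EQUIVALENT_SCHEMES = {"SRT": "SRT", "SNM3": "SRT", "99SDM": "SRT"}


def canonical_code(code_value, coding_scheme):
    """Return (code value, scheme) with historical designators folded to SRT."""
    scheme = coding_scheme.strip()
    return code_value.strip(), EQUIVALENT_SCHEMES.get(scheme, scheme)


def same_concept(a, b):
    """Compare two (code value, coding scheme) pairs after canonicalization."""
    return canonical_code(*a) == canonical_code(*b)


print(same_concept(("T-11000", "SNM3"), ("T-11000", "SRT")))
print(same_concept(("T-11000", "99NMG"), ("T-11000", "SRT")))
```

Private codes that bundle anatomy and view together cannot be handled this way; they need a vendor-specific table that splits them into the two separate concepts.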

Interestingly, I do not see very many combined body parts showing up, apart from TLSPINE. This is probably a consequence of the fact that Body Part Examined is a Series level attribute (and Anatomic Region Sequence is image level). In other words, two different Series in a single Study may have different values for these attributes. This is important to account for if one wants to come up with a single anatomic descriptor of the entire procedure, so a system may need to have the ability to detect and combine these. DICOM defines a bunch of these combined parts, and adds more as they are conceived of (for example, I recently realized we don't have a good value for aortic arch plus carotids plus circle of Willis for MRAs). There is a trivial example of how to do this using the available combinations defined in DICOM in com.pixelmed.anatproc.CombinedAnatomicConcepts in my PixelMed toolkit if you are interested; i.e., one doesn't need the complete SNOMED ontology to recognize the relationships, only a tiny subset of it (more on that in a later blog post perhaps).
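The detect-and-combine step can be sketched in a few lines of Python. To be clear, this is not the PixelMed implementation: the table and function name are hypothetical, and the single entry assumes that TLSPINE is the combined value for TSPINE plus LSPINE.

```python
# Hypothetical table of known combinations: set of parts -> combined value.
# Assumption: TLSPINE stands for thoracic plus lumbar spine.
COMBINED_PARTS = {
    frozenset({"TSPINE", "LSPINE"}): "TLSPINE",
}


def study_level_body_part(series_body_parts):
    """Collapse per-series values into one study-level value, or None."""
    distinct = frozenset(
        part.strip() for part in series_body_parts if part and part.strip()
    )
    if not distinct:
        return None  # nothing was populated in any series
    if len(distinct) == 1:
        return next(iter(distinct))  # every series agrees
    # Several different parts: only succeed if a combined value is known.
    return COMBINED_PARTS.get(distinct)


print(study_level_body_part(["TSPINE", "LSPINE", "TSPINE"]))
print(study_level_body_part(["CHEST", "CHEST"]))
print(study_level_body_part(["CHEST", "KNEE"]))
```

Returning None for an unknown combination, rather than guessing, leaves room for the caller to fall back to some other source of anatomy.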

On the subject of tools as well as limited structured anatomical information, I cannot finish without mentioning Study Description and its ilk, Series Description and maybe even Protocol Name. Worse even than non-standard string values in Body Part Examined, these descriptive data elements can contain anything at all. Indeed that was their intent, to be a human readable description, and not something that was machine recognized. Originally, the modality operator typed in free text values, and often they still have that flexibility, or at least the ability to edit what is pre-populated by protocol selection. Sadly, given that Study Description and Series Description are the most frequently populated data elements in practice, though they are incredibly useful for human browsing, it has become commonplace to try to match or parse their content to dictate downstream behavior, such as for hanging protocol selection or matching.

Anyhow, given a site-specific set of such description data element values, one can either parse them and try to find anatomic words or phrases, in order to be adaptable to local variations, or one can just do a straight match on the entire string. In order to better support some of my use cases, particularly extracting anatomy for radiation dose extraction projects, I spent a while working on the parsing descriptions problem, with some success. You can find in the com.pixelmed.anatproc package a bunch of attempts to do this, both for cross-sectional and projection radiography, as well as for multiple languages. By comparison, you might want to look at the RadMapps approach, which just does a straight out full string mapping, which requires one to build a mapping once for each site's list of descriptions, and then maintain it as they evolve. This is the approach being used for the ACR's Dose Index Registry, for example, where they only have to cover a small subset of all possible procedures. In these approaches, there is some blurring between purely anatomical information and other interesting things one might want to also extract, like why the procedure is being performed or the particular manner in which it is being performed (such as being a CT angiogram or being thin slice, etc.), but the anatomy is a key part of the process. For some use cases it may not even be necessary to extract the anatomy separately, since the goal may be to map to a particular standard procedure code.
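The difference between the two strategies is easy to see side by side. This Python sketch is a toy under stated assumptions: it reflects neither the com.pixelmed.anatproc code nor RadMapps, both tables are made up for illustration (reusing strings quoted earlier in this post), and the function names are hypothetical.

```python
import re

# Hypothetical word list for the parsing strategy, including a couple of
# local language words and one abbreviation.
ANATOMY_WORDS = {
    "CHEST": "CHEST", "THORAX": "CHEST", "BRUSTKORB": "CHEST", "CXR": "CHEST",
    "PELVIS": "PELVIS", "BASSIN": "PELVIS",
    "ABDOMEN": "ABDOMEN",
}

# Hypothetical per-site table for the full string strategy.
FULL_STRING_MAP = {
    "CHEST 1 VIEW": "CHEST",
}


def match_description(description):
    """Full string strategy: the entire description must be in the site table."""
    return FULL_STRING_MAP.get(description.strip().upper())


def parse_description(description):
    """Parsing strategy: collect every anatomic word found, in order, no repeats."""
    found = []
    # [^\W\d_]+ matches runs of letters, including accented ones.
    for token in re.findall(r"[^\W\d_]+", description.upper()):
        part = ANATOMY_WORDS.get(token)
        if part is not None and part not in found:
            found.append(part)
    return found


for text in ["Chest 1 View", "CHEST 2 VIEWS", "XR ACUTE ABDOMEN W/PA CXR"]:
    print(text, "->", match_description(text), parse_description(text))
```

The full string table gives an exact answer but misses anything it has not seen before ("CHEST 2 VIEWS"), whereas the parser adapts to new variations at the cost of needing rules for what to do when several parts are found.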

Indeed, one might suspect that the primary reason for the popularity of VNAs and the dreaded "dynamic tag morphing" is to deal with the impedance mismatch between the way different vendors and sites have their modalities populating Study and Series Description and the limited configurability of some PACS hanging protocols that depend on these. Of course, I hate to say it, but the "dynamic tag morpher" is probably a good tool to do the extraction or matching of descriptive attributes to populate structured attributes with standard codes for procedures and anatomy, if it has the sophistication required; i.e., use it not just to "clean up" descriptive attributes, but to augment the header with codes extracted from them. Better of course would be to get it right "first", i.e., off the modality or fixed during ingestion, and for everyone to use the same standard codes as the interoperable set, rather than have to "dynamically" coerce the values to match varied expectations of the recipients.

The bottom line is that reliable anatomical information is almost certainly available somewhere, if you want to go to the trouble to extract it, in decreasing order of desirability, increasing order of difficulty, and increasing order of likelihood of availability, from:
  • implicit in a standard Procedure Code Sequence value, supplied by the worklist and encoded in the header
  • in a standard Body Part Examined value or Anatomic Region Sequence code, extracted from the worklist procedure code, automatically or operator selected protocol, or operator selected dropdown
  • extracted by matching or parsing the Study and/or Series Description or Protocol Name data element string values
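The order of preference in the list above can be expressed as a simple fall back chain. In this Python sketch the header is a plain dictionary keyed by attribute keyword rather than a real DICOM parser, the procedure code table and the description parser are supplied by the caller, and every name is hypothetical. Trying Anatomic Region Sequence before Body Part Examined within the second tier is an arbitrary choice of mine.

```python
def find_anatomy(header, procedure_anatomy, parse_description):
    """Return (source attribute, anatomy) from the best available source, or None.

    header            -- dict of attribute keyword to value (sequences as lists of dicts)
    procedure_anatomy -- dict mapping (code value, scheme) to an anatomy string
    parse_description -- callable returning a list of anatomy strings for free text
    """
    # 1. Anatomy implied by a recognized procedure code.
    for item in header.get("ProcedureCodeSequence") or []:
        key = (item.get("CodeValue"), item.get("CodingSchemeDesignator"))
        if key in procedure_anatomy:
            return "ProcedureCodeSequence", procedure_anatomy[key]

    # 2. Explicit anatomy: coded region first, then the plain string.
    for item in header.get("AnatomicRegionSequence") or []:
        if item.get("CodeValue"):
            return "AnatomicRegionSequence", item["CodeValue"]
    body_part = (header.get("BodyPartExamined") or "").strip()
    if body_part:
        return "BodyPartExamined", body_part

    # 3. Last resort: look for anatomy in the free text descriptions.
    for name in ("StudyDescription", "SeriesDescription", "ProtocolName"):
        found = parse_description(header.get(name) or "")
        if found:
            return name, found[0]
    return None


# Toy parser and made-up local procedure code, for demonstration only.
toy_parser = lambda text: ["CHEST"] if "CHEST" in text.upper() else []
print(find_anatomy({"StudyDescription": "Chest 1 view"}, {}, toy_parser))
print(find_anatomy({"BodyPartExamined": "PELVIS"}, {}, toy_parser))
```

Reporting which attribute supplied the answer lets a downstream system decide how much to trust it.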
David

PS. Before someone asks, in DICOM, laterality is conveyed separately, encoded in either Laterality or Image Laterality (or in some cases Frame Laterality), and not pre-coordinated with (built in to) the Anatomic Region Sequence or Body Part Examined. The opposite is true for Procedure Code Sequence, which has no separate laterality modifier, and for which laterality needs to be pre-coordinated.

1 comment:

SOFA said...

Very educating and well written! Thanks for sharing your insight in this "exotic" matter.. The topic of anatomic relations of medical images seems to be asked for by everyone but used by few and customized by everyone (who actually uses it). The fact that many PACS environments today grow into VNA solutions does not make this any easier.. :)