J Thorac Cardiovasc Surg 2004;128:807-810
© 2004 The American Association for Thoracic Surgery
Statistics for the Rest of Us
a Section of Clinical Research, Department of Thoracic and Cardiovascular Surgery and Department of Biostatistics and Epidemiology, The Cleveland Clinic Foundation, Cleveland, Ohio, USA
Received for publication February 24, 2004; accepted for publication March 5, 2004.
* Address for reprints: Eugene H. Blackstone, MD, Department of Thoracic and Cardiovascular Surgery, The Cleveland Clinic Foundation, 9500 Euclid Avenue, Desk F24, Cleveland, OH 44195, USA
blackse{at}ccf.org
See related articles on pages 811, 820, 823, and 907.
If surgical performance (often measured by postoperative outcome of the initial hospital stay) is monitored at all, the most common means is risk-adjusted annual or semiannual audit. Observed occurrence of outcome measures (eg, in-hospital death and complications) as a proportion of cases performed is compared with expected performance using, for example, the Society of Thoracic Surgeons' regression equations1 or EuroSCORE,2 which account for many aspects of case mix. Sometimes observed (O) and expected (E) proportions are subtracted, sometimes divided (O/E ratio)3; sometimes confidence limits of these comparisons are provided, and occasionally P values are given.
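The observed-versus-expected comparison described above can be sketched in a few lines of Python. This is an illustration only: the function name, the sample outcomes, and the per-patient predicted risks are all hypothetical, and in practice the risks would come from a published model such as those cited above.

```python
def observed_vs_expected(outcomes, predicted_risks):
    """Compare the observed event proportion with the model-expected proportion.

    outcomes: 0/1 flags, 1 = adverse event (eg, in-hospital death).
    predicted_risks: per-patient event probabilities from a risk model
                     (hypothetical values here, not from any real equation).
    """
    if not outcomes or len(outcomes) != len(predicted_risks):
        raise ValueError("need equal-length, non-empty inputs")
    n = len(outcomes)
    observed = sum(outcomes) / n          # O: proportion of cases with the event
    expected = sum(predicted_risks) / n   # E: mean predicted risk (case-mix adjusted)
    if expected == 0:
        raise ValueError("expected proportion is zero; O/E is undefined")
    return {
        "observed": observed,
        "expected": expected,
        "difference": observed - expected,  # O - E
        "ratio": observed / expected,       # O/E
    }


# Made-up audit of 10 cases: 2 events observed, 1 expected by the model.
outcomes = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
risks = [0.02, 0.05, 0.20, 0.03, 0.04, 0.10, 0.02, 0.30, 0.04, 0.20]
result = observed_vs_expected(outcomes, risks)
print(f"O = {result['observed']:.2f}, E = {result['expected']:.2f}, "
      f"O - E = {result['difference']:.2f}, O/E = {result['ratio']:.2f}")
```

With so few cases, neither the difference nor the ratio says much on its own, which is why confidence limits usually accompany them.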
Is this periodic, widespread, but rather coarse monitoring of surgical performance sufficient?
CUSUMs: what are they and why should we care?
No sooner had this decision been made than (1) Gary Grunkemeier's tutorial on CUSUMs appeared in The Annals,6 and (2) we received a tutorial by Rogers and colleagues7 from Bristol. We determined that the latter would become the centerpiece of an educational package on monitoring surgical performance, along with invited commentaries from Tom Treasure and his group from Guy's Hospital and the Clinical Operational Research Unit, Department of Mathematics, University College London, and David Spiegelhalter from the Medical Research Council Biostatistics Unit, Cambridge.
We trust that this material, together with Grunkemeier's Annals presentation,6 will provide you, the reader, with a comprehensive idea of the "state of the art" in surgical performance monitoring. We have purposely retained controversy, even potentially inflammatory statements, because the field of quality monitoring in medicine, and even in industrial settings, is still evolving.
Some things to look for
History
Although Rogers and colleagues7 claim that Williams and colleagues8 first proposed using CUSUMs in a medical context, this is untrue. CUSUM techniques have been used quite effectively in medicine for at least 35 years, and control charts have been used for at least 50 years.9 Initially they were used mostly to monitor quality of clinical chemistry laboratory measurements,10,11 but in 1977 The New England Journal of Medicine published Herbert Wohl's article, "The CUSUM Plot: Its Utility in the Analysis of Clinical Data."12 Wohl illustrated use of CUSUM charts for detecting subtle body temperature changes in patients being treated for sepsis. I mention this article not just because of the prestigious journal in which it appeared, but to contrast continuous monitoring with that of single outcomes of discrete patients. Because temperature measurements can be recorded continuously (just as can the thickness of a rolled sheet of steel), sustained temporal trends can be detected quickly. But as Rogers and colleagues7 point out, CUSUM techniques may require years of patients in low-volume settings to detect performance problems measured as binary outcomes.
Although de Leval and colleagues13 are credited with introducing non-risk-adjusted CUSUM charts to cardiac surgeons, as pointed out by Treasure and colleagues in their commentary, two other ideas introduced in their report are often missed. First, they dealt with the problem, dismissed by Rogers and colleagues,7 that traditional CUSUM techniques have no memory loss. If continuous monitoring of a program is being suggested, how long is it necessary to remember and equally weight results of the past? de Leval and colleagues suggested using an exponential memory loss, called "exponentially weighted moving average" (EWMA) charts by Spiegelhalter in his commentary. Second, they introduced a form of risk adjustment that may still be valuable. They simply superimposed on their CUSUM of observed outcomes an expected CUSUM calculated from an external14 risk-adjustment equation (their Figure 4).13
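Both ideas can be sketched briefly in Python. This is a sketch under stated assumptions, not a reproduction of the method of de Leval and colleagues: the function names, the smoothing weight, and the starting value are hypothetical choices, and the text above does not specify the weighting they actually used.

```python
from itertools import accumulate


def cumulative_observed_and_expected(outcomes, predicted_risks):
    """Return two running totals that can be drawn on the same chart.

    outcomes: 0/1 flags per consecutive case (1 = adverse event).
    predicted_risks: per-case probabilities from an external risk model
                     (hypothetical values in this sketch).
    """
    cumulative_observed = list(accumulate(outcomes))
    cumulative_expected = list(accumulate(predicted_risks))
    return cumulative_observed, cumulative_expected


def ewma(outcomes, weight=0.1, start=0.0):
    """Exponentially weighted moving average of 0/1 outcomes.

    weight: share given to the newest case (assumed value); older cases fade
            geometrically, which is the "memory loss" a plain CUSUM lacks.
    start:  initial level, eg, the historical event rate (assumed value).
    """
    if not 0.0 < weight <= 1.0:
        raise ValueError("weight must be in (0, 1]")
    smoothed = []
    level = start
    for outcome in outcomes:
        level = weight * outcome + (1.0 - weight) * level
        smoothed.append(level)
    return smoothed


outcomes = [0, 1, 0, 1]
risks = [0.25, 0.25, 0.25, 0.25]
print(cumulative_observed_and_expected(outcomes, risks))
print(ewma(outcomes, weight=0.5))
```

A larger weight forgets the past faster and reacts sooner to a change, at the cost of a noisier trace. Choosing it is a judgment about how long past results should count.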
Performance improvement
Both Rogers and colleagues7 and the commentators use two phrases that may be unfamiliar to readers: common-cause variation and special-cause variation. Common-cause variation is the natural fluctuation of performance measures that results from multiple factors underlying any complex process, such as health care, that is considered to be in control. James Reason,15 in discussing human error, and W. Edwards Deming,16 in discussing industrial processes, emphasize that nearly all improvement in results or product comes from reducing common-cause variation (Reason calls it the "blunt end"). Special-cause variation is fluctuation in results that is attributed to those aspects of the process over which there is presumed to be some extrinsic influence, such as that of the surgeon. Reason argues that improvement in this source of variation at the "sharp end" of patient care delivery is most effective in a non-culpable atmosphere, because things are rarely as simple as a single individual to blame.17 (An alternative is to institute mechanisms to insulate the process from blunt-end systems.)
Performance measures
Not discussed in depth by Rogers and colleagues7 nor by the commentators are appropriate measures of performance. It is possible that several hospital outcomes should be simultaneously monitored, and this is what Spiegelhalter has termed "multiplicity." Silber and colleagues18,19 have emphasized the difficulties of selecting outcome measures that reflect controllable variation and are not confounded by patient factors. The fact that risk-adjustment methods are advocated by all the discussants indicates that the outcomes selected for monitoring are thought to be strongly confounded by patient and disease characteristics. Unfortunately, risk adjustment tends to be particularly incomplete when there are rare or multiple measured, unmeasured, or unevaluated risk factors present,20 so the search for adequate unconfounded quality measures should go on.
Response speed
We have already alluded to the difference in response speed to underlying trends when a continuous variable such as temperature is monitored (the kind of thing often measured in industrial quality control), as opposed to one value from an entire operative result. The discussants have focused on boundary-crossing methods to detect these trends. Yet CUSUM charts, in contrast to a number of other kinds of chart, are considered most valuable for detecting a change in slope.4,21
Some have hyped CUSUMs as instantaneous warning systems for undesired outcomes. Untrue. I agree with Lim22 that we need more sensitive and responsive warning systems, but if mortality is the performance measure, do not expect it to provide instantaneous warning.
Comparison of surgical programs
The Society of Thoracic Surgeons, its European counterparts, and governmental agencies compare not only individual surgeon performance but institutional performance as well. Most models for monitoring have been constructed without taking into account institutions and surgeons.1-3 Subsequently, models are applied on an institution or surgeon basis. Tekkis and colleagues,23 in the setting of gastroesophageal cancer surgery, use hierarchical (mixed) modeling that permits direct assessment of institutional performance while simultaneously modeling underlying risk factors. This approach is explained in an accessible tutorial by Christiansen and Morris,24 as cited by Spiegelhalter. It is an attempt to model simultaneously both special-cause and common-cause variation. Such an approach, even if performed only periodically, has considerable merit.
Simplicity and intuitiveness
Consistent with other recent developments in statistical quality control, Shewhart's original sketch of a quality control chart at Bell Telephone Laboratories on May 16, 1924, was simple and intuitive.25 Figure 1 represents the kind of simplicity originally envisioned for control charts. An in-control process marches horizontally down the centerline, staying out of areas of alarm. The underlying mathematics of Rogers and colleagues' hypothesis testing approach (verified by the mathematics in their Appendix)7 (1) do not require that in-control processes march down either a centerline or a fixed slope corresponding to the in-control observations and (2) do require an in-control process to march toward an acceptance boundary. Neither behavior is simple or intuitive. Similarly, I find Grunkemeier's bullet-shaped prediction limit approach (which some might mistake for control limits) equally nonintuitive, because the limits present moving targets dependent on number rather than standards of performance.6 The most intuitive chart to my eye is the observed minus expected chart for which Rogers and colleagues7 do not display boundary lines.
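The observed minus expected chart is also simple to compute, which may be part of its appeal. The sketch below is only an illustration: the function name and the sample data are hypothetical, and it deliberately draws no boundary lines because none are specified here.

```python
def observed_minus_expected_curve(outcomes, predicted_risks):
    """Running sum of (observed outcome - predicted risk) over consecutive cases.

    outcomes: 0/1 flags per case (1 = adverse event).
    predicted_risks: per-case event probabilities from a risk model
                     (hypothetical values in this sketch).
    A process performing as the model predicts wanders around zero;
    a sustained upward slope means more events than expected.
    """
    if len(outcomes) != len(predicted_risks):
        raise ValueError("outcomes and predicted_risks must have equal length")
    curve = []
    running_total = 0.0
    for outcome, risk in zip(outcomes, predicted_risks):
        # An event adds (1 - risk); an uneventful case subtracts the risk.
        running_total += outcome - risk
        curve.append(running_total)
    return curve


# Four made-up cases, each with a predicted risk of 25% and one event overall.
print(observed_minus_expected_curve([0, 1, 0, 0], [0.25, 0.25, 0.25, 0.25]))
```

Because one event is observed and one is expected over these four cases, the curve ends at zero, back on the horizontal centerline.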
I am not sure what to believe, frankly, nor do I think this issue will be soon resolved. However, Storey's work at Stanford University on false discovery rates28 and Aylin and colleagues' work29 seem to be promising and fresh approaches to this problem, as noted by Spiegelhalter.
What's next?
CUSUMs are not the end of the road for techniques that may be useful for surgical performance monitoring. My digital signal processing background conjures up visions of applying sophisticated pattern recognition techniques, such as wavelet kernels, to identify underlying trends and transients. Might optimal statistical outlier identification methods be an alternative approach? Algorithmic technologies may yield yet other methods.30
The shocker
Most readers of the tutorial and commentaries will be surgeons or physicians involved in health care delivery. Particularly in a litigious society, health care workers want to be given the benefit of the doubt. On the other hand, when the tables are turned and you become the patient, would you not find it shocking that the Society of Cardiothoracic Surgeons of Great Britain and Ireland interprets "benefit of the doubt" to mean 9999:1 odds of adverse outcomes being attributable to chance alone before calling those results into question? Protecting our own reputations versus protecting our patients' lives involves a delicate and sensitive balance between being an alarmist and being insensitive.22 Tipping the balance decidedly in favor of our own interests versus those of our patients can only toss fuel onto a fire that is burning up the public's confidence in the medical profession. On the other hand, monitoring programs that fail to recognize that systems, not individuals at the sharp end of the process, should be the prime targets for quality improvements will continue to concentrate on sniffing out "bad eggs." They address the proverbial speck in the eye rather than first removing the plank in the eye of the blunt end of medical care.