J Thorac Cardiovasc Surg 2006;131:4-8
© 2006 The American Association for Thoracic Surgery
Editorial
Sentara Cardiovascular Research Institute, Norfolk, Va.
* Address for reprints: Jeffrey B. Rich, MD, Mid-Atlantic Cardiothoracic Surgeons, Ltd, 400 W. Brambleton Ave, Suite 200, Norfolk, VA 23510. (Email: rich{at}macts.com).
The article in this issue of the Journal by the Canadian CABG Surgery Quality Indicator Consensus Panel entitled "The identification and development of Canadian coronary artery bypass graft surgery quality indicators"¹ brings to the forefront an increasingly important issue for health care providers. As a financing crisis looms in the US health care system, both public and private purchasers are demanding more transparency and performance data related to services provided. This is grounded in the belief that improving quality will lead to cost savings, a point clearly made in the Society of Thoracic Surgeons (STS) testimony before the House Ways and Means Committee in March 2005.² The Centers for Medicare and Medicaid Services (CMS) Administrator McClellan believes, as do others in the private sector, that payments for health care services should be adjusted according to quality. This has driven the need for the development of specialty-specific quality and performance measures to be used for both quality improvement and accountability. Recognizing this need, the STS, under the leadership of Dr Peter Pairolero in 2004 and Dr Sid Levitsky in 2005, has taken a leadership role in bringing the STS National Cardiac Database (STS NCD), and the STS as a professional society, to a position of national prominence. Through the National Quality Forum (NQF) Consensus Development Process, a set of 21 performance measures for cardiac surgery suitable for quality improvement and accountability has been established, with 16 of these measures specified and derived from the STS NCD. The complete description can be found in the NQF publication "National Voluntary Consensus Standards for Cardiac Surgery."³ The Canadian group has followed suit with an independent project to mirror these efforts in the United States by the STS. A careful comparison of these projects and a discussion of the implications for US cardiac surgeons are imperative.
The Canadian CABG Surgery Quality Indicator Consensus Panel is well described in their article, with the important point being that 75% of its members were specialty specific, a topic touched on later. The inclusion of Dr Frederick Grover, incoming president of the STS, with 2 decades of experience in measure development initially at the Veterans Health Administration and subsequently with the STS database, gave enormous credibility to this project, as did the presence of Dr O'Connor from the Northern New England Cardiovascular Disease Study Group (NNE). The group used a Delphi Consensus process, which is, in essence, blinded voting on the attributes of measures after thoughtful open discussion. The final result included 18 measures covering the spectrum of structure, process, and outcomes, as found in the Donabedian model of quality improvement. Specifically, the set included 14 outcome measures (none risk adjusted), 3 process measures, and 1 structural measure (volume). Arguably, 2 of the 3 process variables (waiting time to surgical intervention and completion of surgical intervention within a recommended waiting time) represent measures of efficiency (system capacity) rather than processes of care. Additionally, one of the outcome measures, intensive care unit (ICU) length of stay, might also fall into the category of system capacity because transfer out of the ICU might be influenced by lack of step-down or telemetry beds or floor nurse shortages, for example, and might not reflect quality of care. To their credit, they paired this measure with ICU readmission, so that inappropriate early discharges from the ICU, made solely to meet the performance measure, can be monitored through readmission rates. Unfortunately, this was not done with ventilation time, which should have had reintubation rates as a paired measure. Importantly, the level of analysis for all of the variables is defined at the hospital level, as it is in the NQF measure set, a not insignificant issue.
The weaknesses of the data set include the lack of risk adjustment for all outcome variables, the inclusion of only one true process variable, the lack of homogeneity of data source (clinical or administrative is equally acceptable for any measure), and the absence of any recommendation for participation in a systematic database. Each will be discussed in turn.
The reporting of raw outcomes data is a highly contentious issue across medical specialties in the United States. Adequate risk adjustment is considered the sine qua non for outcomes reporting to accurately characterize results on the basis of patient acuity and to prevent the unintended consequence of avoidance of high-risk patients. The NQF project actually considered several of the Canadian outcome measures (eg, postoperative myocardial infarction and blood product use) but rejected them because of the difficulty in defining postoperative myocardial infarction and the complicating factor of widespread oral and intravenous antiplatelet agent use in the United States, making it difficult to use these measures in the absence of risk adjustment. This issue has weighed heavily in discussions at the NQF by many providers and has been a stumbling block in the development of outcomes measures both in the outpatient and inpatient setting. On the other hand, given the time-sensitive needs of the marketplace for data, many believe there must be a starting place for outcome measure reporting. All NQF-endorsed consensus standard sets are meant to be dynamic, and the need for periodic revision is clearly recognized. Hopefully this will occur in the Canadian set. Therefore I applaud the courage of the Canadian cardiac surgeons to begin reporting unadjusted outcomes measures with the presumption that adequate risk adjustment will follow.
Process measures, on the other hand, can be precisely defined and specified and are not subject to risk adjustment. They are ideal measures to include as performance measures if there is a defined link to quality improvement. Exclusion criteria for process measures certainly exist, but one can eliminate the need to create lengthy lists of exclusionary criteria by setting the threshold for the measure not at 100% but at a level based on aggregate national data. For purposes of illustration, internal thoracic artery (ITA) use has known contraindications for use during coronary artery bypass grafting (eg, use in a prior operation, emergency operation for cardiogenic shock, and poor sternal quality). The Leapfrog group, in using this measure as one of their monitors of quality, has approximately 20 exclusions for use of the ITA. The accurate collection of these data is not only difficult and lends itself to "gaming" but also represents a large data burden on providers. If every process measure had as many exclusions and every specialty developed similar measures, the data burden on the system would be enormous and likely impossible to accomplish. The STS, under the leadership of Dr Fred Edwards, Chair STS NCD, developed a simplified approach in its NQF measure set, as described above, and currently is in the process of examining its aggregate national data to determine current percentage use for each of the process measures in the NQF set.
Most importantly, the use of administrative data for in-hospital condition-specific care is fraught with error and ambiguity. In the United States claims data are designed for billing purposes and function well for them. Beyond that, they represent inaccurate pictures of care delivery at best and dangerous markers of outcomes at worst. Analyses by Mack and colleagues⁴ and in the blended STS/CMS MedPAR database of the Virginia Cardiac Surgery Initiative have documented wide variations in the ability to accurately capture cardiac surgery outcomes data by using administrative databases. Underreporting of deaths, process measures, and procedural volume is commonplace in the CMS database. When the CMS data for the state of Virginia were compared with the STS data, 9% of procedures were not reported (discharge Diagnosis Related Groups [DRGs] often do not match procedures performed), mortality rates were lower, and ITA use was underreported by as much as 50% in a single institution, with a statewide average of 17%.⁵ Perhaps the administrative data in Canada are far better than those in the United States, but nonetheless, the use of administrative data and the lack of an explicit single data source for any one measure in the Canadian set are worrisome. Outcomes from sites submitting clinical data for a given measure will be markedly different from outcomes at sites using administrative data if the pattern of discrepancy seen in the United States holds true in Canada. CMS clearly recognizes this problem and in recent discussions has agreed in principle to accept data from specialty databases in its upcoming Physician Voluntary Reporting Program.
Finally, the STS leadership firmly believes that participation in a systematic clinical database is the foundation of quality improvement. The accurate collection, reporting, and analysis of these data with feedback loops to providers are the essential ingredients for continuous quality improvement. Best practices can be identified and processes of care adapted to improve outcomes. It is only through this mechanism that system transformation can occur at the hospital-physician level, ultimately leading to improved care across all disciplines. Of note, this measure is included in the NQF set.
A direct comparison between the NQF National Voluntary Consensus Standards for Cardiac Surgery and the Canadian CABG Surgery Quality Indicators requires a broader understanding of the NQF. The NQF is a private, nonprofit, open membership organization whose mission is to improve health care through the endorsement of consensus-based national standards for measurement and public reporting of health care data that provide meaningful information about whether care is safe, timely, beneficial, patient centered, equitable, and efficient. The NQF was formed subsequent to a recommendation of a Presidential advisory commission⁶ and comports with the requirements of the National Technology Transfer and Advancement Act of 1995 (P.L. 104-113) and Office of Management and Budget Circular A-119. The importance of this lies in the fact that measures, when vetted through the NQF's formal consensus development process, enjoy a special legal status, obligating federal agencies to use the measures as specified if a measure is to be adopted for use by these agencies. The NQF represents the entire spectrum of health care delivery and is an organization of organizations with now nearly 300 member organizations. Members are divided into 4 councils: consumers, purchasers, providers and health plans, and research and quality improvement organizations. When a project is identified and funded, a call for nominations for the steering committee is made. These nominations come from all sectors of health care and not just the specialty involved. Once seated, a call for measures appropriate to the project from any member or nonmember occurs. This allows consumers, purchasers, and all stakeholders to submit measures they deem important. In the case of the Cardiac Surgery project, a technical advisory panel (TAP) was created, chaired by Dr T. Bruce Ferguson. The TAP deliberates and delivers suggested measures to the steering committee. Through a series of multiple face-to-face meetings and numerous conference calls, an agreed upon set of measures is sent to the members and then made open for public comment. Refinements are made as necessary, and the measure set is then sent to all NQF members for a vote. If approved by its councils, the measure set is then sent to the NQF Board of Directors (BOD) for final approval.
The BOD membership is impressive, with ranking members of 4 federal agencies (CMS, Agency for Healthcare Research and Quality, Joint Commission on Accreditation of Healthcare Organizations, and the National Institutes of Health) and the states (state health officers and Medicaid), as well as major purchasers (General Motors and UPS), consumers (American Association of Retired Persons and March of Dimes), and health care experts.
The importance of this lengthy description is to show the stark contrast between the Canadian consensus process and that of the NQF. Specifically, the Canadian Consensus Panel was comprised of 19 members, and as mentioned earlier, at least 75% were specialty specific. This composition mirrored the TAP for the NQF cardiac surgery project (70% specialty specific) but was quite different in composition from the steering committee, where only 6 of 16 members were specialty specific, with the other members representing federal agencies, as well as consumers, purchasers, other providers, and health plans (complete list available in the publication). As co-chair of this project, my understanding of the needs of all of the stakeholders was enormously increased and, at times, eye opening. The need to have the measure set approved by 4 councils and their members and the BOD was challenging, but the process lent itself to the development of a credible set of measures for our specialty with buy-in across the entire spectrum of health care delivery, a daunting task in the absence of the NQF Consensus Development Process. This should not be taken as a criticism of the Canadian project, which was done exceptionally well, but an illustration of the diversity of the input during the NQF project. Because of this diversity, one of the attributes of measures necessary for inclusion in an NQF measure set is that they have been in broad use and "field tested." Therefore some of the measures appearing in the Canadian set were considered but eliminated on these grounds.
In direct comparison with the Canadian set, the NQF set consists of 11 risk-adjusted outcomes measures (7 related to CABG alone and 4 to valve surgery), 8 process measures with links to quality improvement, and, as mentioned previously, database participation as one of its structural measures. The Canadian article does an excellent job of contrasting the 2 sets, noting a 50% overlap across all measures. Most notable, though, is the fact that all 7 NQF outcomes measures for CABG appear in the Canadian set, although with some minor variation in definition specifications. However, none of these is risk adjusted. This raises the question of why the Canadian consensus panel did not or does not merely adopt the risk-adjusted STS outcomes measures in its measure set. Subscription to the STS NCD cannot be used as an argument because all of these risk-adjusted measures can be calculated on the STS Web site, a direct result of the NQF cardiac surgery project. This would provide Canadian cardiac surgeons and their institutions the comfort of knowing that their publicly reported performance accurately depicts their patient acuity and is statistically credible. It would also allow comparison of outcomes in the Canadian single-payer system with those of the public-private system in the United States, not a small issue, and certainly one that will attract an enormous amount of attention from health care policy experts on both sides of the border, especially as Canada expands its private health care sector.
The Canadian set does include 2 extraordinarily important long-term measures, those of 1-year repeat revascularization by means of either percutaneous coronary intervention or CABG. These are achievable because of the single-payer system in Canada and the ability to use its administrative data to effectively track these longitudinal outcomes. In the United States these data can only be obtained through the acquisition of data from CMS and all private health plans, which is doable but complicated at present. The issue was discussed extensively during the NQF project and is high on the radar screens of purchasers, as well as health plans. It is widely recognized as a measure of effectiveness of therapy and will begin to weigh heavily in the debate between multivessel percutaneous coronary intervention versus CABG, especially when costs are considered. As important (or perhaps more so), this represents a battleground for our specialty, and therefore the need for these types of data is crucial. As you will note, this entire issue was made a research recommendation in the final NQF publication, and the STS NCD is currently in discussion with the Duke Clinical Research Institute to develop a method to capture these data.
The additional presence of 2 measures of efficiency in the Canadian set is notable (waiting time to surgical intervention and completion of surgical intervention within a recommended waiting time) and represents system capacity, and therefore these are not likely applicable in the United States. However, measures of efficiency related to cost of care delivery are believed to be sorely needed in the United States, and both the NQF and Agency for Healthcare Research and Quality are in the early phases of project development to define the appropriate attributes of such measures. Purchasers and health plans are very intent on the development of such measures and would welcome the input of providers in doing so. It is clear that the health maintenance organization cost-containment debacle through restriction of services should not be repeated, and therefore it is incumbent on us as a professional society to help develop these measures with the clear proviso that they be linked to quality. The STS has recently submitted a proposal to Congress for funding of a project entitled "Quality Focused Cost Containment in Cardiac Surgery," which will address the development of measures of efficiency done in a patient-centered, quality-guided, and scientifically credible manner.
Nothing that has been written to this point should have been particularly surprising or contentious. The STS and others, like the NNE, have been developing quality-performance measures for more than a decade, and these measures are widely used internally for continuous quality improvement. What has not been embraced widely is the concept of accountability, which for most in health care translates into public reporting of data for use by consumers, purchasers, and payers. Outside of states in which mandatory reporting is already occurring (New York, Pennsylvania, California, and Massachusetts to name a few), cardiac surgeons have pushed back against the public release of data. Perhaps more controversial is the level of analysis at which those data are reported (surgeon vs hospital) and the format in which those data should/will be reported (specified measure data vs rating vs ranking of providers). I include the word "will" here because, as you will soon hear (as I have heard from all corners of health care and most policy makers), "this train has left the station." In the current health care financing crisis, in which costs are increasing at multiples of inflation, purchasers are demanding more data from health plans on the quality of the health care for which they are paying. As stated earlier, there is widespread belief that costs will track quality. CMS and private payers are moving in the direction of "value-based purchasing" in health care by the development of programs that will pay for quality, more commonly known as pay-for-performance programs. There are currently 2 bills in Congress, one in the House (H.R. 3617) and one in the Senate (S. 1356), that will link physician payment to performance. CMS plans to institute a Pay for Reporting program for physician reimbursement in January 2006. How, when, and if data will be publicly reported are still to be decided.
If this program tracks that for the hospitals, which is currently up and running, data will be on the CMS Web site in 18 to 24 months. Where the data will come from is fairly clear. The current design is for the data source to be the CMS administrative claims database, a dangerous enterprise solution for specialty-specific in-hospital care, as was previously discussed. The STS is currently in discussion with CMS about the possibility of allowing clinical databases, and in particular the STS database, to serve as a source of data for its Physician Voluntary Reporting Program.
One might argue that accountability does not necessarily have to translate into public reporting. The STS and the NNE have demonstrated significant quality improvement in the absence of a public reporting system over the past decade. This is a completely true and valid argument, but this has not occurred in the remainder of medicine. Many consumers and purchasers (eg, Leapfrog group) believe that public reporting of performance data will be the only motivating factor for quality improvement. Sadly, they might be right. As a result, cardiac surgeons will be drawn into the whirlwind of the need for public reporting. Beyond that argument lies the argument that publicly released data will allow consumers to make health care provider choices, payers to know that they are obtaining value in their health care purchasing, and health plans to know that their network providers are of acceptable quality. In fact, large corporations are providing incentives to their employees (eg, lower copays) to choose providers on the basis of quality, which argues even more strongly for the need for accurate public data.
It appears that accountability through public reporting is on the horizon, and we must adjust to and embrace the concept. More importantly, we must be at the table developing the appropriate reporting format or formats for this to occur in a credible fashion that will be widely adopted.
First and foremost, the level of analysis for cardiac surgery must remain at the hospital level, as is found in the NQF measure set. It is widely recognized that cardiac surgery is a "team sport," with outcomes affected by many providers, as well as the system in which it is delivered. CMS actually was a strong advocate of this position during the NQF project and stated that the most significant opportunities for improvement and cost savings for CMS occur at the system (hospital) level. Reporting might gravitate to the group level at some point, but that discussion is still underway, with no thought to ever move to individual surgeon reporting absent credible scientific data supporting it. We all recognize that New York and Pennsylvania have surgeon-specific reporting, but the benefits of this over hospital-level or group reporting have yet to be demonstrated.
Second, participation in a clinical database with accurate collecting and reporting of risk-adjusted data should provide the level of comfort necessary to believe that these data accurately reflect the performance of a hospital or group. Absent that, administrative data will be used, with the consequent disbelief that we have all felt when seeing data reported erroneously.
Third, we must be prepared to develop credible reporting formats. Reporting of hard data for performance measures, even when statistically adjusted, might prove too confusing for consumers. Designating performance in a particular measure as "less than expected, as expected, or greater than expected" might be more understandable. Alternatively, consumers might appreciate and benefit from a "roll-up" measure that combines performance in all measures into a single measure and rates providers on a graduated scale (eg, 1 star through 3 stars). In fact, this last method might work best for health plans as well, and Dr Fred Edwards chairs an STS task force that is currently working with Wellpoint/Anthem on determining the validity of such a methodology. Certainly, we must attempt to avoid a system of ranking that would profile providers from best to worst.
The message quite simply is that we as a professional society must take the "fear of the unknown" and translate it into "control of the known." Unless we do it ourselves, I am certain it will be done for us. The NQF Cardiac Surgery project is a perfect example of this principle. Cardiac measures were being developed rapidly without our input. The STS became the first specialty society to approach the NQF with a project that would place us at the table. The results speak for themselves. We now have 16 measures derived from the STS NCD carrying special legal status that can be used to measure quality in cardiac surgery and be used in CMS and private sector pay-for-performance programs.
Our recognition that public reporting is here and that it is a legitimate need of the health care system will again place us in a privileged leadership position among health care experts in this country. We must, however, be certain that it is done properly, as we have done with quality improvement for the past 15 years and performance measurement in the recent NQF National Voluntary Consensus Standards for Cardiac Surgery project. Quality indicators, performance measurement, and accountability are clearly what consumers, purchasers, and health plans want, need, and deserve. This is the right thing to do, at the right time, and for the right reasons.
See related article by Guru V et al., JTCVS 2005;130:1257-64.