
Monday, June 18, 2018

Some data is better than no data at all

Do you believe that?

I heard it frequently when the infamous ProPublica Surgeon Scorecard first appeared three years ago. Back then I blogged about it, saying, “To me, bad data is worse than no data at all.”

A recent study in BJU International confirmed my thoughts about this type of publicly posted data and identified a previously unreported issue. The paper attempted to determine whether the public was able to accurately interpret statistics used in the Surgeon Scorecard. It turns out they were not very good at it.

Investigators from the Department of Urology at the University of Minnesota surveyed 343 people who attended the Minnesota State Fair in 2016. Those who took the survey had a median age of 48, were 60% female, 80% white, and 60% college educated. Their median annual income was $26,550 with an interquartile range of $22,882-$32,587.

Tuesday, October 3, 2017

Why public reporting of individual surgeon outcomes should not be done


Please take a look at my latest post on Physician's Weekly. It's called "Why public reporting of individual surgeon outcomes should not be done."

Click here for the link.

Wednesday, September 2, 2015

Variation is not causation

I made a rookie statistical mistake of the “correlation equals causation” genre by confusing variation with causation in the recent JAMA Surgery paper referred to in my last post. I contacted Dr. Timothy M. Pawlik, the lead author of the Johns Hopkins study, who said the following:

"The model is explaining and attributing variation in readmission and not attributing readmission itself to the different domains. The model suggested that only 2.8% of the variation in readmissions was attributable to surgeons. This is different than saying that only 2.8% were the 'fault' of surgeons. A more accurate interpretation would be that only 2.8% of the variation seen in readmissions was attributable to provider level factors. The majority of the variation in readmission was due to patient factors."

He added that some of the 82.8% of variation in readmissions attributable (note: attributable doesn’t mean it’s the patient’s fault) to the patient could be modified by better medical management of patients' comorbidities or by not operating on some of these patients.
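A toy simulation makes the distinction concrete. The numbers below are invented to mimic the paper's headline figure, not drawn from its data: if surgeons contribute a small random effect and patients a large one, the surgeon level ends up explaining only a few percent of the total variation, even though every patient is operated on by a surgeon.

```python
import numpy as np

# Hypothetical illustration of variance partitioning (invented numbers,
# not the study's data). Each patient's readmission risk is the sum of a
# large patient-level effect and a small surgeon-level effect.
rng = np.random.default_rng(0)

n_surgeons = 56                 # cohort size from the JAMA Surgery paper
patients_per_surgeon = 400      # assumed for the sketch

surgeon_effect = rng.normal(0, 0.17, n_surgeons)                 # small spread
patient_effect = rng.normal(0, 1.0, (n_surgeons, patients_per_surgeon))

risk = patient_effect + surgeon_effect[:, None]

# Share of total variance attributable to the surgeon level
# (an intraclass-correlation-style calculation).
var_surgeon = surgeon_effect.var()
var_total = risk.var()
print(f"surgeon-level share of variance: {var_surgeon / var_total:.1%}")
```

The share comes out near a few percent, which is the sense in which "2.8% of the variation was attributable to surgeons" differs from "2.8% of readmissions were the surgeons' fault."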

The notion that readmissions can be explained by a single domain or a single person is simplistic. Dr. Pawlik's clarification confirms my original concern that attributing differences in patient outcomes solely to differences in the technical quality of surgeons is probably inaccurate, statistically speaking.

Variation is not causation but variation is still a call to action. Regardless of who is to blame for unfavorable outcomes, surgery is a team sport. The incision is just as important as the community care. In this regard, I am certain that ProPublica and I are on the same side. Let’s work together so that we see the whole story behind the numbers.



Sunday, August 30, 2015

Are surgeons the cause of high postoperative readmission rates?

No, according to a recent paper published online in JAMA Surgery.

The authors concluded, "The majority of the variation in readmission was attributable to patient-related factors (82.8%) while surgical subspecialty accounted for 14.5% of the variability, and individual surgeon-level factors accounted for 2.8%."

The investigators looked at data for over 22,000 surgical patients treated at Johns Hopkins and found the overall rate of readmission within 30 days was 13.2%. After excluding surgeons who performed fewer than 21 operations per year, 56 surgeons made up the study cohort.

Multivariable analysis showed significant non-modifiable patient-related factors associated with readmission were African-American race/ethnicity, more comorbidities, occurrence of postoperative complications, and an extended length of stay.

Variation in readmission by subspecialty ranged from 2.1% after breast, melanoma, or endocrine surgery to 37% following cardiac surgery.

The authors pointed out that this study "echoes growing concerns regarding the use of readmission as a quality metric based on its current methods."

Let's compare it to the controversial ProPublica Surgeon Scorecard.

Both the Surgeon Scorecard and the JAMA Surgery paper used data from the years 2009 through 2013. The scorecard involved only eight high-volume low-risk in-patient procedures while the paper looked at in-patient surgery of all types.

From an article written by the authors of the Surgeon Scorecard: "If a patient was readmitted to any hospital (not just the hospital where the surgery was performed) within 30 days of a surgery for one of the conditions we identified, we counted the case as a complication for the surgeon who performed the initial procedure."

What we learned from the JAMA Surgery paper raises some questions about the Surgeon Scorecard. On Twitter, I asked for comment from Marshall Allen, the lead author of a white paper [not peer-reviewed] describing the methodology of the Surgeon Scorecard.

In between attacks on my credibility for choosing to use a pseudonym, he said that they did not count most readmissions as complications. It is unclear from the article, the white paper, or its appendices exactly which complications were included. For clarification, we could ask the "surgeon experts" who advised ProPublica, but their names have not been disclosed. They are anonymous, just like me.

According to the white paper, surgeons were blamed for 64,367 (46%) of all complications incorporated into the Surgeon Scorecard. Table 3 of the white paper lists the 20 most frequent complications. The top three, comprising 26,795 complications, were postoperative infection, iatrogenic pulmonary embolism, and infection/inflammatory reaction due to internal joint prosthesis.

Other studies have shown that not all occurrences of those three complications are attributable to a surgeon's misdeed. Among the rest of the top 20 causes of readmission were postoperative pain, fever, and dysphagia (difficulty swallowing)—again possibly not the fault of a surgeon.

So the JAMA Surgery paper says surgeons are responsible for 2.8% of readmissions within 30 days, but ProPublica's self-published white paper says 46% of all readmissions are due to something a surgeon did or did not do.

Who to believe?

Note added at 7:27 a.m. on 9/2/15: See my next post for a clarification about causation and variation. 

The full text of the peer-reviewed JAMA Surgery paper is available here.


Wednesday, August 12, 2015

Why in-hospital deaths are not a good quality measure

You may be tired of hearing about the Surgeon Scorecard—the surgeon rating system that was recently released by an organization called ProPublica. Like many others, I have pointed out some flaws in it. You can read my previous posts here and here.

I had decided to stop commenting about it because enough is enough, but a recent paper in the BMJ raises a question about one of the criteria ProPublica used to formulate its ratings.

ProPublica defined complications as 1) any patient readmission within 30 days and 2) "any patient deaths during the initial surgical stay."

The authors of the BMJ paper randomly selected 100 records of patients who died at each of 34 hospitals in the United Kingdom. The 3400 records were reviewed by experts to determine whether a death could have been avoided if the quality of care had been better.

The number of patient records in which a death was at least 50% likely to have been avoidable was 123 (3.6%).

There was a very weak association between the number of preventable deaths and the overall number of deaths occurring at each hospital. By two measures of overall hospital deaths, the hospital standardized mortality ratio and the summary hospital level mortality indicator, the correlation coefficient between avoidable deaths and all deaths was 0.3, not statistically significant.
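For readers who want to check the non-significance claim, the standard t-test for a Pearson correlation (my own arithmetic using the paper's reported r = 0.3 and 34 hospitals, not a calculation from the paper itself) works out as follows:

```python
from math import sqrt

# Is a Pearson correlation of r = 0.3 across n = 34 hospitals
# statistically significant? Standard t-test with df = n - 2.
r, n = 0.3, 34
t = r * sqrt(n - 2) / sqrt(1 - r ** 2)   # t-statistic
t_crit = 2.037                           # two-sided 5% critical value, df = 32

print(f"t = {t:.2f} vs critical value {t_crit}")
print("significant" if abs(t) > t_crit else "not significant at p < 0.05")
```

The t-statistic of about 1.78 falls short of the critical value, consistent with the authors' statement that the association was not statistically significant.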

From the paper: "The absence of even a moderately strong association is a reflection of the small proportion of deaths (3.6%) judged likely to be avoidable and of the relatively small variation in avoidable death proportions between trusts [hospitals]. This confirms what others have demonstrated theoretically—that is, no matter how large the study the signal (avoidable deaths) to noise (all deaths) ratio means that detection of significant differences between trusts is unlikely."
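The signal-to-noise argument can be illustrated with a made-up simulation: if avoidable deaths are only about 3.6% of all deaths, then hospital-to-hospital differences in total mortality are dominated by the unavoidable component, and total deaths correlate only weakly with the avoidable "signal." The counts below are invented for the sketch.

```python
import numpy as np

# Hypothetical sketch of the signal-to-noise problem (invented counts).
# Each hospital's total deaths = a large unavoidable component ("noise")
# plus a small avoidable component ("signal," ~3.6% of the total).
rng = np.random.default_rng(1)

n_hospitals = 34
unavoidable = rng.poisson(500, n_hospitals)         # dominant component
avoidable = rng.poisson(500 * 0.036, n_hospitals)   # small quality signal

total = unavoidable + avoidable
r = np.corrcoef(avoidable, total)[0, 1]
print(f"correlation between avoidable and total deaths: {r:.2f}")
```

Because the avoidable component is tiny relative to the noise, the correlation is weak, which is exactly why ranking hospitals (or surgeons) by overall deaths says little about avoidable ones.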

The Surgeon Scorecard was derived from administrative data. No individual analysis of patient deaths was undertaken. According to a ProPublica article discussing some key questions about their methodology, "As for deaths, we took a conservative approach and only included those that occurred in the hospital within the initial stay."

Maybe that wasn't such a conservative approach after all.

And maybe we need to rethink that 2013 paper claiming that medical error caused up to 440,000 deaths per year.

Friday, July 24, 2015

The Surgeon Scorecard: My analysis

I've got nothing against ProPublica. If a valid way to rate surgeons is ever discovered, I would support it completely. However, ProPublica's Surgeon Scorecard is not the answer.

I keep hearing its defenders say, "Some data is better than no data at all." I disagree strongly with that. To me, bad data is worse than no data at all. People with much more statistical sophistication than I have pointed out the flaws in the scorecard.

Digression: Having written many posts about statistics, I can tell you that the mere mention of the word drives readers away about as fast as if you were to yell "Fire" in a crowded theater.

I want to focus on a different area. The scorecard has created a lot of chatter on Twitter, and just about everyone I know has blogged about it.

This reminds me of a couple of posts I wrote back in 2011. [Links here and here.] I pointed out that Twitter might not be as important as those of us who use it think it is.

While we were busy arguing about the merits of the scorecard on Twitter, I'm not so sure the general public was paying attention at all.

For example, ProPublica says the Surgeon Scorecard has had over 1 million visitors since its launch. That sounds like a lot until you consider that the current population of the United States is estimated at 321 million. So 1 million people would be 0.3%. We do not know how many of those 1 million were unique visitors. It could be that many of them were doctors looking for their own statistics and bloggers looking for ideas.

That the public may not care was reinforced by a rather tepid response to the ProPublica AMA (Ask Me Anything) on Reddit today.

By 1:00 PM EDT, which was two hours into the AMA, there were 80 comments, 31 of which were by ProPublica staff or the spine surgeon who had consulted on the scorecard's methods.

Just to give you some perspective, an AMA last year by a guy with two penises drew 17,134 comments.

Because its demographic skews toward younger people, perhaps Reddit was not the right venue. Although Reddit boasts 169 million unique visitors per month, the most recent figures show that 33% of Reddit users are men between 18 and 49 years old. Those under 18 are not counted but represent "a substantial percentage of Reddit users."

My two favorite questions asked of ProPublica were "How can I tell if my doctor is capable of making an error?" and "Do you fix the leg which is broken completely?" [Did the question refer to a leg that was completely broken, or did it mean should the leg be completely fixed?]

What have we learned here? It's hard to say.

If you want to read a measured critique of the scorecard, go to Dr. John Mandrola's piece on Medscape.

Tuesday, July 14, 2015

Big data is not big enough

Today ProPublica released its “Surgeon Scorecard” touting it as the best way to pick the right surgeon.

It took me less than a minute to discover some interesting omissions from the application.

For laparoscopic cholecystectomy, the only general surgery procedure listed, the app omits approximately one-third of the hospitals in my state, including two where I have practiced.

It looks like the problem is that using Medicare fee-for-service data does not yield enough surgeons performing 20 or more cases in some categories such as laparoscopic cholecystectomy for the five years included in the database.

At one of the biggest hospitals in my state, apparently only one surgeon performed 20 laparoscopic cholecystectomies on fee-for-service Medicare patients in the five years studied; 23 other surgeons were listed as having performed fewer than 20 laparoscopic cholecystectomies on patients in the target population. I don’t see how patients who want to use that hospital for their gallbladder surgery will benefit from the Surgeon Scorecard.

In general, the complication rate for laparoscopic cholecystectomy is low, but I think I understand why ProPublica chose that procedure to review. They needed to select a procedure that was done frequently enough to yield a sufficient number of cases for analysis. Unfortunately, because of the limitations of the Medicare fee-for-service data and the low complication rate of the procedure, the Surgeon Scorecard is useless for anyone looking to compare general surgeons.

Similar problems with the scorecard may be in play for prostate surgery. Again, the procedure was chosen because of its high frequency, but in quickly looking through some searches in that area, I note that a number of urologists I know also did not perform 20 cases on fee-for-service Medicare patients.

Perhaps the next iteration of the scorecard will utilize a data set that contains enough patient and surgeon records to make a meaningful comparison.

Until then, general surgeons can relax. They will not have to explain away their complications but will simply have to explain why they aren’t listed in the Surgeon Scorecard.