Friday, July 24, 2015

The Surgeon Scorecard: My analysis

I've got nothing against ProPublica. If a valid way to rate surgeons is ever discovered, I would support it completely. However, ProPublica's Surgeon Scorecard is not the answer.

I keep hearing its defenders say, "Some data is better than no data at all." I disagree strongly with that. To me, bad data is worse than no data at all. People with much more statistical sophistication than I have pointed out the flaws in the scorecard.

Digression: Having written many posts about statistics, I can tell you that the mere mention of the word drives readers away about as fast as if you were to yell "Fire" in a crowded theater.

I want to focus on a different area. The scorecard has created a lot of chatter on Twitter, and just about everyone I know has blogged about it.

This reminds me of a couple of posts I wrote back in 2011. [Links here and here.] I pointed out that Twitter might not be as important as those of us who use it think it is.

While we were busy arguing about the merits of the scorecard on Twitter, I'm not so sure what the general public was doing.

For example, ProPublica says the Surgeon Scorecard has had over 1 million visitors since its launch. That sounds like a lot until you consider that the current population of the United States is estimated at 321 million. So 1 million people would be 0.3%. We do not know how many of those 1 million were unique visitors. It could be that many of them were doctors looking for their own statistics and bloggers looking for ideas.
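For anyone who wants to check that percentage, here is a minimal sketch in Python. The two figures are the ones quoted above; the variable names and the helper function are mine, and treating every visit as a distinct person is an assumption that, if anything, overstates the share.

```python
# Figures quoted in the post. Counting each visit as one distinct
# person is an assumption; repeat visits would make the real share smaller.
visitors = 1_000_000
us_population = 321_000_000


def share_percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100 * part / whole


visitor_share = share_percent(visitors, us_population)
print(f"{visitor_share:.1f}% of the US population")
```

Rounded to one decimal place, this agrees with the 0.3% figure.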

That the public may not care was reinforced by a rather tepid response to the ProPublica AMA (Ask Me Anything) on Reddit today.

By 1:00 PM EDT, which was two hours into the AMA, there were 80 comments, 31 of which were by ProPublica staff or the spine surgeon who had consulted on the scorecard's methods.

Just to give you some perspective, an AMA last year by a guy with two penises drew 17,134 comments.
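The same kind of quick check works for the AMA numbers. This is only a sketch using the counts quoted above (as of 1:00 PM EDT); the variable names are mine.

```python
# Comment counts quoted in the post, two hours into the AMA.
total_comments = 80
staff_comments = 31      # ProPublica staff or the consulting spine surgeon
other_ama_comments = 17_134  # the AMA cited for comparison

# Comments that did not come from ProPublica staff or the consultant.
public_comments = total_comments - staff_comments

# Share of the thread written by ProPublica's own side.
staff_share = 100 * staff_comments / total_comments

# How many times larger the comparison AMA was.
ratio = other_ama_comments / total_comments

print(f"{public_comments} comments from everyone else")
print(f"{staff_share:.1f}% of comments from ProPublica's side")
print(f"comparison AMA was about {ratio:.0f} times larger")
```

So fewer than 50 comments came from anyone other than ProPublica's own team at that point in the AMA.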

Because its demographic skews toward younger people, Reddit may not have been the right venue. Although Reddit boasts 169 million unique visitors per month, the most recent figures show that 33% of the Reddit users are mostly men between 18 and 49 years old. Those under 18 are not counted but represent "a substantial percentage of Reddit users."

My two favorite questions asked of ProPublica were "How can I tell if my doctor is capable of making an error?" and "Do you fix the leg which is broken completely?" [Did the question refer to a leg that was completely broken, or did it mean should the leg be completely fixed?]

What have we learned here? It's hard to say.

If you want to read a measured critique of the scorecard, go to Dr. John Mandrola's piece on Medscape.


A. Banterings said...

"Do you fix the leg which is broken completely?"

Good catch. That is like the "Have you stopped beating your wife" question.

You are correct that most surveys are not worded correctly. Here is one that I was asked: "What is your sexual preference?"

I responded, "Two women..." That illustrates the absurdity of the question. BTW, almost all surveys with that question do NOT have an answer for "I do not have sex." (Sorry for this imagery, but consider the 77-year-old Catholic nun.)

That also leads me to ask: what are the criteria for judging physicians, and who decides those criteria? Ask how a physician defines a good physician and how a patient defines a good physician, and you will get two very different answers.

If it is based on outcomes (evidence), then which counts as the good outcome: extending a terminal patient's life by 6 months with tubes everywhere, in pain, and comatose, or allowing the patient to spend 2 weeks with family, say goodbye, and die with dignity at home?

I would say it depends on whom you ask.

So patient-run social media sites are going to judge a physician differently than an industry standard (written by physicians) would, and both will be totally different from a government score (which is probably looking only at outcomes and money spent).

All measures are equally valid (assuming data integrity). It just depends on the searcher's definition of a good physician. Most likely they will take multiple criteria into account.

Skeptical Scalpel said...

A., all good points. Thanks.
