Originally published at: https://boingboing.net/2019/06/26/breaking-goodharts-law.html
…
Maybe also when picking a KPI, shop it around and see if anyone can come up with adversarial examples. This applies to KPIs in general, not just in software.
As you said, when you pick a KPI, people are going to optimize for the metric, not the underlying principle. Make sure it's something where it's difficult to hit the number without putting in the work to match the underlying principle.
Another thing is who does the measuring.
Example: Customer Satisfaction in Tech Support. If you let the support manager create the questionnaire and determine who gets it, you’re likely to get a better rating than if it was administered by a disinterested party.
Anyone with the MBA mentality is absolutely obsessed with boiling everything down to One Magic Number on which they can make decisions without taking accountability for them. These are usually based on derivatives of a number of KPI-like scores, mixed together in a black box that no-one wants to peer into too closely.
baloney
How about the case in which people get in, get out, and get on with their lives? Engagement is a good proxy for opportunities to advertise IMO and that is about it.
The very first time I went to turn off a Windows computer, it bothered me that they used an error dialog to ask, "are you sure you want to discontinue engaging with our wonderful product?" To this day, I think the industry is confused about the difference between productivity and activity.
All the stories of runaway AI seem to hinge on the lack of an off switch. It's kind of creepy to see that bias reflected in real-world designs.
It seems to me that you can improve “engagement” by making the UI worse so that it takes longer to do what you need to do. It doesn’t take a genius to see how that is likely to end up.
And as alluded to in the URL, the phenomenon of measures becoming useless when they are turned into targets isn’t new:
Big Tech: yesterday’s problems automated.
I have a similar rant about some Android apps, where if you leave by pressing the back button, the app gets all needy with “Are you sure you want to leave?” and makes you press it twice. YES, I want to leave, dammit!
Hey, here’s a radical idea: how about we educate people generally on how to write/design/create software? That way, the peoples of the world can roll their own shit, designed specifically for stuff they actually do. They would, like, uh, not have to shop out software needs to the lowest bidder because they wouldn’t be trapped in the abyss of not knowing how all that digital stuff works. Also, that makes the only fucking “KPI” worth knowing about as it should be. It’s called “eating your own dog food”.
Yeah, this means no more Oracle. You’re welcome.
Don’t forget the Dashboard that Visualizes Metrics on a Single Pane of Glass.
Having one of those is to statistics what the Tory Power Stance is to statesmanship…
But they would have to spend a few thousand hours writing each thing they needed. Human culture survives on specialisation and cooperation.
Still, there’s always Postgres instead of Oracle. I only wish my work would make the change…
Are you familiar with a couple of H.P. Lovecraft's later and more experimental works (co-authored with Bill Gates); "At the Macros of Madness" and "The Whisperer in Access"?
Or, The Doom that Came to Puppet, for that matter.
The flip side here is the frustrating experience of "wait, I clicked the wrong thing — welp, my computer shut off." Not all validation boxes are sinister.
I’m actually curious about this, as the work I do (crisis and problem management) relies pretty heavily on industry standard KPIs with readily accepted measurements. These are all generally pretty consumer-friendly too, as it all comes down to restoring the user experience as quickly as possible. It’s generally been my view that teams that take the KPIs seriously deliver a better and more robust product, and teams that don’t take these seriously fail to provide a quality product. I wonder if this is a design problem or if I’m missing the forest for the trees.
Some basic examples (these are all generally ITIL based):
Time to Detect: From the beginning of an issue, how long did it take your team to notice it was occurring?
Time to Engage: From someone detecting it, how long did it take to get the right person to look at it?
Time to Fix: From getting that person involved, how long did it take to actually fix the problem?
Time to Mitigate: From start to end of impact, how long did the event last?
There’s no weird fuckery on these: they all are valuable, they all fit together and are co-responsible, and spending development effort on driving them down makes devs and customers both happier. Setting targets for them can drive weird behavior, but I haven’t seen it ever devolve into the sort of hell that drives things like Facebook and Instagram existing.
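One reason these four fit together so cleanly is that they chain: detect + engage + fix spans the whole mitigation window, so nobody can game one number without it showing up in another. A minimal sketch of computing them from incident timestamps (the record layout and field names here are hypothetical, not any particular ticketing system's schema):

```python
from datetime import datetime

# Hypothetical incident record with the four timestamps the KPIs above need.
incident = {
    "started":  datetime(2019, 6, 26, 9, 0),   # impact begins
    "detected": datetime(2019, 6, 26, 9, 12),  # team notices the issue
    "engaged":  datetime(2019, 6, 26, 9, 25),  # right responder on the case
    "resolved": datetime(2019, 6, 26, 10, 5),  # impact ends
}

def minutes(a, b):
    """Elapsed minutes between two datetimes."""
    return (b - a).total_seconds() / 60

kpis = {
    "time_to_detect":   minutes(incident["started"],  incident["detected"]),
    "time_to_engage":   minutes(incident["detected"], incident["engaged"]),
    "time_to_fix":      minutes(incident["engaged"],  incident["resolved"]),
    "time_to_mitigate": minutes(incident["started"],  incident["resolved"]),
}

print(kpis)
# → detect=12.0, engage=13.0, fix=40.0, mitigate=65.0
```

Note that 12 + 13 + 40 = 65: the first three measures partition the fourth, which is what makes them co-responsible rather than independently gameable.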
Nearly every software or hardware provider we work with has started answering ticket mails with AI-generated replies. So they keep the promised reaction time, but the first reply is nearly always useless. The next step is usually the same busywork: collect all the log files, run this configuration dumper, do this, do that, even if you can pinpoint the problem and would like to talk to a second- or third-level supporter right away. And the final survey (yes, sometimes I actually do it) is usually constructed in such a way that it is hard to formulate your dissatisfaction, because usually it is not the individual support technician you are angry at, but this whole process.
I suspect that this is why you don’t see things devolving into facebook hell.
@Jorn_C has some good examples of situations where theoretically customer-focused problem resolution can be unnecessarily unpleasant (for a mixture of metric-meeting and minimum-cost compliance; depending on the quality of your support agreement); but you are much less likely to go in really unpleasant directions when your goals are basically aligned with those of your customer. (I know that my own experience has included a fair amount of “So can we consider this ticket closed within the scope of proposed resolution???” questioning from techs who are likely laboring in dystopian metrics sweatshops, so I sympathize; but could probably close the ticket faster if they were able to spend more time on the problem and less time on classifying the problem as solved.)
There’s certainly room for being led astray by myopic adherence to the metrics without evaluation of what they actually mean and why we choose them; but when the fundamental incentives are “customer wants problem fixed; we want to fix customer’s problem”, that means a potential for inefficiency rather than a downright adversarial outcome. When the incentive is “must increase MAU and ‘engagement’ at all costs” there isn’t the same more or less shared desired outcome and that is when one party’s work at optimization becomes effectively adversarial.
Scientology heavily uses a management by statistics system.
In fact, that’s the name of their software package. (Mastertech is a Scientology front company.)
Good point. I think the better idea is to have a “Don’t show this again” checkbox (and a control panel item to re-enable it).
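The pattern is simple enough to sketch. A minimal, toolkit-agnostic version (all names here are made up for illustration; `ask_user` stands in for whatever dialog your UI framework shows, returning the confirm choice plus the checkbox state):

```python
def confirm_action(prefs: dict, key: str, ask_user) -> bool:
    """Ask for confirmation unless the user previously opted out.

    ask_user() should return (confirmed, dont_show_again), e.g. from a
    dialog that includes a "Don't show this again" checkbox.
    """
    if prefs.get(key):
        # User already checked "don't show this again" — skip the dialog.
        return True
    confirmed, dont_show_again = ask_user()
    if confirmed and dont_show_again:
        prefs[key] = True  # persist prefs somewhere durable in a real app
    return confirmed
```

Re-enabling the prompt from a control panel is then just deleting `prefs[key]`, which is why storing the opt-out as an ordinary preference (rather than burying it) matters.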
Yeah, the more I think on this the more I definitely see the issue - the customer service targets are definitely an easy way to think about it. “Fill out this survey!” etc etc.