How Metacritic scores manipulate game development

[Read the post]

Goodhart’s law in action.

5 Likes

Article is two years old. Anyone know if this is still a major issue?

Also, the main example which opens the article is interesting because it boils down to a sob story about how developers were somehow cheated out of a $1 million bonus due to releasing a buggy, inadequately tested game. Reviewers noted the bugs and scored the game accordingly, and the publisher, in keeping with the previously agreed-upon stipulations, did NOT pay the developers an extra $1 million. (I guarantee a buggy game at release costs a major publisher waaaay more than $1 million.) This is an injustice and somehow metacritic’s fault?

And then we launch into food metaphors (why do people always use food metaphors whenever they want to be wrong on the Internet?)

I hate how metacritic forces developers to cater to the lowest common denominator by releasing stable games which are properly play-tested and bug-free. The plebeian masses are just the worst. Greasy, salty food!

Well, yeah, the article is 2 years old. But they recently re-posted it with a few updates, so presumably it’s still an issue.

1 Like

Metacritic is my Yelp. And I’m not surprised that its scores hurt game developers, but the issue isn’t Metacritic, or Yelp, if your business loses money. As a consumer I don’t want to buy trash, and too many times trash is marketed as great (e.g., Gamergate). Metacritic is a force for the buyer to effect change.

1 Like

Also of note: the bugs mentioned were getting on for a decade old at this point and had been present and unresolved across multiple games. It wasn’t like these were unknown bugs.

1 Like

TV has Nielsen. Movies have box office numbers. Music has Billboard. None of these has made it impossible for interesting, independent work to get made. Even politicians have focus groups - but then Trump shows up.

Okay, so the outliers are not always works of genius.

1 Like

This seems like a game-focused replay of most stories of the ‘Hey, wouldn’t it be cool to measure stuff so that we could make informed decisions?’ flavor.

Aside from the rogues’ gallery of statistical noob mistakes, getting useful measurements tends to start out easy, because people haven’t had the time or incentive to really start gaming the system good and hard; and such measurements are often adequately accurate to make judgements about at least modestly broad tiers of quality.

Once the feedback loop kicks in, though, and inappropriately broad use of the measurement induces frantic gaming of the system, life gets a lot harder and you need to be a lot more careful about what your numbers may or may not mean. See also: basically any attempt to determine whether students are learning, conduct performance reviews, score pretentious booze, etc.

I’d be inclined to the opinion that metacritic’s “day 1 score, forever” policy hasn’t aged well. Yes, post-publication pressure is an issue; but online/server-dependent features have grown tremendously (as has the role of patching, or not, even on consoles, which used to be static for the life of the disk), so the policy’s potential for preventing manipulation is, at best, static, while its value for providing information is getting worse all the time. The rest of the story, though, mostly seems like something that would play out just as badly with some other number generator.

3 Likes

I’d hope that metacritic isn’t quite as overtly scummy as yelp, and would welcome clarification of the role of doctored reviews in ‘gamergate’; but it does seem to more or less adequately reflect general quality bands among released games, with most of the really good skullduggery being enabled by rampant pre-ordering, which entirely sidesteps the review process in favor of pure ad spend and the occasional carefully orchestrated PR ‘inside preview’.

Eh. I’ve worked on 4 big games that got metacritic scores; two had bonus schedules to which it was an input. You know what? The results were all basically fair (the one mild deviation was very predictable, and either way no bonus was coming out of that project). They also line up pretty well with my understanding of how the games did financially.

The article is really very unconvincing to me. It starts out lamenting that a game launched with serious bugs, got worse review scores, and the developers didn’t get a bonus. Well… yeah. Sounds like the system working. Why should a ‘quality bonus’ pay out on a product buggy enough to severely damage reviews?

And then one part of it mentions a publisher using metacritic scores to predict sales. If that can be done it is very good evidence that the scores are objectively useful as a proxy for sales.

Then there’s the bit about scores being used as an ‘excuse’ during negotiations. That point refutes itself: if it is just an excuse, it isn’t important, and it is easily replaced by some other excuse. The reality is that each side in the negotiation has a certain amount of power. Much of the time the publisher has more power than independent studios… which is part of why there are hardly any independent AAA studios today. Metacritic wasn’t the cause of that.

We also have the bit complaining that review scores are ‘subjective’. Well, duh. And this is actually what metacritic helps with: it has the objectivity of averaging a bunch of subjective scores. You know what the alternative is, right? No ‘quality’-based money at all, or ‘quality’ being judged by the people who will have to write a check if they give you a high score, or a judgement based on a much smaller set of reviewers. It’s not at all perfect, but it’s better than anything else nearly as convenient.

Much of the rest of it isn’t actually about metacritic itself; it is about poorly structured contracts. If you want to avoid the situation where you miss the metacritic threshold by 1 and your employees lose $15,000, negotiate a smoother payout structure. Maybe you get $2,000 at 80, $5,000 at 90, and $20,000 at 95, linearly interpolating between those (a rough sketch of that schedule is below). Every point you get, once the game is pretty good, is worth more and more money, but no single point is worth a ton, so hard feelings are avoided.
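As a minimal sketch of what that kind of graduated schedule might look like (the breakpoints and dollar amounts are just the hypothetical numbers above, not from any real contract):

```python
# Hypothetical per-employee bonus schedule: $2,000 at a Metacritic score of 80,
# $5,000 at 90, and $20,000 at 95, interpolating linearly between breakpoints.
# Numbers are illustrative only.
BREAKPOINTS = [(80, 2_000), (90, 5_000), (95, 20_000)]

def bonus(score: float) -> float:
    """Return the bonus for a given Metacritic score."""
    if score < BREAKPOINTS[0][0]:
        return 0.0                        # below the first threshold: no bonus
    if score >= BREAKPOINTS[-1][0]:
        return float(BREAKPOINTS[-1][1])  # at or above the top threshold: max bonus
    # Find the bracketing breakpoints and interpolate linearly between them.
    for (lo_s, lo_b), (hi_s, hi_b) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if lo_s <= score <= hi_s:
            frac = (score - lo_s) / (hi_s - lo_s)
            return lo_b + frac * (hi_b - lo_b)

print(bonus(79))   # 0.0     -- just missing the threshold costs little
print(bonus(85))   # 3500.0  -- each extra point adds a modest amount
print(bonus(94))   # 17000.0 -- points near the top are worth much more
print(bonus(95))   # 20000.0
```

The point of the shape is that no single review point carries a $15,000 cliff, while points at the high end are still worth chasing.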

Also, review score isn’t the only important thing. Release dates are important. Actual sales are important. All of those things can go into game contracts and bonus structures. The problem is, of course, that you can sacrifice quality to hit a release date, or you can miss a release date to increase quality. It’s entirely possible for the publisher to botch the release and severely damage revenue even if the game is great (one great way to do that is to manufacture too many units, so there are a lot of retailer returns). Sales-based bonuses also tend to come in a lot later, and we have nothing like the Screen Actors Guild to enforce royalties, making the good version of that hard to achieve.

And, yes, if the project is somehow going to get ‘unfairly bad’ metacritic scores… gosh, I hope that is part of the plan and therefore part of the contract, because it isn’t going away. Perhaps it targets some other market niche, or has an incredibly strong license, or you made it really cheap, or you have some larger strategic reason for the product’s existence. Regardless, if you expect poor scores, any payouts dependent on high scores are worth a lot less during negotiation.

3 Likes

The problem is that a) Metacritic scores take huge hits because of bugs that may not be game-breaking (and don’t alter the fact that it’s a better game than one with a higher rating) and may end up getting fixed in the long run (which doesn’t change the scores), but more importantly, b) sometimes publishers are the ones that handle QA, not the developers (but the devs are the ones that get hurt for it anyway).

The really perverse thing is that Metacritic scores are also being applied to individual developers: they become the sum of the scores of the games they’ve worked on. It’s ridiculous on so many levels. Games get bad scores for reasons that have nothing to do with the area of work a given developer was responsible for (holding the artists responsible for bad game design, for example), most people who worked on the game aren’t responsible for the decisions made (it’s absurd to hold most of the artists responsible even for the art direction), and someone might work on a game briefly at any point during its development and get credited for their work, so even if they were in a decision-making role, their decisions could later get overruled. It’s a totally, totally meaningless metric.

Working on successful projects is better for your career than working on unsuccessful ones. That isn’t new, unique, or dependent on metacritic. And the people doing the hiring are game developers; they know that low-level developers have little impact on the overall game.

1 Like

Yeah, but normal blockbuster games make most of their money in the first few weeks. Anything but a day 1 patch misses the key time period.

It’s an unforgiving process.

1 Like

I have zero insider knowledge of the Fallout: NV situation, contract, relative bargaining power, etc.; but having followed Bethesda-produced and/or -published RPGs pretty enthusiastically, I wouldn’t be at all surprised if Obsidian did indeed get screwed (though, as you say, at contract time, not at metacritic time; NV was damned buggy).

Fallout: NV used a modestly evolved version of the engine (and some of the assets) from Fallout 3, which was done by Bethesda, which itself was based on the engine from TES: Oblivion, which dated back to TES: Morrowind. Not sure if it went back as far as TES: Daggerfall, or if it was a new thing for Morrowind.

To… put the matter kindly… that particular engine didn’t stop sucking until TES: Skyrim (where it still had some serious character, but at least hard locks, save game corruption, and crashes to desktop were relatively rare).

Bethesda would have known full well, having done Fallout 3, how much of a screaming pile of technical debt Obsidian was being handed. Obsidian was able to make some improvements (this is why the ‘Tale of Two Wastelands’ fan project imports all the Fallout 3 assets into the NV engine, not the other way around); but if the objective were squishing bugs, the product really should have been ‘Fallout 3.1’ rather than a new game; the engine was just in awful shape.

It’s possible that both parties knew that, and Obsidian was paid for (but blew) the bugfixes; but given the legendary reputation of TES games for egregious bugginess (even the oh-so-AAA Skyrim), I’d be notably unsurprised if Obsidian were largely brought in to build assets and story (which they did, quite elegantly), and fixing the fact that the engine was shot to hell was given insufficient attention.

Again, that doesn’t change the fact that customers got a fairly seriously buggy product (even well after release, in the ‘game of the year edition is now deeply discounted’ phase, there were still serious issues); but the circumstances of the case make me particularly suspect that Bethesda mostly wasn’t interested in paying that million.
