Looks like the DeepMind StarCraft AI relied on superhuman speed after all

Lol. Was not a tubgirl trick.

https://motomatters.com/opinion/2009/07/22/the_truth_behind_the_rossi_leg_wave.html

2 Likes

There were similar complaints when computers started beating top humans at Chess and Go. In the end those concerns were minor speed bumps. I’m sure that with a little more time and learning, they will be able to win consistently even with a lower rate cap, more restrictions on their camera, more maps, and more races.

The other concern, that the computer has developed cheese strategies that lean on having a speed advantage, makes no sense given how the system was trained. The “development league” the agents played in consisted entirely of other AIs playing at the same speed, so humans losing even battles is likely an unexpected outcome for the AI, not something it counts on in its planning.

I also think the analysis is over-counting the advantage of moving fast here. There have been lots of AI agents made for StarCraft over the years, many of which attempted exactly the kind of strategy being imagined here - a linear, static approach that leverages insane speed to overwhelm humans. There are AI agents that micro 8 groups of mutalisks at 1500 APM… and they haven’t worked against pro players. This is because StarCraft is largely a game of strategic “macro” decisions; it has a large decision space, lots of hidden information, and huge rewards for fielding a unit mix that is advantageous against what your opponent has. With generally capable play, “micro” speed advantages act more like tie-breakers.

Anywho, I think this is a very impressive accomplishment with some small caveats - not a trick or a cheat.

1 Like

Shoddy reporting. The original post has phrases like: “This is pure speculation”, “I am a layman”, and as for SC2 and AI, “I do not claim to be an expert in either topic”, “I can’t prove all of my core claims”.

But in fact, they could try to prove their claims, as the replays are available for analysis on the official website. They just don’t, and instead just kinda make things up. They say in one paragraph that AlphaStar imitates “spam clicking” at a very deep level, but in another they say that when APM was limited during training, it did no micro at all. So which is it? If it’s just learning spam clicking, then why does the author feel entitled to claim that 100% of AlphaStar’s clicks are accurate and useful, but we can immediately discount 50% of the human players’ clicks?
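
Since the replays are public, the click-accuracy claim is actually testable. As a rough illustration, here’s a minimal Python sketch of the kind of analysis I mean: estimating “effective” APM by discarding actions that merely repeat the previous order. The data format and the spam-detection rule are invented for the example; a real analysis would have to parse the actual replay files.

```python
# Hypothetical sketch: estimate "effective" APM from an action list by
# discarding actions that just repeat the previous command on the same
# target. The (timestamp, command, target) format and this crude spam
# filter are made up for illustration, not an established metric.

def effective_apm(actions, game_minutes):
    effective = 0
    prev = None
    for timestamp, command, target in actions:
        if (command, target) != prev:  # spam = repeating the same order
            effective += 1
        prev = (command, target)
    return effective / game_minutes

actions = [
    (0.1, "move", (10, 20)),
    (0.2, "move", (10, 20)),  # spam click: same order repeated
    (0.3, "move", (10, 20)),  # spam click
    (0.5, "attack", "zealot"),
]
print(effective_apm(actions, game_minutes=1.0))  # 2.0: only 2 distinct orders
```

Run something like that over both players’ replays and you’d have an actual number to argue about, instead of asserting 100% for one side and 50% for the other.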

In the demo, the professional commentators repeatedly pointed out flaws in AlphaStar’s micro, such as moments where it attacked its own units. So we get this weird situation where the author suggests, without evidence, that AlphaStar has perfect micro and never wastes clicks, while the video is sitting right there showing that isn’t even true.

Both the original author and the reblogger here have really whiffed.

1 Like

IMHO the proof is in the pudding. The AI went whole hog on Stalkers, a unit that can be nightmarishly effective with perfect micromanagement. They have a twiddly teleport ability, Blink, that a computer can abuse to make sure it never loses a unit in combat: whenever a Stalker’s shield is close to depleted, it can teleport to safety and let the shield recharge.
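
To make that concrete, here’s a toy Python sketch of that shield-triggered Blink loop. To be clear, AlphaStar’s real micro is a learned policy, not a hand-written rule, and every number and name below is illustrative; the point is just that a machine can run this check flawlessly, every tick, across a whole army.

```python
from dataclasses import dataclass

BLINK_THRESHOLD = 0.2  # illustrative: pull a Stalker out below 20% shields

@dataclass
class Stalker:
    x: float
    y: float
    shields: float = 80.0
    max_shields: float = 80.0
    blink_ready: bool = True

    def blink(self, dx: float, dy: float) -> None:
        # Blink is an instant short-range teleport, then a cooldown.
        self.x += dx
        self.y += dy
        self.blink_ready = False

def micro_step(stalker: Stalker, threat_x: float, threat_y: float) -> str:
    """One decision tick: keep attacking, or Blink to safety on low shields."""
    if stalker.blink_ready and stalker.shields / stalker.max_shields < BLINK_THRESHOLD:
        # Jump directly away from the threat; shields recharge while safe.
        stalker.blink(stalker.x - threat_x, stalker.y - threat_y)
        return "blinked to safety"
    return "kept attacking"

s = Stalker(x=0.0, y=0.0, shields=10.0)
print(micro_step(s, threat_x=5.0, threat_y=0.0))  # blinked to safety
```

A human can run that loop too, but only for the units they’re currently looking at; the machine runs it for every Stalker in every fight simultaneously.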

Notably, the human who eventually beat it did so by exploiting its total love of Stalkers: he built a hard-counter unit that can three-shot a Stalker and has no projectile delay, and used them in groups of three.

Basically the AI figured out a total cheese tactic that only works because it has perfect knowledge of the situation and can click perfectly in real time.

That’s not to say it would be totally impossible for a human to micromanage a fight to a frightening degree, but it would require total concentration, and they couldn’t do it in two fights at once, at least not nearly as effectively as the AI can.

Yeah, that’s ridiculously unfair. If the AI can use information other than what a human would see on the screen and hear through the audio, then that’s absolutely cheating. It should be figuring out what’s going on by watching blobs on the mini-map. If humans were allowed to build custom interfaces that showed multiple screens at once, they would use them.

I also think that playing on a “slightly outdated” version of the game might matter. I don’t know exactly what that phrase means, but pro players need to stay on top of the current version of the game, not remember strategies from six months ago.

Maybe this is what you mean by idle-loop behaviour. I always interpreted click spamming as being a little like how an athlete waiting at their end of the field doesn’t stand rigidly still or lie down. I’m not going to claim to properly understand the mechanism, but it does seem easier to stay in motion and redirect that motion towards a useful activity than to stay at rest and suddenly have to spring into action. (Somewhat related: this is one of the reasons I’ve always been very skeptical of human “back-ups” for AI-driven cars. By the time the backup realizes they need to do something, it will be too late to do anything.)

I tend to agree with Rob’s assessment that they “bungled an interesting AI announcement by making claims about it that they didn’t realize were wrong.” Yes, computers can and will beat humans at StarCraft while limited to human click speed and human-available information. But “this will happen” is different from “this did happen”. If “this will happen” were good enough, no one would ever start the project in the first place, and it wouldn’t happen.


My (facetious) take: the headline that it went 10-1 made me think it had played 11 best-of-3 or best-of-5 series against 11 different pros. Which, in turn, made me think, “Couldn’t beat Serral, eh?”

1 Like

That section was a quote from the devs about their process:

Initially they tried training it with a tight APM cap. Those bots never learned to micro:

Oriol Vinyals, the Lead Designer of AlphaStar: Training an AI to play with low APM is quite interesting. In the early days, we had agents trained with very low APMs, but they did not micro at all.

The final bot does not play under the APM cap that the original tests used; it does have excellent micro; but it apparently still engages in periodic episodes of spam clicking; and even after training it was never brought back under a hard APM cap, instead being left with a cap averaged over a window.

That’s not a contradiction, it’s a thesis based on the changes that occurred during the development process: the bot does not appear to have distinguished well enough between meaningful and meaningless actions to work under a realistic APM cap. If capped low during training, it apparently failed to develop micro at all (according to its developers); and even the variant that was allowed more liberal training rules, which did learn micro, apparently couldn’t be weaned back down to a realistic APM cap, despite the fact that being capable of perfect precision and having no need to stay warmed up makes spam clicking completely pointless for the bot (where it might have some uses in humans).

Not a chronology that suggests everything went according to plan; though it certainly achieved better results than the micro-dedicated but myopic examples that have been tried before.
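
For anyone wondering what “a hard APM cap” versus “an averaged window” actually buys the bot, here’s a toy Python sketch. The class names and numbers are mine, not DeepMind’s implementation; the point is that the same average rate behaves very differently when the budget can be saved up and spent in bursts.

```python
from collections import deque

class HardCap:
    """At most `limit` actions in any one second: bursts are impossible."""
    def __init__(self, limit):
        self.limit = limit
        self.recent = deque()  # timestamps of recent allowed actions

    def allow(self, t):
        while self.recent and t - self.recent[0] >= 1.0:
            self.recent.popleft()
        if len(self.recent) < self.limit:
            self.recent.append(t)
            return True
        return False

class AveragedCap:
    """Same average rate, but budgeted over a longer window,
    so the whole budget can be spent in one burst."""
    def __init__(self, limit, seconds):
        self.budget = limit * seconds
        self.seconds = seconds
        self.recent = deque()

    def allow(self, t):
        while self.recent and t - self.recent[0] >= self.seconds:
            self.recent.popleft()
        if len(self.recent) < self.budget:
            self.recent.append(t)
            return True
        return False

hard, avg = HardCap(limit=5), AveragedCap(limit=5, seconds=5.0)
burst = [i * 0.004 for i in range(25)]  # 25 actions in a tenth of a second
print(sum(hard.allow(t) for t in burst))  # 5: the hard cap stops the burst
print(sum(avg.allow(t) for t in burst))   # 25: the averaged cap allows it all
```

Both limiters permit the same 5 actions per second on average, but only the averaged one lets the bot dump 25 actions into the single second where a battle is decided, which is exactly where superhuman speed pays off.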

Exactly my thoughts. Terrible article from an author who has no real understanding of how AI works, or of why the stated issue is a petty complaint rather than a real problem.
And frankly, what they have achieved is so exciting that no amount of hype can do it justice.

1 Like

A review of one of the games.

This topic was automatically closed after 5 days. New replies are no longer allowed.