Originally published at: Watch Google's new Gemini AI in mind-blowing demo
…
I wonder if people are forcing this thing to give feedback on their dick pics yet.
If it squeaks, you should probably see a doctor.
I feel like they are trying to make a Spock.
Wow, what a neat toy. It still isn’t AI.
Yeah, if they’re trying to convince us that it’s so next level or whatever, they didn’t use especially practical examples.
I would be very curious to know how long it took to film. It could have been hundreds of sessions to capture all that.
Maybe one that’s in kindergarten. That’s what it comes off as here.
That’s my take as well.
Even if it was all in one take, these things are typically designed to walk a very thin line that highlights the good stuff without exposing all the deeply frustrating stuff.
I’ve spent a couple of decades in software. I’ve watched numerous demos and launches from companies where I was employed. Many times, I knew the people who designed the demos. There’s always a bit of stage magic involved.
Yeah, they’re showing us the successes here. Meanwhile Google Bard is apparently consistently hallucinating, at least when it comes to biographies of contemporary public figures, which should be pretty straightforward when they’re essentially just scraping Wikipedia or some bio page somewhere. It really makes me think the actual failure rate here is… substantial.
I wonder if they go for silly/frivolous uses in demos so that when it fails, it doesn’t undermine faith in the tech (and even for the audience not seeing those failures, you don’t think about the possibility, because it doesn’t matter). If it fails to identify a rubber duck, you can laugh it off, but if a demo simulates an actual use case and it fails to identify a cancer cell in a slide, or a defective part on an assembly line, or a mechanic’s tool or whatever, you realize that kind of failure renders the tech useless at best and actively dangerous at worst.
My take here is cuteness; see, no need to be afraid of your machine overlord…
Yeah, that’s part of it / related to it - you neither think about what would happen if it were to fail (in a serious situation), nor about whether it really succeeded…
And the other worrisome part, re: hallucination: what happens when you don’t know whether it succeeded or failed?
It certainly meets some of the requirements for AI laid out by John McCarthy, who coined the term. It’s not an artificial general intelligence, but it is definitely performing tasks that we associate with intelligence in animals.
That’s why its failures make it useless, overall. It really doesn’t have to hallucinate very often to make its output completely unreliable and therefore worthless - if a human has to check all the output, you might as well have used a human to do the work to begin with.
I’m just listening to Matteo Flora (a security/IT expert) explaining how the Gemini video is mostly a “fake” - the actual results of the “Ultra” version (announced, but not yet available) on standardized tests are at least suspect.
You can watch him on YT, in Italian with auto-translated English captions.
As a rule, he does not mince words, but some of the more colorful expressions might get lost in translation.
Here is a brief summary (taken from the video description and translated by me):
- The benchmark is against the April version of GPT-4, not the Turbo version, which radically changed its capabilities.
- The benchmark compares GPT-4 at 5-shot (i.e. five worked examples included in the prompt) with Gemini at 32 (see the sketch below this list for what that difference amounts to). Thanks to the d*ck, it performs better… (an Italian vulgar expression for something obviously expected)
- The benchmark uses DIFFERENT INSTRUCTIONS for GPT and Gemini, optimized for Gemini.
- The video is borderline false advertising: in the fine print it’s clearly stated that the footage has been cut, that neither audio nor video was used but only still images, that the questions differ from the ones shown in the video, and that many guiding questions needed to get those answers have been REMOVED. Again, TTTD…
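To make the @5 vs. @32 point concrete, here’s a minimal sketch of what an n-shot prompt looks like, assuming the plain “number of worked examples included in the prompt” reading of those figures (the names here, like `build_n_shot_prompt`, are made up for illustration, not anything from the benchmark itself):

```python
def build_n_shot_prompt(examples, question, n):
    """Prepend n worked example Q/A pairs before the real question."""
    shots = [f"Q: {q}\nA: {a}" for q, a in examples[:n]]
    shots.append(f"Q: {question}\nA:")
    return "\n\n".join(shots)

# Toy example pool; a real benchmark would draw these from the dataset itself.
examples = [(f"example question {i}", f"example answer {i}") for i in range(32)]

prompt_for_gpt4 = build_n_shot_prompt(examples, "the benchmark question", n=5)
prompt_for_gemini = build_n_shot_prompt(examples, "the benchmark question", n=32)
# The second prompt simply carries far more guidance, which is the point above:
# the two models weren't given the same amount of help.
```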
TIL: summery is a real word.