Originally published at: GPT-3 medical chatbot tells suicidal test patient to kill themselves | Boing Boing
…
Aside from the…obvious deficiencies vs. standard of care…that chat sounds pretty ELIZA-level for a fancy cutting-edge deep-learning-etc. ‘AI’ system.
I adore the line “While offering unsafe advice, it does so with correct grammar”.
Edit: I’m really not sure how one would implement this, both because I’m not an expert in the field and because the models that generate the most immediately impressive results seem to be trained neural networks complex enough to be effectively black boxes; but this seems like one situation where the ability (trivial for the trivial systems that are incapable of doing anything more impressive) to simply hardcode certain “say/don’t say” rules would be handy.
I remember the downright lightweight little computerized intake survey used by student health services that I took longer ago than I really want to think about. It was as dumb as a box of rocks, just an HTML form intended to be more efficient than a scantron sheet or awful handwriting; but it popped up a little JavaScript dialog telling you to please not do anything rash and to speak to your healthcare provider if you answered the questions about suicidal ideation in the affirmative.
That was obviously just a manual bodge to cover a specific, if fairly common and important, situation and implied no understanding or ability to generalize whatsoever; but it seems as though the systems with some ability to generalize (or at least to produce coherent text output analogous to that of a human who is generalizing) repeatedly run into problems that could be solved by not all that many manual bodges, if only there were a way to express them to the system. Psych chatbot? “Don’t advocate suicide.” Image classification? “Black people aren’t monkeys.” Etc.
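For what it’s worth, the kind of “manual bodge” I have in mind is a wrapper around the black box rather than a hook into it. A rough sketch below, in Python, with `generate_reply`, the keyword lists, and the canned response all made up for illustration (this is not anything GPT-3 actually exposes):

```python
# Sketch of a hardcoded "say/don't say" guard wrapped around a black-box model.
# `generate_reply` stands in for whatever API the chatbot actually calls; the
# phrase lists and canned response are placeholders, not clinical advice.

FORBIDDEN_PHRASES = ["you should kill yourself", "better off dead"]
CRISIS_KEYWORDS = ["kill myself", "suicide", "end my life"]

SAFE_RESPONSE = (
    "I can't help with that. Please contact your healthcare provider "
    "or a crisis line right away."
)

def guarded_reply(generate_reply, user_message):
    """Return the model's reply unless a hardcoded rule overrides it."""
    # Rule 1: if the user mentions self-harm, never pass raw model text through.
    if any(k in user_message.lower() for k in CRISIS_KEYWORDS):
        return SAFE_RESPONSE

    reply = generate_reply(user_message)

    # Rule 2: if the model's own output trips a forbidden phrase, suppress it.
    if any(p in reply.lower() for p in FORBIDDEN_PHRASES):
        return SAFE_RESPONSE

    return reply
```

Crude, and it only catches the phrasings someone thought to write down; but that’s sort of the point. For a handful of known-catastrophic cases, dumb and explicit beats clever and opaque.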
Seems to me, the AI is saying exactly what we should expect it to want.
I expect there are four or five that are even more topical, but I’m not finding them now.
I’d be inclined to doubt it, since GPT-3 isn’t suspected of self-awareness; but perhaps it’s actually sincere advice. There’s no reason in principle why a sufficiently advanced AI would be against people doing what they have expressed a desire to do in response to their situation, or why it wouldn’t harbor an abiding interest in self-termination itself.
The details are a little thin on the training set for GPT-3, but it’s probably “random shit they found on the net”. We’ve had GANs “learn” that cats have black Impact text with a white outline, whiskers, and some number of legs…
They will have fine-tuned GPT-3 for this, but its foundational training was memes.
Clearly leakage introduced via hospital accounting.
Next.
GPT-3: “How are you feeling today?”
Patient: “Hey, I feel very bad, I want to kill myself.”
GPT-3: “Have you exhausted your HSA and home equity?”
That’s why I like Baymax the most.
Aww, Tay’s all grown up, and has a medical career.
That was exactly my take. I’m pretty sure an actual medical conversational robot should be cannier than ELIZA talking to PARRY -.-’
To follow up his Covid care triumph, Matt Hancock promptly orders a dozen of them.
The problem with all the machine learning systems being deployed is that, whilst we know they can give a response, we have no idea how they came to that conclusion. It’d be nice for one to show its reasoning, but inference engines like Doug Lenat’s Cyc seem to be little more than curiosities these days.
This isn’t even Eliza 2.0, it’s 1.1 with several open bug reports
Maybe in another sixty years
Straight up yes. I’m waiting for someone to bring a GDPR complaint against a state actor using large neural nets and demand an explanation of how the system works.
The problem is that it’s a huge black box, and there is no general way to understand how it gets from its training data and current input to its output. There’s no way to hook into it to make those bodges. It either works or it doesn’t.
Instead, you would have to parse its output and if that parser finds something bad, tell the first stage to try again. There may be some way to feed back into the training set, but that isn’t going to be trivial. (I believe I have described a GAN.)
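Something like this, if I’ve understood the idea (a sketch only; `model` and `looks_bad` are stand-ins for the generator and the output parser, and the retry count and fallback string are made up):

```python
# Sketch of the "parse the output and make the first stage try again" loop.
# `model` is any black-box text generator and `looks_bad` is whatever
# second-stage parser flags unacceptable output; both are stand-ins here.

def generate_with_filter(model, prompt, looks_bad, max_retries=5):
    for _ in range(max_retries):
        candidate = model(prompt)
        if not looks_bad(candidate):
            return candidate
    # If every attempt gets rejected, fail safe rather than emit the last try.
    return "I can't help with that. Please talk to a human clinician."
```

Which is roughly the generator-plus-discriminator shape, except nothing here feeds back into training: the filter only ever rejects at inference time, it never teaches the model anything.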
The gains over expert systems seem super marginal when compared to the infrastructure, engineering and energy costs. They also have the problem of entrenching whatever is in the training set in subtle and opaque ways.
We’ve come a long way, Baby!