ChatGPT not that great at bar exam after all

Originally published at: https://boingboing.net/2024/05/30/chatgpt-not-that-great-at-bar-exam-after-all.html

3 Likes

And/or OpenAI just blatantly lying about what their plagiarism engine could do.

7 Likes

shocked philip j fry GIF

6 Likes

If they have lied in this case, a high-profile, peer-reviewed research paper is a REALLY bad place to do so if they planned to get away with it. Academics are going to have fun ripping them to shreds.

I mean, sure, they will still make all the money… gain unprecedented power and influence.
But, yeesh… those academics… vicious.

1 Like

When have they not lied?

I was exasperated when these stories kept coming out over the last couple of years, as there is no way you could pass the shit these things write, let alone give it a high grade.

I can understand why the Sam Altmen and Lone Skums of the world could imagine this is believable, as they peddle bullshit all the time and believe bullshit all the time.

Why do outlets faithfully reprint the publicity material of Potemkin AI?

6 Likes

It’s a research paper. The whole purpose of its publication is to have its truth ascertained.
Sure, they can lie in the press as much as they like, but here it’s going to get found out.

3 Likes

How does it compare with an untrained human taking the test with access to Google?

4 Likes

I have some bad news for you…

8 Likes

No. Its purpose is to lend a scent of validity to a press release.

7 Likes

Quite possibly. That was the point I was wavering my way towards, I guess.

Anyhow, I respectfully withdraw from this debate, owing to the ‘logic chopping fallacy’ and my snark against OpenAI being insufficiently vitriolic for the room. :slight_smile:

1 Like

Any judgement of an ‘A.I.’ based on a “closed book” style of exam is utterly antithetical to how these things are constructed. And/or, as long as there’s mega-money involved, all summary reviews of ‘A.I.’ performance are best assumed to be written by ‘A.I.’s.

2 Likes

Barbri keeps very tight control of its bar exam study materials. Maybe those didn’t get slurped up during LLM training?

1 Like

So… they’re giving ChatGPT a bar exam for which the results are already available online, including model answers for the essay questions? Is that what they’re doing? Because I could fucking nail the bar exam if I had an answer key to begin with!

ETA: Also, the bar exam is hot garbage and fucking pointless. There is no correlation between bar passage and actual competence in practicing law. Essentially the same questions get recycled year after year, with the specific details of the hypothetical questions changed, most of which are bizarre made-up situations that are unlikely to occur in the real world, especially the Property Law questions. Bar prep consists largely of practicing by answering questions from old exams over and over and over again, and practicing the writing skills for the essay questions. Of course this is something an LLM should be good at. But again, there is zero correlation between doing well on the bar and doing well practicing law.

And in case you couldn’t tell, I am currently studying hard for the bar exam, and I might be a teensy bit biased and bitter.

15 Likes

Since AI learns from what it reads and has demonstrated little to no discretion about determining the actual validity of that input, I would assume the more it reads, the less likely it is to know fact from fiction.

2 Likes

Like all good deception, it’s in the details and has a basis in truth. The details get lost in the headline and only turn up in the small text later.

From the abstract (and I heard it on a radio story, so I knew to look for it)

So, not a complete lie. ChatGPT did score in the 90th percentile of some population of test takers. The deception is in which population they’re comparing it to: in this case, mostly people who failed the exam and are taking it a second time. That condition is conveniently left out of the headline, so people assume something completely different, creating a very different and incorrect perception.

“ChatGPT does better on bar exam than most people taking it a second time after failing the first time” doesn’t sound as impressive. :man_shrugging:

Would you trust a car that self-parks if it only parked better than most people who failed their driver’s test the first time, with no knowledge of how it compared to everyone who takes a driving test?
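The percentile arithmetic itself is trivial; everything hinges on which population you compare against. A toy sketch with made-up numbers (not the actual exam data, just illustrative) of how the same raw score can be “90th percentile” against repeat takers and merely average against all takers:

```python
# Toy illustration with invented numbers: the same score lands at very
# different percentiles depending on the comparison population.

def percentile(score, population):
    """Percent of the population scoring at or below `score`."""
    at_or_below = sum(1 for s in population if s <= score)
    return 100 * at_or_below / len(population)

# Hypothetical score distributions (purely illustrative).
repeat_takers = [200, 210, 220, 230, 240, 245, 250, 255, 260, 275]  # mostly prior failers
all_takers    = [240, 250, 260, 265, 270, 280, 290, 300, 310, 320]  # first-time + repeat

gpt_score = 270

print(percentile(gpt_score, repeat_takers))  # 90.0 -> "90th percentile!"
print(percentile(gpt_score, all_takers))     # 50.0 -> middle of the pack
```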

1 Like

I see what you are saying, but (and I don’t know US bar exams, though I do know a bit about other ones) it seems to me that @danimagoo’s point about using a ‘bot trained on the answers is, and forgive me if you think this is hyperbole, but as someone who works in legal education and educational quality assurance I feel a tiny bit invested in being able to use the word, cheating.

It’s Potemkin AI.

It always is.

Every decade.

Every year.

Every month.

2 Likes

Definitely. Completely agree. The original 90th-percentile claim was 100% designed to deceive and make the product look more capable.

These things don’t “understand” anything. They’re good at pattern matching. Giving it all the prior tests and answers and then asking it to use those patterns against a new one should be an easy problem. That they have to fake it to make it look good doesn’t inspire high hopes.

5 Likes

Word.

1 Like


This topic was automatically closed after 5 days. New replies are no longer allowed.