ChatGPT not that great at bar exam after all

Originally published at: https://boingboing.net/2024/05/30/chatgpt-not-that-great-at-bar-exam-after-all.html

3 Likes

And/or OpenAI just blatantly lying about what their plagiarism engine could do.

7 Likes

shocked philip j fry GIF

6 Likes

If they have lied in this case, a high-profile, peer-reviewed research paper is a REALLY bad place to do so if they planned to get away with it. Academics are going to have fun ripping them to shreds.

I mean, sure, they will still make all the money… gain unprecedented power and influence.
But, yeesh… those academics… vicious.

1 Like

When have they not lied?

I was exasperated when these stories kept coming out over the last couple of years, as there is no way you could pass the shit these things write, let alone give it a high grade.

I can understand why the Sam Altmen and Lone Skums of the world could imagine this is believable, as they peddle bullshit all the time and believe bullshit all the time.

Why do outlets faithfully reprint the publicity material of Potemkin AI?

6 Likes

It’s a research paper. The whole purpose of its publication is to have its truth ascertained.
Sure, they can lie in the press as much as they like, but here it’s going to get found out.

3 Likes

How does it compare with an untrained human taking the test with access to Google?

4 Likes

I have some bad news for you…

8 Likes

No. Its purpose is to lend a scent of validity to a press release.

7 Likes

Quite possibly. That was the point I was wavering my way towards, I guess.

Anyhow, I respectfully withdraw from this debate, owing to the ‘logic chopping fallacy’ and my snark against OpenAI being insufficiently vitriolic for the room. :slight_smile:

1 Like

Any judgement of an ‘A.I.’ based on a “closed book” style of exam is utterly antithetical to how these things are constructed. And/or, as long as there’s mega-money involved, all summary reviews of ‘A.I.’ performance are best assumed to be written by ‘A.I.’s.

2 Likes

Barbri keeps very tight control of its bar exam study materials. Maybe those didn’t get slurped up during LLM training?

1 Like

So… they’re giving ChatGPT a bar exam for which the results are already available online, including model answers for the essay questions? Is that what they’re doing? Because I could fucking nail the bar exam if I had an answer key to begin with!

ETA: Also, the bar exam is hot garbage and fucking pointless. There is no correlation between bar passage and actual competence in practicing law. Essentially the same questions get recycled year after year, with the specific details of the hypothetical questions changed, most of which are bizarre made-up situations that are unlikely to occur in the real world, especially the Property Law questions. Bar prep consists largely of practicing by answering questions from old exams over and over and over again, and practicing the writing skills for the essay questions. Of course this is something an LLM should be good at. But again, there is zero correlation between doing well on the bar and doing well practicing law.

And in case you couldn’t tell, I am currently studying hard for the bar exam, and I might be a teensy bit biased and bitter.

15 Likes

Since AI learns from what it reads and has demonstrated little to no discretion about determining the actual validity of that input, I would assume the more it reads, the less likely it is to know fact from fiction.

2 Likes

Like all good deception, it’s in the details and has a basis in truth. The details get lost in the headline and only turn up in the small text later.

From the abstract (and I heard it on a radio story, so I knew to look for it)

So, not a complete lie. ChatGPT did score in the 90th percentile of some population of test takers. The deception is in which population they’re comparing it to: in this case, mostly people who failed the exam and are taking it a second time. That condition is conveniently left out of the headline, so people assume something completely different, creating a very different and incorrect perception.

“ChatGPT does better on bar exam than most people taking it a second time after failing the first time” doesn’t sound as impressive. :man_shrugging:

Would you trust a car that self-parks if it only parked better than most people who failed their driver’s test the first time, with no knowledge of how it compared to everyone who takes a driving test?
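The percentile arithmetic itself is trivial; everything hinges on which population you compare against. A toy sketch with made-up numbers (not the actual exam data, just illustrative) of how the same raw score can be “90th percentile” against repeat takers and merely average against all takers:

```python
# Toy illustration with invented numbers: the same score lands at very
# different percentiles depending on the comparison population.

def percentile(score, population):
    """Percent of the population scoring at or below `score`."""
    at_or_below = sum(1 for s in population if s <= score)
    return 100 * at_or_below / len(population)

# Hypothetical score distributions (purely illustrative).
repeat_takers = [200, 210, 220, 230, 240, 245, 250, 255, 260, 275]  # mostly prior failers
all_takers    = [240, 250, 260, 265, 270, 280, 290, 300, 310, 320]  # first-time + repeat

gpt_score = 270

print(percentile(gpt_score, repeat_takers))  # 90.0 -> "90th percentile!"
print(percentile(gpt_score, all_takers))     # 50.0 -> middle of the pack
```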

1 Like

I see what you are saying, but (and I don’t know US bar exams, though I do know a bit about other ones) it seems to me that @danimagoo’s point about using a ‘bot trained on the answers is, and forgive me if you think this is hyperbole, but as someone who works in legal education and educational quality assurance I feel a tiny bit invested in being able to use the word, cheating.

It’s Potemkin AI.

It always is.

Every decade.

Every year.

Every month.

2 Likes

Definitely. Completely agree. The original 90th-percentile claim was 100% designed to deceive and make the product look more capable.

These things don’t “understand” anything. They’re good at pattern matching. Giving it all the prior tests and answers and then asking it to use those patterns against a new one should be an easy problem. That they have to fake it to make it look good doesn’t inspire high hopes.

5 Likes

Word.

1 Like


This topic was automatically closed after 5 days. New replies are no longer allowed.