You can call me AI

Here’s the summary I got of this thread:

It was instant, so I’m guessing they submit each thread to ChatGPT once a day on the backend?

I guess I wouldn’t call it creepy. These threads are probably scraped for training data anyway, as is everything else that’s public, and I don’t think submitting text to GPT adds additional training data. But it does seem… unnecessary?

Why do we need to optimize our reading of BBS threads? It’s not like I’m looking for CliffsNotes of the Best Books to improve my productivity. Perhaps all conversations and hangouts should just get replaced by AI summaries, and we’d all be happy to read and chat a lot less?

1 Like

Good questions that I couldn’t begin to answer!

Interesting that, unlike other BBS thread summaries I’ve seen, that one includes none of our nyms. Those summaries of what individuals said were often ridiculously inaccurate, so perhaps the program has been tweaked so that it no longer does that?

Edit:

No, it’s still doing that.

2 Likes

[image: hands squeezing a book]

3 Likes

That would make a good gif with the hands squeezing a shrinking version of the book.

1 Like

If it was instant, it means someone TL3+ already ran the AI summary and you’re seeing the most recent result.

1 Like

waitwaitwait, it summarizes the thread?!? that’s…so. stupid. :face_with_peeking_eye: I can’t even…

1 Like

Art tablet company uses AI-generated art in their ad and gets an earful from their customers

2 Likes

Kettle discussion, ca. 18 mins:

2 Likes

They keep calling it training data, and no doubt they build complex tables from it, but I think the full source material is still present in their live data sets. Otherwise, carefully selected prompts wouldn’t spit out near-verbatim copies, and the repeated-word trick wouldn’t do random source dumps.

Rather than one arguably* legitimate use to build the training tables, they’re probably referring back to the original material every time the model builds an answer that touches on it. I think that would really hurt them in copyright cases, so they call it training data.

 * arguments based on “imagine if our program worked like a person” seem kind of dodgy.

4 Likes


“now draw the rest of the fucking owl”

5 Likes

It really isn’t like that, because at the end of the day it’s just a neural network. It doesn’t look up anything while it’s answering; it has nothing more than the weights between the different neurons, which have been trained to output the next statistically-most-likely word.
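
To make “just weights” concrete, here’s a toy sketch in pure numpy (made-up five-word vocabulary, random untrained weights, nothing remotely like a real transformer). The entire “model” is two arrays of numbers, and generating text is a matrix multiply plus picking the highest-scoring word:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat"]
V, D = len(vocab), 8                     # vocabulary size, embedding width

# These two arrays are the entire model. A downloaded LLM checkpoint is
# the same idea at vastly larger scale: tensors of floats, no source text.
embed = rng.normal(size=(V, D))          # token id -> vector
unembed = rng.normal(size=(D, V))        # vector -> score for each next token

def next_token(token_id: int) -> int:
    logits = embed[token_id] @ unembed   # one matrix multiply
    return int(np.argmax(logits))        # take the statistically-likeliest word

tok = vocab.index("the")
for _ in range(4):
    tok = next_token(tok)
    print(vocab[tok], end=" ")
print()
```

With random weights it prints nonsense, of course; the point is that everything the model “knows” has to live inside those arrays.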

If you download an open source LLM, you’re getting nothing more than a collection of weights, just a long, long list of numbers. But those same LLMs can produce verbatim prose copied from somewhere else as well.

But researchers have known for a while that neural networks can overfit their training data, spitting out exactly the data they were given, because that’s what the network has been rewarded for doing.
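
You can see that reward-for-regurgitation dynamic with something far dumber than a neural net. A hedged toy (a bigram counter standing in for “training”, not how an LLM actually works): when the model is only rewarded for predicting the next word of one passage, memorizing the passage is the winning strategy, and greedy decoding spits it back verbatim:

```python
from collections import Counter, defaultdict

passage = "the network has been rewarded for predicting its training text exactly".split()

# "Training": just count which word follows which.
counts = defaultdict(Counter)
for prev, nxt in zip(passage, passage[1:]):
    counts[prev][nxt] += 1

# "Generation": greedy decoding, always the most likely next word.
word, out = passage[0], [passage[0]]
while word in counts:
    word = counts[word].most_common(1)[0][0]
    out.append(word)

print(" ".join(out) == " ".join(passage))  # True: verbatim regurgitation
```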

3 Likes

And it depends on the quality of the training. Like the stories of the neural nets that were “trained to detect hidden tanks” but failed on new data, because they had actually been trained to detect “winter”.

Or this:

Which mentions the training of an AI to detect skin cancers, but it turned out that they’d actually trained it to detect rulers.
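
The ruler story is easy to reproduce synthetically. A hedged sketch with scikit-learn (invented features and numbers, nothing to do with the actual study): give a classifier a weak genuine signal plus a “ruler in frame” feature that co-occurs with malignant labels only in the training set, and accuracy craters once the rulers disappear:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
y_train = rng.integers(0, 2, n)              # 0 = benign, 1 = malignant

signal = y_train + rng.normal(0, 2.0, n)     # weak genuine feature
ruler = y_train + rng.normal(0, 0.1, n)      # "ruler in frame": near-perfect
                                             # proxy, but only during training
X_train = np.column_stack([signal, ruler])
clf = LogisticRegression().fit(X_train, y_train)

# At test time nobody photographs a ruler, so the shortcut vanishes.
y_test = rng.integers(0, 2, n)
X_test = np.column_stack([y_test + rng.normal(0, 2.0, n), np.zeros(n)])

print("train accuracy:", clf.score(X_train, y_train))  # close to 1.0
print("test accuracy: ", clf.score(X_test, y_test))    # near chance
```

The classifier isn’t wrong by its own lights: the ruler really was the best predictor it was ever shown.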

5 Likes

cancers… rulers…

Well they are kinda the same thing really.

/jk

4 Likes

Bleh

1 Like