submitted 1 week ago* (last edited 1 week ago) by Sebrof@hexbear.net to c/chapotraphouse@hexbear.net

I burned down a forest to confirm

Don't ask it to name an NFL team that doesn't end with 's'

DeepSeek eventually gets it, but its DeepThink mode takes a good ten minutes of racing 'thoughts' and loops to figure it out.

[-] invalidusernamelol@hexbear.net 22 points 1 week ago* (last edited 1 week ago)

A nice portion of the technical work that goes into these models is maintenance of the facade.

If there's an article written about a specific question, they will really quickly go in and just hardcode an answer to it.

Situations like this shouldn't be read as one-off bugs, but as a general criticism of the reasoning methodology behind these models. That problem still hasn't been solved, because the system itself is built in a way that makes monkey patching statistical anomalies the only available fix.

ChatGPT-5 through the web has a temperature greater than 0. The correct behavior is more likely a result of it being a non-deterministic system than of a concentrated effort to crawl the Internet for articles about bugs and hardcode solutions. The first token of the answer to this question will be "yes" or "no", and all further output is likely to support that. Because gpt-5 isn't a CoT model, it can't mimic knowledge of future tokens and almost has to stay consistent with its previous output, so there's a good chance of going either way.
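To make the temperature point concrete, here's a rough sketch of how sampling the first token could work. Everything in it is an assumption for illustration: the two-token vocabulary, the logit values, and the function names are made up and don't come from any real model or API. It only shows that at temperature 0 the higher-scoring token always wins, while at a temperature above 0 the same prompt can open with either "yes" or "no".

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_first_token(tokens, logits, temperature, rng):
    """Pick the first token: greedy when temperature is 0, sampled otherwise."""
    if temperature == 0:
        return tokens[logits.index(max(logits))]
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical first-token scores; not taken from any real model.
TOKENS = ["yes", "no"]
LOGITS = [1.0, 0.6]

def count_first_tokens(temperature, trials=1000, seed=0):
    """Count how often each first token is chosen over repeated runs."""
    rng = random.Random(seed)
    counts = {t: 0 for t in TOKENS}
    for _ in range(trials):
        counts[sample_first_token(TOKENS, LOGITS, temperature, rng)] += 1
    return counts

if __name__ == "__main__":
    print("temperature 0.0:", count_first_tokens(0.0))
    print("temperature 1.0:", count_first_tokens(1.0))
```

With these made-up scores, "no" should come up in roughly 40% of runs at temperature 1.0, so asking the same question a few times and getting different answers is what you'd expect even with no patching involved.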

this post was submitted on 23 Sep 2025
80 points (100.0% liked)
