
AI chatbots unable to accurately summarise news, BBC finds (www.bbc.com)

submitted 1 month ago by misk@sopuli.xyz to c/technology@lemmy.world



[-] brucethemoose@lemmy.world 11 points 1 month ago* (last edited 1 month ago)

What temperature and sampling settings? Which models?

I've noticed that the AI giants seem to be encouraging “AI ignorance,” as they just want you to use their stupid subscription app without questioning it, instead of understanding how the tools work under the hood. They also default to bad, cheap models.

I find my local thinking models (FuseAI, Arcee, or Deepseek 32B 5bpw at the moment) are quite good at summarization at a low temperature, which is not what these UIs default to, and I get to use better sampling algorithms than any of the corporate APIs. Same with “affordable” flagship API models (like base Deepseek, not R1). But small Gemini/OpenAI API models are crap, especially with default sampling, and Gemini 2.0 in particular seems to have regressed.
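To make the temperature point concrete, here is a rough sketch of what that setting does to the next-token distribution. The logits are made up for illustration, and min-p is only one example of an alternative sampler, picked as an assumption; nothing here says which samplers or values are actually in use.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.1):
    """Zero out tokens below min_p * (top probability), then renormalise."""
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


# Hypothetical logits for four candidate tokens (not from any real model).
logits = [2.0, 1.0, 0.5, -1.0]

hot = softmax_with_temperature(logits, temperature=1.0)
cold = softmax_with_temperature(logits, temperature=0.2)
filtered = min_p_filter(hot, min_p=0.1)

# Lower temperature piles probability onto the top token, so sampling
# strays less often into unlikely continuations.
print([round(p, 3) for p in hot])
print([round(p, 3) for p in cold])
print([round(p, 3) for p in filtered])
```

At the lower temperature nearly all the probability sits on the top token, which is roughly why summaries drift less. The min-p step trims the unlikely tail instead of reshaping the whole distribution.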

My point is that LLMs as locally hosted tools you understand the mechanics/limitations of are neat, but how corporations present them as magic cloud oracles is like everything wrong with tech enshittification and crypto-bro type hype in one package.

[-] 1rre@discuss.tchncs.de 7 points 1 month ago

I've found Gemini overwhelmingly terrible at pretty much everything. It responds more like a 7B model running on a home PC, or a model from two years ago, than a medium commercial model: it completely ignores what you ask it and just latches on to keywords. It's almost like they've played with their tokenisation, or trained it exclusively for providing tech support where it links you to an irrelevant article or something.

[-] brucethemoose@lemmy.world 1 points 1 month ago* (last edited 1 month ago)

Gemini 1.5 used to be the best long context model around, by far.

Gemini Flash Thinking from earlier this year was very good for its speed/price, but it regressed a ton.

Gemini 1.5 Pro is literally better than the new 2.0 Pro in some of my tests, especially long-context ones. I dunno what happened there, but yes, they probably overtuned it or something.

[-] Imgonnatrythis@sh.itjust.works 1 points 1 month ago

Bing/chatgpt is just as bad. It loves to tell you it's doing something and then just ignores you completely.


this post was submitted on 11 Feb 2025

186 points (98.4% liked)
