[-] diz@awful.systems 6 points 1 month ago* (last edited 1 month ago)

Yeah, a new form of apologism that I started seeing online is “this isn’t a bubble! Nobody expects an AGI, it's just Sam Altman; it will all pay off nicely from 20 million software developers worldwide spending a few grand a year each”.

Which is next-level idiotic, besides the numbers just not adding up. There’s only so much open source to plagiarize. It is a very niche activity! It’ll plateau, and then a few months later tiny single-GPU models will catch up to this river-boiling shit.
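
For what it's worth, the back-of-envelope on that story, taking “a few grand” to mean \$2–5k per developer per year (my own reading of the claim, not a figure from anywhere):

$$ 2\times 10^{7}\ \text{developers} \times \$2\text{k–}\$5\text{k}/\text{yr} \approx \$40\text{–}100\ \text{B}/\text{yr} $$

and that's a gross revenue ceiling for the “everyone’s coding assistant” story, not profit.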

The answer to that has always been the singularity bullshit, where the biggest models just keep staying ahead by such a large factor that nobody uses the small ones.

[-] diz@awful.systems 6 points 1 month ago* (last edited 1 month ago)

I think I have a real example: non-hierarchical (or, at least, less hierarchical) arrangements. Anarchy gets equated with chaos.

We ascribe a hierarchy to anything in nature: ants and other hymenoptera, and termites, have supposed "queens"; parent wolves are "alphas"; and so on. Fictional ant-like aliens have brain bugs, or cerebrates, or the like. Even the fucking zombies infected with a variant of the rabies virus get alphas somehow.

Every effort has gone into twisting every view of reality, and every fiction, to align with the ideology.

[-] diz@awful.systems 5 points 2 months ago* (last edited 2 months ago)

Hmm, maybe that was premature - ChatGPT has history on by default now, so maybe that's where it got the idea that it was a classic puzzle?

With history off, it still sounds like it has the problem in the training dataset, but the output is much more bizarre:

https://markdownpastebin.com/?id=68b58bd1c4154789a493df964b3618f1

Could also be randomness.

Select snippet:

Example 1: N = 2 boats

Both ferrymen row their two boats across (time = D/v = 1/3 h). One ferryman (say A) swims back alone to the west bank (time = D/u = 1 h). That same ferryman (A) now rows the second boat back across (time = 1/3 h). Meanwhile, the other ferryman (B) has just been waiting on the east bank—but now both are on the east side, and both boats are there.

Total time

$$ T_2 \;=\; \frac{1}{3} \;+\; 1 \;+\; \frac{1}{3} \;=\; \frac{5}{3}\ \mathrm{hours} \approx 1\,\mathrm{h}\,40\,\mathrm{min}. $$

I have to say, with history off it sounds like an even more ambitious moron. I think their history feature may be sort of freezing bot behavior in time, because the bot sees a lot of its own past outputs, and in the past it was a lot less into shitting LaTeX all over the place when doing a puzzle.

[-] diz@awful.systems 5 points 3 months ago

Jesus Christ on a stick, that's some thrice-cursed shit.

Maybe susceptibility runs in families, culturally. Religion does, for one thing.

[-] diz@awful.systems 6 points 3 months ago* (last edited 3 months ago)

I think this may also be a specific low-level exploit, whereby humans are already biased to mentally "model" anything as having agency (see all the sentient gods that humans invented for natural phenomena).

I was talking to an AI booster (ewww) in another place, and I think they really are predominantly laymen brain-fried by this shit. That particular one posted a convo where, out of 4 arithmetic operations, 2 were of the form "12042342 can be written as 120423 + 19, and 43542341 as 435423 + 18", combined with AI word salad, and he expected this to be convincing.

It's not that this particular person thinks it's a genius; he thinks that it is not a mere computer, and the way it is completely shit at math only serves to prove to him that it is not a mere computer.

edit: And of course they care not for any mechanistic explanations, because all of those imply LLMs are not sentient, and they believe LLMs are sentient. The "this isn't it, but one day some very different system will be" counterargument doesn't help either.

[-] diz@awful.systems 7 points 3 months ago

Yeah, I think it is almost undeniable that chatbots trigger some low-level brain thing. Eliza got a 27% Turing test pass rate. And long before that, humans attributed weather and random events to sentient gods.

This makes me think of Langford’s original BLIT short story.

And also of rove beetles that parasitize ant colonies. These bugs are not ants, but they pass the Turing test for ants - they tap antennae with an ant, the handshake comes out correct, and they get identified as ants from this colony rather than unrelated bugs or ants from another colony.

[-] diz@awful.systems 5 points 5 months ago

I seriously doubt he ever worked anywhere like that, not to mention that he’s too spineless to actually get in trouble IRL.

[-] diz@awful.systems 7 points 1 year ago

Frigging exactly. It's a dumb-ass dead end that is fundamentally incapable of doing the vast majority of things ascribed to it.

They keep imagining that it would actually learn some underlying logic from a lot of text. All it can do is store a bunch of applications of said logic, as in a giant table. Deducing underlying rules instead of simply memorizing particular instances of them is a form of compression; there wasn't much compression going on to begin with, and now that the models are so over-parametrized, there's even less.
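
As a toy contrast between "a giant table of instances" and "the underlying rule" (just an illustration in plain Python, not a claim about how any model represents arithmetic internally): the memorized table only covers inputs it has already seen and grows with the data, while the rule is a small fixed program that generalizes.

```python
# Toy illustration only: memorizing instances of a rule vs. having the rule itself.
import random

# "Training data": a finite sample of addition facts, stored as a giant table.
table = {(a, b): a + b for a in range(100) for b in range(100)}

def add_by_table(a, b):
    """Memorization: only works for pairs that happened to be in the table."""
    return table.get((a, b))  # None for anything unseen

def add_by_rule(a, b):
    """The underlying rule: schoolbook digit-by-digit addition with carry."""
    xs = [int(d) for d in str(a)[::-1]]
    ys = [int(d) for d in str(b)[::-1]]
    out, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        s = (xs[i] if i < len(xs) else 0) + (ys[i] if i < len(ys) else 0) + carry
        out.append(s % 10)
        carry = s // 10
    if carry:
        out.append(carry)
    return int("".join(str(d) for d in reversed(out)))

a, b = random.randint(10**6, 10**7), random.randint(10**6, 10**7)
print(add_by_table(a, b))          # None: outside the memorized table
print(add_by_rule(a, b) == a + b)  # True: the compact rule generalizes
print(len(table))                  # 10000 entries, and it still only covers 0..99
```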

[-] diz@awful.systems 6 points 1 year ago* (last edited 1 year ago)

I tried the same prompt a lot of times and saw "chain of thought" attempts complete with the state modeling... they must be augmenting the training dataset with some sort of script-generated crap.

I have to say those are so far the absolute worst attempts.

Day 16 (Egg 3 on side A; Duck 1, Duck 2, Egg 1, Egg 2 on side B): Janet takes Egg 3 across the river.

"Now, all 2 ducks and 3 eggs are safely transported across the river in 16 trips."

I kind of feel that this undermines the whole point of using the transformer architecture instead of a recurrent neural network. Machine learning sucks at recurrence.

[-] diz@awful.systems 5 points 1 year ago

Well, I'll be damned: they are trying to get it to work as if it had internal state, by augmenting their training dataset with some auto-generated crap. Of course, if machine learning weren't shit at recurrence they wouldn't be using the transformer architecture in the first place, so it doesn't work.

I got the following from ChatGPT:

Janet’s ducks lay 3 eggs per day. Each day, she transports one duck and all the eggs to the other side of the river using a boat that can hold her and one other item (either a duck or an egg). How does she do that?

ChatGPT:

Janet can transport the ducks and eggs across the river using the following strategy:

Day 1 (Duck A, Duck B, Duck C on side A; Egg 1, Egg 2, Egg 3 on side A):
    Janet takes Duck A across the river.

Day 2 (Duck B, Duck C on side A; Egg 1, Egg 2, Egg 3 on side A):
    Janet takes Egg 1 across the river.

Day 3 (Duck B, Duck C on side A; Egg 2, Egg 3 on side A; Duck A on side B):
    Janet takes Duck A back to side A.

Day 4 (Duck C on side A; Egg 2, Egg 3 on side A; Duck A on side A; Egg 1 on side B):
    Janet takes Duck B across the river.

Day 5 (Duck C on side A; Egg 2, Egg 3 on side A; Duck A on side A; Egg 1, Duck B on side B):
    Janet takes Egg 1 back to side A.

Day 6 (Duck C on side A; Egg 2, Egg 3 on side A; Duck A, Egg 1 on side A; Duck B on side B):
    Janet takes Duck A across the river again.

Day 7 (Egg 2, Egg 3 on side A; Duck A, Egg 1 on side A; Duck B on side B; Duck C on side A):
    Janet takes Duck C across the river.

Day 8 (Egg 2, Egg 3 on side A; Duck A, Egg 1, Duck C on side A; Duck B on side B):
    Janet takes Egg 2 across the river.

Day 9 (Egg 3 on side A; Duck A, Egg 1, Duck C on side A; Duck B on side B; Egg 2 on side B):
    Janet takes Duck A back to side A.

Day 10 (Egg 3 on side A; Duck C on side A; Duck A on side A; Duck B, Egg 1, Egg 2 on side B):
    Janet takes Duck B across the river.

Day 11 (Egg 3 on side A; Duck C on side A; Duck A, Duck B on side A; Egg 1, Egg 2 on side B):
    Janet takes Egg 1 back to side A.

Day 12 (Egg 3 on side A; Duck C on side A; Duck A, Egg 1, Duck B on side A; Egg 2 on side B):
    Janet takes Duck A across the river again.

Day 13 (Duck C on side A; Duck A, Egg 1, Duck B on side A; Egg 2 on side B; Egg 3 on side B):
    Janet takes Duck C across the river.

By following this sequence, Janet successfully transports all 3 ducks and all 3 eggs across the river in 13 trips. Each trip involves either transporting one duck or one egg, adhering to the constraint that Janet can only carry one item (duck or egg) with her in the boat.

[-] diz@awful.systems 7 points 1 year ago

> The counting failure in general is even clearer and lacks the excuse of unfavorable tokenization. The AI hype would have you believe just an incremental improvement in multi-modality or scaffolding will overcome this, but I think they need to make more fundamental improvements to the entire architecture they are using.

Yeah.

I think the failure could be extremely fundamental - local optimization of a highly parametrized model may simply be unable to properly learn counting (other than via memorization).

After all, there's a very large number of ways in which a highly parametrized model can do a good job of predicting the next token without doing any actual counting. What makes counting special, versus memorization, is that it is a relatively compact representation - but there's no reason for a neural network to favor compact representations.

The "correct" counting may just be a very tiny local minimum, with tall hill all around it and no valley leading there. If that's the case then local optimization will never find it.

[-] diz@awful.systems 6 points 1 year ago

Well, the problem is it not having any reasoning, period.

It's not clear what symbolic reasoning would entail, but puzzles generally require you to think through several approaches to solve them, too. That requires a world model, a search, etc. - the kind of stuff that actual AIs, even a tic-tac-toe AI, have, but LLMs don't.

On top of that, this all works through machine learning, which produces the resulting network weights through very gradual improvement at next-word prediction, tiny step by tiny step. Even if some sort of discrete model (like, say, an account of what's on either side of the river) could help it predict the next token, there isn't a tiny fraction of a discrete "model" that would help, and so it simply does not go down that path at all.
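
For contrast, this is roughly what "a discrete account of what's on either side of the river, plus search" looks like as an actual program - a sketch of mine using the classic wolf/goat/cabbage constraints as a stand-in puzzle (the duck-and-eggs version has no real constraint to search over), not anything the chatbot is doing internally:

```python
# Explicit world model + breadth-first search for a river-crossing puzzle.
# State = (items on the left bank, which side the farmer is on); legal moves are
# generated from the state, and unsafe states are pruned.
from collections import deque

ITEMS = frozenset({"wolf", "goat", "cabbage"})

def unsafe(bank):
    # A bank left without the farmer is unsafe if something gets eaten there.
    return ("wolf" in bank and "goat" in bank) or ("goat" in bank and "cabbage" in bank)

def solve():
    start = (ITEMS, "left")        # everything starts on the left bank
    goal = (frozenset(), "right")  # everything (and the farmer) on the right
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, side), plan = queue.popleft()
        if (left, side) == goal:
            return plan
        here = left if side == "left" else ITEMS - left
        other = "right" if side == "left" else "left"
        for cargo in list(here) + [None]:  # ferry one item across, or go alone
            new_left = set(left)
            if cargo is not None:
                if side == "left":
                    new_left.discard(cargo)  # cargo rides to the right bank
                else:
                    new_left.add(cargo)      # cargo comes back to the left bank
            new_left = frozenset(new_left)
            left_behind = new_left if side == "left" else ITEMS - new_left
            if unsafe(left_behind):
                continue
            state = (new_left, other)
            if state not in seen:
                seen.add(state)
                queue.append((state, plan + [f"cross {side}->{other} with {cargo or 'nothing'}"]))
    return None

for step in solve():
    print(step)
```

Even this trivial solver has what the LLM lacks: a state it can't misremember and a search that backtracks when a move leads nowhere.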
