AI Appears to Rapidly Be Approaching Brick Wall Where It Can't Get Smarter (futurism.com)

submitted 1 year ago by BodyBySisyphus@hexbear.net to c/technology@hexbear.net

The big AI models are running out of training data (and it turns out most of the training data was produced by fools and the intentionally obtuse), so this might mark the end of rapid model advancement

[-] queermunist@lemmy.ml 66 points 1 year ago

Oh look, businesses didn't plan for what to do after the low hanging fruit is gone. Shocker.

[-] context@hexbear.net 45 points 1 year ago

the plan was "and then the line goes up forever"

[-] buh@hexbear.net 6 points 1 year ago

They do have a plan for that, it’s to lay everyone off and use the saved money on stock buybacks

[-] frauddogg@lemmygrad.ml 55 points 1 year ago* (last edited 1 year ago)

While synthetic data is a thing, you've really gotta wonder how often you can train a model on basically empty calories before the hallucination rate starts going up.

I, for one, hope the theftbots die.

[-] KnilAdlez@hexbear.net 24 points 1 year ago

I was reading an article about how ChatGPT will sometimes go on existential rants and I figure it's probably because so much of the training data is now generated by LLMs and posted on the internet. probably a glut of people posting "I asked chatGPT what it was like to be a robot" and things of that nature.

[-] SacredExcrement@hexbear.net 7 points 1 year ago

Hopefully they die off before the entire net is just an all consuming ouroboros of this LLM generated garbage

[-] JoeByeThen@hexbear.net 43 points 1 year ago

No, it's not. Maybe strictly for LLMs, but they were never the endpoint. They're more like a Frontal Lobe emulator; the rest of the "brain" still needs to be built. Conceptually, Intelligence is largely about interactions between Context and Data. We have plenty of written Data. In order to create Intelligence from that Data we'll need to expand the Context for that Data into other sensory systems, which we are beginning to see in the combo LLM/Video/Audio models. Companies like Boston Dynamics are already working with and collecting Audio/Video/Kinesthetic Data in the Spatial Context. Eventually researchers are going to realize (if they haven't already) that there's massive amounts of untapped Data being unrecorded in virtual experiences. Though I'm sure some of the delivery/remote driver companies are already contemplating how to record their Telepresence Data to refine their models. If capitalism doesn't implode on itself before we reach that point, the future of gig work will probably be Virtual Turks where, via VR, you'll step into the body of a robot when it's faced with a difficult task, complete the task, and then that recorded experience will be used to train future models. It's sad, because under socialism there's an incredible potential for building a society where AI/Robots and humanity live in symbiosis akin to something like The Culture, but it's just gonna be another cyber dystopia panopticon.

[-] context@hexbear.net 47 points 1 year ago

Intelligence is largely about interactions between Context and Data

me solidarity data-outdoor-cat

intelligence

[-] JoeByeThen@hexbear.net 19 points 1 year ago

data-laughing

[-] QuillcrestFalconer@hexbear.net 24 points 1 year ago

Eventually researchers are going to realize (if they haven't already) that there's massive amounts of untapped Data being unrecorded in virtual experiences.

They already have. A lot of robots are already trained in simulated environments, and NVIDIA is developing frameworks to help accelerate this. This is also how things like AlphaGo were trained, with self-play, and these reinforcement learning algorithms will probably be extended to LLMs.

Also like you said there's a lot of still untapped data in audio / video and that's starting to be incorporated into the models.

[-] JoeByeThen@hexbear.net 16 points 1 year ago

Yeah, I'm familiar with a bunch of autonomous vehicles/drones being trained in simulated environments, but I'm also thinking stuff like VRChat.

[-] reddit@hexbear.net 6 points 1 year ago

My one quibble: that's not the future of gig work, it's the present

[-] JoeByeThen@hexbear.net 6 points 1 year ago

It's been a few years since I've used mturk, but there were very few VR based jobs when I last used it. Has that changed?

[-] reddit@hexbear.net 3 points 1 year ago

Ah sorry, I was just being a smartass, no idea how much VR is on mturk now. To be clear I think you've got an accurately bleak view of the future of this stuff

[-] peppersky@hexbear.net 41 points 1 year ago

"our artificial intelligence has read every book in the world and is still dumb as shit"

[-] Flyberius@hexbear.net 35 points 1 year ago

Please crash already. I need an "All my models, ruined" moment from these fools.

[-] lurkerlady@hexbear.net 35 points 1 year ago* (last edited 1 year ago)

This is accurate, though I am actually going to explain why. These big model companies (Google, ClosedAI, etc) parasitize the open-weights/open-source community that actually makes good LoRAs, fine tunes, and research papers. Consumer hardware simply hasn't gotten good and cheap enough for very good fine tune training, and that's why this is all slowly petering out. In a couple of generations of consumer GPUs, which will be when we get consumer GPUs geared towards AI (i.e. super high VRAM counts of like 70GB+ for an affordable sub-700 USD cost), we might see another leap forward in this tech. Though I will say that this mostly pertains to LLMs; generative AI models like Stable Diffusion have a lot of tricks up their sleeves that can still be explored. Most of the recent research and tweaking has been based around building a structure for the AI to build on, to sort of guide it rather than letting it take random stabs at things, in order to improve outputs. Some people have been doing things like hard coding color theory, framing a photograph, etc, and interpreting human language to trigger that hard code.

We've had statistical models like these since the 50s. Consumer hardware has always been the big materialist bottleneck, this is all powered by small research teams and hobbyist nerds. You can throw a ton of money at it and have a giant research team, but the performance you squeeze out of adding 400b more parameters to your 13b model or having a gigantic locked-down datacenter is going to be diminishing.

Also, synthetic data can be useful. People are hating on it in this thread, but it's a great way to reinforce good habits in the AI and interpret garbled code and speech that would otherwise confuse the AI. I sometimes feel like people just see something about 'AI bad' and upvote it and don't try to understand it, where it is useful and where it is not, and so on.

[-] bazingabrain@hexbear.net 13 points 1 year ago

I fail to see how synthetic data is good if all it does is make the AI that's used to justify job cuts "better".

[-] frauddogg@lemmygrad.ml 13 points 1 year ago* (last edited 1 year ago)

That's where I'm at. Sure, there might be moderately-beneficial use-cases, maybe; but it doesn't change the fact that there's no such thing as an ethically-trained model, and there's still no such thing as a model that wasn't created based on rampant theft by capitalists, so I consider anything that comes of it fruit of the poison tree.

AI bad until the base that comprises it radically changes, across the board.

[-] lurkerlady@hexbear.net 11 points 1 year ago* (last edited 1 year ago)

Sure, there might be moderately-beneficial use-cases, maybe; but it doesn't change the fact that there's no such thing as an ethically-trained model, and there's still no such thing as a model that wasn't created based on rampant theft by capitalists, so I consider anything that comes of it fruit of the poison tree.

I mean that's just the case with everything really. There's a lot of very good use cases that are mostly to do with data manipulation, but the coolest ones are translating. I think we're approaching a point where small models are providing very accurate translations and are even translating tone and intent properly, which is far superior to simple dictionary translation methods. I think it's very possible that new phones could be outfitted with tensor cores and you could have a real-time universal translator in your hand, though it'll likely only add 'subtitles' irl for you. AI voice-word recognition has also been very good and can be miniaturized. This is the use case I'm most excited for, personally, as a communist. Currently translating in a foreign country requires a lot of typing (if you don't have a perfect grasp of the language) and I feel it removes a very human element from conversation. If everyone could locally run a subtitle-translation generation app it'd be amazing for all of humanity.

There's of course plenty of manufacturing use cases as well. China is spearheading those, though there is some work being done in the US too, in the few industries that remain.

[-] bazingabrain@hexbear.net 10 points 1 year ago

AI bad until the base that comprises it radically changes, across the board.

which won't happen, hence why me and 650k others moved to Cara and gave Meta the finger.

[-] lurkerlady@hexbear.net 9 points 1 year ago* (last edited 1 year ago)

Synthetic data is basically a fancy way of saying 'I'm properly formatting data and reinforcing the ai's good outputs'. Rearranging words, fixing / adding tags, that sort of thing. This is generated with various tools that usually have an LLM or VLM plugged in, though some are as simple as a regex script.
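For the "as simple as a regex script" end of that spectrum, here is a rough sketch of what a tag-cleanup pass might look like. The function name `normalize_tags` and the comma-separated caption format are assumptions made up for illustration, not any particular tool's behavior.

```python
import re


def normalize_tags(caption: str) -> str:
    """Clean a comma/semicolon-separated tag caption (hypothetical format).

    Trims whitespace, lowercases, turns underscores into spaces, drops
    empty entries and removes duplicates while keeping the original order.
    """
    # Split on commas or semicolons, tolerating stray whitespace around them.
    raw_tags = re.split(r"\s*[,;]\s*", caption.strip())
    seen = set()
    cleaned = []
    for tag in raw_tags:
        # Collapse runs of whitespace/underscores into a single space.
        tag = re.sub(r"[\s_]+", " ", tag).strip().lower()
        if tag and tag not in seen:
            seen.add(tag)
            cleaned.append(tag)
    return ", ".join(cleaned)


if __name__ == "__main__":
    print(normalize_tags("  Red_Dress ,portrait;; red dress,  Soft   Lighting "))
```

The fancier tools would swap the regex step for an LLM or VLM call, but the shape is the same: take messy captions in, put consistently formatted ones out.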

[-] Infamousblt@hexbear.net 34 points 1 year ago

Because it was never actually intelligent. Calling it AI was just a buzzword

[-] xj9@hexbear.net 33 points 1 year ago

wait are you telling me that the AI revolution was extremely oversold???

[-] D61@hexbear.net 21 points 1 year ago

AI Revoluti-off kelly

[-] davel@hexbear.net 30 points 1 year ago* (last edited 1 year ago)

Spicy autocomplete can produce much more content much faster than we can, and it is consuming its own content now. What could go wrong?

clown-to-clown-communication clown-to-clown-conversation

[-] kleeon@hexbear.net 29 points 1 year ago

this is exactly what halted machine learning research back in the day - there was just not enough data out there to train these models

[-] DragonBallZinn@hexbear.net 28 points 1 year ago* (last edited 1 year ago)

Based. Fuck AI.

Always suspicious when it's one of the few technologies boomers got super hyped up about and wanted to shove into everything.

[-] Owl@hexbear.net 22 points 1 year ago

This entire boom was predicated on being able to throw 10x the compute budget at a problem and get 2x the quality of results, so it was inevitable. It's not like big tech is suddenly funding long-term R&D teams again; they stopped doing that before most of these companies were even founded.
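To put rough numbers on the "10x the compute for 2x the quality" framing: if you take it literally and assume it holds as a power law (quality proportional to compute raised to some exponent, which is an assumption for illustration and not a measured scaling law), the exponent works out to log10(2), about 0.3. The helper name `compute_multiplier` below is hypothetical.

```python
import math


def compute_multiplier(quality_gain: float,
                       compute_step: float = 10.0,
                       quality_step: float = 2.0) -> float:
    """Compute multiplier needed for a given quality multiplier.

    Assumes quality = k * compute**alpha, with alpha fixed by the rule of
    thumb that `compute_step` times the compute buys `quality_step` times
    the quality. This is an illustrative assumption only.
    """
    alpha = math.log(quality_step) / math.log(compute_step)
    return quality_gain ** (1.0 / alpha)


if __name__ == "__main__":
    for gain in (2, 4, 8):
        print(f"{gain}x quality needs roughly {compute_multiplier(gain):,.0f}x compute")
```

Under that assumption every further doubling costs another 10x, so three doublings already mean about a thousand times the compute budget.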

[-] Assian_Candor@hexbear.net 22 points 1 year ago

It would be funny if we hadn't incinerated the planet for this shit. The peddlers will get rich too, zero consequences, except of course for the jobs that were snuffed out in infancy.

[-] aaro@hexbear.net 15 points 1 year ago

reposting my hot AI take

Just because capital can't possibly imagine more than 5 minutes in the future, and just because capital can only speak profit and couldn't fathom progress for the sake of progress, doesn't mean that AI isn't real and scary. The technological hurdles are similar things that have been overcome in past technologies, the incentive to replace workers with machines is just as enticing as it's ever been, and if we've seen noise and fervor like this now with this little of the total reward reaped, expect to continue to see this much noise and fervor until the last drop of blood has been squeezed out.

[-] D61@hexbear.net 14 points 1 year ago

The more social media style posts/comments I read about this "AI" stuff, the more I realize I've been doing the same thing since I was in middle school.

I was reading way above my grade level and would use words (often incorrectly) that I wasn't expected to know with such confidence that adults thought I was smart.

[-] BobDole@hexbear.net 13 points 1 year ago

Looks like we’re right on track for another AI Winter

[-] VILenin@hexbear.net 12 points 1 year ago

Time to start blaming wamen and menorites

[-] Evilphd666@hexbear.net 5 points 1 year ago

We could unleash its full potential if we don't handcuff it. We have to make it safe for all advertisers. Can't have it tell users capitalism is the problem. People might start revolting! porky-scared

[-] sharedburdens@hexbear.net 12 points 1 year ago

shocked-pikachu

[-] Vampire@hexbear.net 11 points 1 year ago

Large Language Models are approaching the limits of their intelligence

AI is not synonymous with LLMs

To get smarter, they'll have to be merged with other AI techniques.

[-] MaxOS@hexbear.net 9 points 1 year ago

all-my-apes-gone

[-] Evilphd666@hexbear.net 8 points 1 year ago

If AI can't generate new and improved information then maybe the I part is a bit disingenuous. Not able to take in new information and make informed decisions. It's a fancy president-parrot-naked

[-] ksynwa@lemmygrad.ml 8 points 1 year ago

Seems like the article is talking specifically about LLMs. I think there is already enough data to train the technology that tech companies purport LLMs to be. They will have to develop the technology further, and I don't know if that is something AI companies are focusing on. It seems like there is only so much one can do with training models to learn what words mean only in relation to other words.

[-] iridaniotter@hexbear.net 6 points 1 year ago

[-] tamagotchicowboy@hexbear.net 5 points 1 year ago

This is why AI is a paper tiger, that and climate change and its handling under capitalism.

this post was submitted on 11 Jun 2024
