The big AI models are running out of training data (and it turns out most of the training data was produced by fools and the intentionally obtuse), so this might mark the end of rapid model advancement

[-] bazingabrain@hexbear.net 13 points 5 months ago

I fail to see how synthetic data is good if all it does is make the AI that's used to justify job cuts "better".

[-] frauddogg@lemmygrad.ml 13 points 5 months ago* (last edited 5 months ago)

That's where I'm at. Sure, there might be moderately-beneficial use-cases, maybe; but it doesn't change the fact that there's no such thing as an ethically-trained model, and there's still no such thing as a model that wasn't created based on rampant theft by capitalists, so I consider anything that comes of it fruit of the poison tree.

AI bad until the base that comprises it radically changes, across the board.

[-] lurkerlady@hexbear.net 11 points 5 months ago* (last edited 5 months ago)

Sure, there might be moderately-beneficial use-cases, maybe; but it doesn't change the fact that there's no such thing as an ethically-trained model, and there's still no such thing as a model that wasn't created based on rampant theft by capitalists, so I consider anything that comes of it fruit of the poison tree.

I mean, that's just the case with everything, really. There are a lot of very good use cases that mostly have to do with data manipulation, but the coolest ones are in translation. I think we're approaching a point where small models provide very accurate translations and even translate tone and intent properly, which is far superior to simple dictionary translation methods. I think it's very possible that new phones could be outfitted with tensor cores and you could have a real-time universal translator in your hand, though it'll likely only add 'subtitles' irl for you. AI voice-word recognition has also been very good and can be miniaturized. This is the use case I'm most excited for, personally, as a communist. Currently, translating in a foreign country requires a lot of typing (if you don't have a perfect grasp of the language), and I feel that removes a very human element from conversation. If everyone could locally run a subtitle-translation generation app, it'd be amazing for all of humanity.

There are of course plenty of manufacturing use cases as well. China is spearheading those, though there is some work being done in the US too, in the few industries that remain.

[-] bazingabrain@hexbear.net 10 points 5 months ago

AI bad until the base that comprises it radically changes, across the board.

which won't happen, hence why me and 650k others moved to Cara and gave Meta the finger.

[-] lurkerlady@hexbear.net 9 points 5 months ago* (last edited 5 months ago)

Synthetic data is basically a fancy way of saying 'I'm properly formatting data and reinforcing the AI's good outputs'. Rearranging words, fixing / adding tags, that sort of thing. It's generated with various tools that usually have an LLM or VLM plugged in, though some are as simple as a regex script.
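
To give a rough idea of the 'simple regex script' end of that spectrum, here's a minimal sketch in Python. The comma-separated tag format and the `normalize_tags` name are assumptions made up for illustration, not taken from any particular tool, and real pipelines will differ.

```python
import re


def normalize_tags(raw: str) -> str:
    """Clean up a comma-separated tag string (hypothetical format)."""
    # Split on commas or semicolons, tolerating stray whitespace around them.
    parts = re.split(r"\s*[,;]\s*", raw.strip())
    seen = set()
    cleaned = []
    for part in parts:
        # Lowercase, turn underscores into spaces, collapse repeated spaces.
        tag = re.sub(r"\s+", " ", part.lower().replace("_", " ")).strip()
        # Drop empty entries and duplicates, keeping first-seen order.
        if tag and tag not in seen:
            seen.add(tag)
            cleaned.append(tag)
    return ", ".join(cleaned)


print(normalize_tags("Red_Hair,  red hair ;Outdoors,,SMILING "))
```

With that messy input the script collapses the duplicate tag, drops the empty entry, and leaves a consistently formatted tag line. The fancier tools do the same kind of cleanup but let a model rewrite or add tags instead of relying on fixed rules.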

this post was submitted on 11 Jun 2024
94 points (100.0% liked)

technology
