Is there an accepted definition or is it just all AI generated content?
I saw someone here get hassled about AI slop for posting over-sharpened screenshots of Buffy the Vampire Slayer.
I mean, it was sharpened with AI, so it was at least a factor.
But it wasn't AI generated, so it wasn't slop.
Seems fair TBH
Edit: downvotes for agreeing with something that's not downvoted? Dafuq?
I just downvote anyone bitching about downvotes regardless of how I feel about the content of their comment. lol
You think it's fair to get hassled for AI slop because you posted a real screen shot of a TV show?
For slopping all over it, yes.
You aren't agreeing with their comment.
They say it is dumb to get downvoted for posting a picture that wasn't generated by AI.
So it is literally the opposite of your comment.
I would think that it would only apply to AI-generated images, but I suppose it would depend on the community. In this comm in particular, where all the posts are images, it shouldn't be too tricky to define. As the technology advances it might eventually become impossible to spot them, though...
LLM-generated content in general: images, comments, etc.
I'm not sure language models are capable of generating images
No. LLMs are still what generates images.
Large Language Models generate human-like text. They operate on words broken up as tokens and predict the next one in a sequence. Image Diffusion models take a static image of noise and iteratively denoise it into a stable image.
The confusion comes from services like OpenAI that take your prompt, dress it up all fancy, and then feed it to a diffusion model.
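For anyone curious, here's a toy sketch of that difference. This is not how real models work (real LLMs and diffusion models are large neural networks; the "bigram table" and the fixed target image below are made-up stand-ins), it just shows the shape of each loop: one predicts the next token in a sequence, the other starts from noise and iteratively denoises.

```python
import random

# Toy "LLM": predict the next token from bigram counts.
# (A stand-in for a learned next-token distribution -- illustrative only.)
corpus = "the cat sat on the mat the cat ran".split()
bigrams = {}
for a, b in zip(corpus, corpus[1:]):
    bigrams.setdefault(a, []).append(b)

def next_token(token):
    # Pick a token that followed this one in the corpus.
    return random.choice(bigrams.get(token, corpus))

# Generate text one token at a time, each prediction conditioned on the last.
seq = ["the"]
for _ in range(4):
    seq.append(next_token(seq[-1]))

# Toy "diffusion": start from pure noise and iteratively nudge it
# toward a stable "image" (here just three pixel values).
target = [0.2, 0.8, 0.5]
image = [random.random() for _ in target]  # pure noise
for step in range(10):
    # Each denoising step removes part of the remaining "noise".
    image = [px + 0.5 * (t - px) for px, t in zip(image, target)]
```

Different loops, different training data, different outputs, which is why "it's all LLMs" doesn't really hold up.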
You can't use LLMs to generate images.
That is a completely different beast with its own training set.
Just because both are built with machine learning doesn't mean they're the same.
I think the term you're looking for is "generative AI"
Nope. LLMs are still what's used for image generation. They aren't AI though, so no.
Which part of the image is language?
Dude, it doesn't know what it's looking at. It isn't intelligent. It's just a prediction algorithm called LLMs. It doesn't matter if it's predicting text or pixels. It's all LLMs.
https://botpenguin.com/blogs/comparing-the-best-llms-for-image-generation
You can generate images without ever using any text, by uploading and combining images to create new things.
No LLM will be used in that context.
Holy confidently incorrect
LLMs aren't generating the images. When "using an LLM for image generation", what's actually happening is the LLM talking to an image generation model and then handing you the result.
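That orchestration looks roughly like this sketch. Both functions are hypothetical stand-ins (not real APIs): in a real service the "LLM step" would be a call to a language model that rewrites or expands the prompt, and the "diffusion step" would be a call to a separate image model.

```python
# Hypothetical pipeline: the LLM dresses up the prompt,
# a separate image model actually generates the image.

def llm_rewrite_prompt(user_prompt: str) -> str:
    # Stand-in for the LLM expanding/cleaning the user's prompt.
    return f"highly detailed photo, {user_prompt}, natural lighting"

def diffusion_generate(prompt: str) -> bytes:
    # Stand-in for a diffusion model; returns fake image bytes.
    return f"<image for: {prompt}>".encode()

def generate_image(user_prompt: str) -> bytes:
    detailed = llm_rewrite_prompt(user_prompt)  # LLM step
    return diffusion_generate(detailed)         # image-model step

img = generate_image("a cat on a mat")
```

The LLM never touches pixels; it only produces text that the image model consumes.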
Ironically there's a hint of truth in it though because for text-to-image generation the model does need to map words into a vector space to understand the prompt, which is also what LLMs do. (And I don't know enough to say whether the image generation offered through LLMs just has the LLM provide the vectors directly to the image gen model rather than providing a prompt text).
You could also consider the whole thing as one entity in which case it's just more generalized generative AI that contains both an LLM and an image gen model.
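To illustrate the "map words into a vector space" part mentioned above: an embedding layer turns each word into a vector, and words with related meanings end up near each other. Real models learn these vectors; the ones below are hard-coded toy values just to show the idea.

```python
import math

# Toy hand-written word embeddings (real ones are learned and much larger).
embeddings = {
    "cat": [0.9, 0.1],
    "dog": [0.8, 0.2],
    "car": [0.1, 0.9],
}

def cosine(u, v):
    # Cosine similarity: 1.0 means same direction, near 0 means unrelated.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# In this space, "cat" sits closer to "dog" than to "car".
sim_cat_dog = cosine(embeddings["cat"], embeddings["dog"])
sim_cat_car = cosine(embeddings["cat"], embeddings["car"])
```

A text-to-image model's text encoder does the same kind of mapping over the whole prompt, which is the shared machinery between the two model types.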
What do you think LLM stands for?
Large Language Image
Yes.