OpenAI introduces Sora, its text-to-video AI model
(www.theverge.com)
That remains to be seen. We have yet to see one of these things actually get good at anything, so we don’t know how hard that last part is to do. I don’t think we can assume there will be continuous linear progress. Maybe it’ll take one year, maybe it’ll take 10, maybe it’ll just never reach that point.
Yeah a real problem here is how you get an AI which doesn't understand what it is doing to create something complete and still coherent. These clips are cool and all, and so are the tiny essays put out by LLMs, but what you see is literally all you are getting; there are no thoughts, ideas or abstract concepts underlying any of it. There is no meaning or narrative to be found which connects one scene or paragraph to another. It's a puzzle laid out by an idiot following generic instructions.
That which created the woman walking down that street doesn't know what either of those things are, and so it simply cannot use those concepts to create a coherent narrative. That job still falls onto the human instructing the AI, and nothing suggests that we are anywhere close to replacing that human glue.
Current AI can not conceptualise -- much less realise -- ideas, and so they can not be creative or create art by any sensible definition. That isn't to say that what is produced using AI can't be posed as, mistaken for, or used to make art. I'd like to see more of that last part and less of the former two, personally.
I kinda 100% agree with you on the art part since it can't understand what it's doing... On the other hand, I could swear that if you look at some AI-generated images it's kind of mocking us. It's a reflection of our society in a weird mirror. Like a completely mad or autistic artist that is creating interesting imagery but has no clue what it means. Of course that exists only in my perception.
But in the sense of "inventive" or "imaginative" or "fertile" I find AI images absolutely creative. As such it's telling us something about the nature of the creative process, about the "limits" of human creativity - which is in itself art.
When you sit there thinking up or refining prompts you're basically outsourcing the imaginative visualizing part of your brain. An "AI artist" might not be able to draw well or even have the imagination, but he might have a purpose or meaning that he's trying to visualize with the help of AI. So AI generation is at least some portion of the artistic or creative process but not all of it.
Imagine we could have a brain-computer interface that lets us perceive virtual reality like with some extra pair of eyes. It could scan our thoughts and allow us to "write text" with our brain, and then immediately feed back a visual AI-generated stream that we "see". You'd be a kind of creative superman. Seeing / imagining things in their heads is of course what many people do their whole lives, but not in that quantity or breadth. You'd hear a joke and you would not just imagine it, you'd see it visualized in many different ways. Or you'd hear a tragedy and...
Autists usually have no trouble understanding the world around them. Many are just unable to interface with it the way people normally do.
Well yes, it's trained on human output. Cultural biases and shortcomings in our species will be reflected in what such an AI spits out.
We use a lot of devices in our daily lives, whether for creative purposes or practical. Every such device is an extension of ourselves; some supplement our intellectual shortcomings, others physical. That doesn't make the devices capable of doing any of the things we do. We just don't attribute actions or agency to our tools the way we do to living things. Current AI possesses no more agency than a keyboard does, and since we don't consider our keyboards to be capable of authoring an essay, I don't think one can reasonably say that current AI is, either.
A keyboard doesn't understand the content of our essay, it's just there to translate physical action into digital signals representing keypresses; likewise, an LLM doesn't understand the content of our essay, it's just translating a small body of text into a statistically related (often larger) body of text. An LLM can't create a story any more than our keyboard can create characters on a screen.
Only once (if ever) we observe AI behaviour indicative of agency can we start to use words like "creative" in describing its behaviour. For now (and I suspect for quite some time into the future), all we have is sophisticated statistical random content generators.