this post was submitted on 18 Jul 2024
806 points (99.5% liked)
Technology
I'm still not sold that dynamic text generation is going to be the major near-term application for LLMs, much less in games. Like, don't get me wrong, it's impressive what they've done. But I've also found it to be the least practically useful of the LLM model categories. Like, you can make real, honest-to-God solid usable graphics with Stable Diffusion. You can do pretty impressive speech generation in TortoiseTTS. I imagine that someone will make a locally-runnable music LLM model and software at some point if they haven't yet; I'm pretty impressed with what the online services do there. I think that there are a lot of neat applications for image recognition; the other day I wanted to identify a tree and seedpod. No one has built software to do that yet (that I'm aware of), but I'm sure that someone will; the ability to map images back to text is pretty impressive. I'm also amazed by the AI image upscaling that Stable Diffusion can do, and I suspect that there's still room for a lot of improvement there, as that's not the main goal of Stable Diffusion. And once someone has done a good job of building a bunch of annotated 3D models, I think that there's a whole new world of 3D.
I will bet that before we see that becoming the norm in games, we'll see LLMs regularly used for either pre-generated speech synth or in-game speech synthesis, so that the characters speak text that might be procedurally generated rather than static pre-recorded samples (though not necessarily generated by an LLM). Like, it's not practical to have a human voice actor cover, with static recorded speech, every phrase that one might want an in-game character to speak.
There are some local genAI music models, although I don't know how good they are yet as I haven't tried any myself (Stable Audio is one, but I'm sure there are others).
Also, minor linguistic nitpick, but LLM stands for 'large language model' (you could maybe get away with it for PixArt and SD3, as they use T5 for prompt encoding, which is an LLM; I'm sure some audio models with lyrics use them too). The term you're looking for is probably 'generative'.
I think it's coming pretty fast. There's already a mod for Skyrim that lets you talk to your companion. People are spending hours talking to LLMs and roleplaying; the first triple-A game to incorporate it is going to be a massive hit imo. I'm actually surprised no one's been coming out with visual novels using them, it seems like a perfect use case.
It's definitely going to be used first for making the content of the game like you said though.