I've been doing a lot of using, testing, and evaluating LLMs and GPT-style models for generating code and text/prose. Some of it is just general use to see how it behaves, some has been explicit evaluation of creative writing, and a bunch of it is code generation to test out how we need to modify our CS curriculum in light of these new tools.
It's an impressive piece of technology, but it's not very creative. It's meh. The results are meh. Which is to be expected since it's a statistical model that's using a large body of prior work to produce a reasonable approximation of what it's seen before. It trends towards the mean, not the best.
This'd explain why inexperienced users of AI would inevitably get mediocre results. It still takes creativity to get stolen mediocrity.
You have to know how to operate the oven to reheat store bought pie. Generative LLMs are machines like ovens, and turning the knobs is not creativity. Not operating the oven correctly gets you Sharon Weiss results.
I guess a protip is you have to tell it explicitly in the prompt who it's supposed to steal from.
For instance, Midjourney or SD will produce much better results if you put specific ArtStation channel names along with 'artstation' in the prompt.
I'm curious if you've gotten anything decent out of them. I've tried to use it for tech/code questions, and it's been nothing but disappointment after disappointment. I've tried to use it to get help with new concepts, but it hallucinates like crazy and always gives me bad results. Some of the time it's so bad that it gives me answers I've already told it were wrong.
Yeah, I've just set up a hotkey that says something like "back up your answer with multiple reputable sources" and I just always paste it at the end of everything I ask. If it can't find webpages to show me to back up its claims then I can't trust it. Of course this isn't the case with coding, for that I can actually run the code to verify it.
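If you'd rather script the "always paste it at the end" trick than rely on a hotkey, here's a minimal sketch. The function name and the exact suffix wording are made up for illustration (the comment above only says "something like" that sentence), and it just builds the prompt string; it doesn't call any model.

```python
# Hypothetical wording, based on the hotkey text described above.
SOURCE_SUFFIX = "Back up your answer with multiple reputable sources."


def with_source_request(question: str, suffix: str = SOURCE_SUFFIX) -> str:
    """Append the source request to a question unless it is already there."""
    question = question.strip()
    if question.endswith(suffix):
        # Avoid stacking the suffix if the prompt is passed through twice.
        return question
    return f"{question}\n\n{suffix}"


if __name__ == "__main__":
    print(with_source_request("Which PowerShell DSC resources manage Windows features?"))
```

You'd still have to open the links it gives you, since nothing here checks that the cited pages exist.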
What version are you using?
GPT-4 is quite impressive, and the dedicated code LLMs like Codex and Copilot are as well. The latter must have had a significant update in the past few months, as it's become wildly better almost overnight. If trying it out, you should really do so in an existing codebase it can use as a context to match style and conventions from. Using a blank context is when you get the least impressive outputs from tools like those.
I've used GPT-3/3.5, Bing, Bard and Copilot, and I'm not super stoked. Copilot gave me PS DSC items that don't actually exist, which was my most recent attempt at using an LLM.
I might see about figuring out if it can hook into my VS Code instance so it's a bit smarter at some point.
There's an official plug-in to do this that takes like 15 minutes to set up.
I'm using it for an end-of-year AI project for school.
That's where some of the significant advances over the past 12 months of research have been, specifically around using the fine tuning phase to bias towards excellence. The biggest advance there has been that capabilities in larger models seem to be transmissible to smaller models by feeding in output from the larger more complex models.
Also, the process supervision work to enhance CoT from May is pretty nuts.
So while you are correct that the pretrained models come out with a regression towards the mean, there are very promising recent advances in taking that foundation and moving it towards excellence.
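To make the "feeding in output from the larger models" part concrete, here's a rough sketch of the data-collection half of that idea: run prompts through a big "teacher" model and save the prompt/answer pairs as fine-tuning data for a smaller one. Everything named here is an assumption for illustration. `toy_teacher` is a hypothetical stand-in for a real large model, the JSONL layout is just one common convention, and the actual fine-tuning step is not shown.

```python
import json
from typing import Callable, Iterable


def build_distillation_set(
    prompts: Iterable[str], teacher: Callable[[str], str]
) -> list[dict]:
    """Pair each prompt with the teacher's output to form training records."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def to_jsonl(records: list[dict]) -> str:
    """Serialize records one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)


def toy_teacher(prompt: str) -> str:
    # Hypothetical placeholder: a real pipeline would query the large model here.
    return f"Step-by-step answer to: {prompt}"


if __name__ == "__main__":
    records = build_distillation_set(
        ["Reverse a linked list", "Explain big-O notation"], toy_teacher
    )
    print(to_jsonl(records))
```

The smaller model would then be fine-tuned on those records, which is where the capability transfer is supposed to happen.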
I'm excited for how these tools will be used by human creators to accomplish things they could never do alone, and in that aspect it is a revolutionary technology. I hate that their marketing calls it "AI" though. The only intelligence involved is the human user who creates prompts and curates results.