[-] andrew_bidlaw@sh.itjust.works 13 points 5 days ago

As it learns from our data, no wonder it fucks up at regexps. They are the arcane knowledge not accessible to us mere mortals, nor to LLMs.

[-] ryathal@sh.itjust.works 9 points 5 days ago

If you know even a little about how an LLM works, it's obvious why regex is basically impossible for it. I suspect Perl has similar problems, but no one is capable of actually validating that.

[-] ignotum@lemmy.world 2 points 5 days ago

What do you mean it's impossible for it? I know how LLMs work, but I don't know of any such limitations.

Write me a regex that matches a letter repeated four times, followed by a 3 or 4 digit number

Here’s your regex: ([a-zA-Z])\1{3}\d{3,4}
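
For anyone who wants to check that pattern rather than take it on faith, here is a rough Python sketch. The `matches` helper and the sample strings are made up for illustration, and it assumes the whole string should fit the pattern, so it uses `re.fullmatch` rather than `re.search`:

```python
import re

# Pattern quoted above: one letter, the same letter three more times,
# then a 3- or 4-digit number.
PATTERN = re.compile(r"([a-zA-Z])\1{3}\d{3,4}")


def matches(text: str) -> bool:
    """Hypothetical helper: True if the entire string fits the pattern."""
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    # Illustrative samples only; print each one with its result.
    for sample in ["aaaa123", "ZZZZ4567", "aaab123", "aaaa12", "aaaa12345"]:
        print(sample, matches(sample))
```

Note that the backreference `\1` is case-sensitive, so something like "aAaa123" would not count as the same letter repeated unless you compile with `re.IGNORECASE`.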

[-] ryathal@sh.itjust.works 4 points 5 days ago

They aren't context-aware; they work on statistical probability. An LLM can replicate things it has seen a lot of, like a tutorial regex, but it can't apply that to build a more complicated one. Regex in the wild isn't really standard at all, because it's rarely used to solve common problems. The model has a bunch of random regexes from code it analyzed and will spit out something that looks similar.

this post was submitted on 18 Dec 2024
225 points (98.3% liked)

Programmer Humor
