[-] NegativeLookBehind@lemmy.world 86 points 4 months ago* (last edited 4 months ago)

I found your email address:

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
[-] exu@feditown.com 17 points 4 months ago

I was about to ruin your day by finding a valid email address that would be rejected by your regex, but it doesn't even parse correctly on regex101.com

The only valid regex for email is .+@.+ btw

[-] NegativeLookBehind@lemmy.world 10 points 4 months ago* (last edited 4 months ago)

It's the RFC standard regex for email, so it's definitely valid, btw. My copy/paste just seems to add a bunch of backslashes for some reason

[-] frezik@midwest.social 5 points 4 months ago

The argument here is that complex validation is a fool's errand. Yes, you can write a fully validating regex for RFC email. In fact, it should be possible to write a regex shorter than the one that has been passed around since the 90s, because regular expression engines support recursive patterns now. (Part of the reason that old regex is so complicated is that email allows nested comments (which is insane (how insane? (Lisp levels insane))))."

However, it doesn't get you much of anywhere. What you really want to know is if it's a valid email or not, and the only way to do that is to send an email to that address with a confirmation. The only point of the regex is to throw away obviously bad addresses. For that, checking that there's an @ symbol and something for the user and domain portions is sufficient. I'd add needing a dot in the domain portion, but it's not that important.
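A minimal sketch of that kind of pre-filter, in Python. The function name `looks_like_email` and the `require_dot` switch are made up for illustration; this only throws away obviously bad input and says nothing about whether the address actually works.

```python
def looks_like_email(address: str, require_dot: bool = True) -> bool:
    """Cheap pre-filter: something, an @, something (optionally with a dot)."""
    # Split on the LAST @ so an @ inside the user portion does not confuse us.
    local, at, domain = address.rpartition("@")
    if not (local and at and domain):
        return False
    # The dot requirement is optional, as the comment above says it is
    # "not that important".
    return "." in domain if require_dot else True


print(looks_like_email("user@example.com"))
print(looks_like_email("user@com", require_dot=False))
```

Anything that passes still needs the confirmation email.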

Classically, it was argued that emails don't even need a domain portion when things are done for internal systems, or that internal domains don't need a tld. In my personal experience, this is rarely done anymore and can be safely ignored. Maybe some very, very old legacy systems, and if you're working on one of those, then sure. For everyone else, don't worry about it. You're probably working on publicly accessible systems, and even if you're not, most users are going to prefer using their fully spec'd out email address, anyway.

[-] exu@feditown.com 4 points 4 months ago

Where did you get that "RFC standard" regex? It doesn't allow domain names with only one component, which RFC 5321 permits.

Nor does it allow spaces in a quoted string, which RFC 5322 permits.

This, 👋@✉️.gg, is already a working email address in most clients and if RFC6532 ever gets accepted, it would be officially recognized as such.
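These objections can be checked against the pattern exactly as posted at the top of the thread. A rough sketch in Python, assuming the pattern is meant to match the whole address (hence `fullmatch`); the helper name `accepts` and the sample addresses are made up for illustration.

```python
import re

# The pattern as posted in the thread, split across adjacent raw string
# literals only for readability (Python concatenates them unchanged).
POSTED_PATTERN = (
    r"(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r'|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]'
    r'|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")'
    r"@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
    r"|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?"
    r"|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]"
    r"|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])"
)
POSTED = re.compile(POSTED_PATTERN)


def accepts(address: str) -> bool:
    """True if the posted pattern matches the entire address."""
    return POSTED.fullmatch(address) is not None


for sample in (
    "user@example.com",         # ordinary address
    "user@localhost",           # single-component domain
    '"john doe"@example.com',   # unescaped space inside a quoted string
    "👋@✉️.gg",                  # non-ASCII address
):
    print(sample, accepts(sample))
```

Only the first sample is accepted; the other three are rejected, which matches the points above.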

My point isn't that your regex is bad, just that it doesn't validate or invalidate an email properly. Nothing stops me from giving you an invalid but syntactically correct email after all.
You have to send an email anyways to verify, so the most you can check is the presence of one @ symbol.

What about "user@not_domain"? It validates but isn't valid - there's no domain part, the @ is quoted

[-] exu@feditown.com 4 points 4 months ago

That's not something you can determine using a regex.

"user@com" for example could be a perfectly working email.

The right way is to send a verification email in every case.

[-] palordrolap@fedia.io 10 points 4 months ago

That \\. part doesn't look right, but what do I know. Apparently control codes are valid elsewhere, so a literal backslash followed by any character, even a space or a newline, might actually be valid there.

"Yeah, my e-mail address is abc, carriage return, three backspaces and a terminal bell at example dot com. ... What do you mean your mail program doesn't support it?"

[-] frezik@midwest.social 32 points 4 months ago

It helps if you break it apart into its component parts. Which is like anything else, really, but we've all accepted that regexes are supposed to run together in an unreadable mess. No reason it has to be that way.

[-] marcos@lemmy.world 17 points 4 months ago

If they are Perl regexes, like all regexes are supposed to be, you can have non-semantic whitespace and comments.

But if you are using some system that enforces something different, you are out of luck.
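For what it's worth, Python's `re` module has an equivalent of Perl's /x in `re.VERBOSE`. A small sketch using only the IPv4 fragment of the regex posted above; the name `IPV4` and the layout are just one possible way to write it.

```python
import re

# With re.VERBOSE, whitespace outside character classes is ignored and
# "#" starts a comment that runs to the end of the line.
IPV4 = re.compile(
    r"""
    (?:
        (?: 25[0-5]            # 250-255
          | 2[0-4][0-9]        # 200-249
          | [01]?[0-9][0-9]?   # 0-199
        )
        \.                     # dot after each of the first three octets
    ){3}
    (?: 25[0-5] | 2[0-4][0-9] | [01]?[0-9][0-9]? )   # final octet
    """,
    re.VERBOSE,
)

print(bool(IPV4.fullmatch("192.168.0.1")))
print(bool(IPV4.fullmatch("256.1.1.1")))
```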

[-] frezik@midwest.social 6 points 4 months ago

Not necessarily. For just debugging purposes, you can still break them up to help understand them. Even ignoring that, there are options in languages that don't implement /x.

https://wumpus-cave.net/post/2022/06/2022-06-06-how-to-write-regexes-that-are-almost-readable/index.html

At my company we store our regex in the database with line breaks in it, but when it's actually called to be used those line breaks are stripped out. That way the part of the regex that looks for X can all be on one line and actually readable.
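A rough sketch of that approach in Python. The stored value, the name `load_pattern`, and the simple email-shaped pattern are all hypothetical; a real setup would read the string from the database instead of a constant.

```python
import re

# Hypothetical stored value: one logical piece of the regex per line.
STORED_PATTERN = r"""[^@\s]+
@
[^@\s]+
\.
[^@\s]+"""


def load_pattern(stored: str) -> re.Pattern:
    """Strip the line breaks, then compile what is left as one regex."""
    one_line = "".join(stored.splitlines())
    return re.compile(one_line)


pattern = load_pattern(STORED_PATTERN)
print(pattern.pattern)
print(bool(pattern.fullmatch("user@example.com")))
```

Only line breaks are removed here, so any other whitespace in the stored text stays part of the pattern.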

[-] tyler@programming.dev 10 points 4 months ago

wait... why do you have so many regexes you need to put them in a database???

[-] wise_pancake@lemmy.ca 5 points 4 months ago

The comments flag needs more support.

[-] Kng@feddit.rocks 27 points 4 months ago

I have found chatgpt to be very good at writing regex. I also don't know how to write regex.

[-] Mirodir@discuss.tchncs.de 16 points 4 months ago

In my experience, it is good at simple to medium complexity regex. For the harder ones it starts being quite useless though, at best providing a decent starting point to begin debugging from.

[-] SwordInStone@lemmy.world 6 points 4 months ago

well, you won't get better using chatgpt for it

[-] AngryClosetMonkey@feddit.nl 24 points 4 months ago

Just pop them into regex101 or a similar tool, add sample data, see the mistake, fix the mistake, continue to do other stuff.

[-] WhyJiffie@sh.itjust.works 4 points 4 months ago

Just pop them into regex101 or a similar tool, add sample data, ~~see the mistake, fix the mistake, continue to do other stuff.~~ it works there, pull hair

FTFY

[-] exu@feditown.com 22 points 4 months ago

I usually do

# What we are doing (high level)
# Why we need regex
# Regex step by step
# Examples of matches
regex

And I still rewrite it the next time
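Filled in for the `.+@.+` check mentioned earlier in the thread, that template might look like the sketch below. The wording of each comment and the name `EMAIL_SHAPE` are only illustrative.

```python
import re

# What we are doing (high level):
#   Throwing away input that cannot possibly be an email address.
# Why we need regex:
#   (illustrative) a one-line shape check before sending a confirmation mail.
# Regex step by step:
#   .+   one or more characters (the user part)
#   @    a literal at sign
#   .+   one or more characters (the domain part)
# Examples of matches:
#   user@example.com
#   user@com
# Examples of non-matches:
#   user
#   @
EMAIL_SHAPE = re.compile(r".+@.+")

for sample in ("user@example.com", "user@com", "user", "@"):
    print(sample, bool(EMAIL_SHAPE.fullmatch(sample)))
```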

[-] InnerScientist@lemmy.world 18 points 4 months ago* (last edited 4 months ago)
// abandon all hope ye who commit here
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

Edith: dammit, not the first to post this abomination

[-] marine_mustang@sh.itjust.works 21 points 4 months ago
[-] gofsckyourself@lemmy.world 31 points 4 months ago
[-] echindod@programming.dev 4 points 4 months ago

This is the one I use! Might have to look at regexer though

[-] AI_toothbrush@lemmy.zip 15 points 4 months ago

Me checking my own docs: "this is some voodoo shit, idk how it works"

[-] muntedcrocodile@lemm.ee 3 points 4 months ago

That's what my comments say

[-] MTK@lemmy.world 12 points 4 months ago

Downvoted so that everyone can know I'm cool since I understand regex better than the idiot who made that meme.

[-] Skyrmir@lemmy.world 12 points 4 months ago

Never debug regex, just generate a new one. It's not worth the hassle to figure out not only what it does, but what it was meant to do.

Better yet, just write it out in code, and never use regex. Tis a stupid thing that never should have been made.

[-] potustheplant@feddit.nl 17 points 4 months ago* (last edited 4 months ago)

Hard disagree. The function regex serves in programs like Notepad++ can't be easily replaced by "writing it out in code". With a very small number of characters you can get complex search patterns and capturing groups. It's hard to read but incredibly useful.

[-] kunaltyagi@programming.dev 7 points 4 months ago* (last edited 4 months ago)

Can't upvote twice, have a low effort comment instead

[-] sakodak@lemmy.world 4 points 4 months ago

Regex is a write only language.

[-] fmstrat@lemmy.nowsci.com 9 points 4 months ago

I know I'm weird, but I love regex.

[-] verstra@programming.dev 9 points 4 months ago* (last edited 4 months ago)

If I have a complex regular expression to code into my app, I write it in pomsky, then copy paste the compiled regex to my source file, but also keep the pomsky source nearby. Much more maintainable.

[-] over_clox@lemmy.world 8 points 4 months ago* (last edited 4 months ago)

This is basically code refactoring on a simplified level. You're basically renaming a whole bunch of functions/tokens at once.

Let's say you're renaming the variable 'count' under the method 'buttplug'. First off, what do you rename it to?

You start by replacing every instance of buttplug.count with a unique token, let's say tnuoc.gulpttub.

Then you replace that buttplug with a unique buttplug.

Simple.

[-] Trail@lemmy.world 4 points 4 months ago

Then you replace that buttplug with a unique buttplug.

Rare buttplugs with good affixes are better than unique buttplugs.

[-] vk6flab@lemmy.radio 7 points 4 months ago

There are a few online regex testing tools that will analyse your efforts and give you the opportunity to provide sample data.

[-] LovableSidekick@lemmy.world 7 points 4 months ago

LOL yeah that's about right.

[-] DrDeadCrash@programming.dev 7 points 4 months ago

Aziz! LIGHT!

[-] psmgx@lemmy.world 6 points 4 months ago

Ohhhhh it was this extra '

[-] kamen@lemmy.world 4 points 4 months ago

There are no bugs, it's just not doing what you expect it to be doing...

... which, now that I think of it, can be said about all software in general.

[-] Prime_Minister_Keyes@lemm.ee 3 points 4 months ago
this post was submitted on 14 Feb 2025
903 points (98.4% liked)

Programmer Humor
