[-] NegativeLookBehind@lemmy.world 86 points 5 months ago* (last edited 5 months ago)

I found your email address:

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
[-] exu@feditown.com 17 points 5 months ago

I was about to ruin your day by finding a valid email address that would be rejected by your regex, but it doesn't even parse correctly on regex101.com

The only valid regex for email is .+@.+ btw

[-] NegativeLookBehind@lemmy.world 10 points 5 months ago* (last edited 5 months ago)

It's the RFC standard regex for email, so it's definitely valid, btw. My copy/paste just seems to add a bunch of backslashes for some reason

[-] frezik@midwest.social 5 points 5 months ago

The argument here is that checking complex validation is a fool's errand. Yes, you can write a fully validating regex for RFC email. In fact, it should be possible to write a regex shorter than the one that gets passed around since the 90s, because regular expression engines support recursive patterns now. (Part of the reason that old regex is so complicated is because email allows nested comments (which is insane (how insane? (Lisp levels insane)))).

However, it doesn't get you much of anywhere. What you really want to know is if it's a valid email or not, and the only way to do that is to send an email to that address with a confirmation. The only point of the regex is to throw away obviously bad addresses. For that, checking that there's an @ symbol and something for the user and domain portions is sufficient. I'd add needing a dot in the domain portion, but it's not that important.

Classically, it was argued that emails don't even need a domain portion when things are done for internal systems, or that internal domains don't need a tld. In my personal experience, this is rarely done anymore and can be safely ignored. Maybe some very, very old legacy systems, and if you're working on one of those, then sure. For everyone else, don't worry about it. You're probably working on publicly accessible systems, and even if you're not, most users are going to prefer using their fully spec'd out email address, anyway.
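A minimal sketch of the cheap pre-check described above, in Python. The function name `looks_like_email` and the `require_dot` flag are made up for illustration; the only rules it applies are the ones named in the comment (an @ symbol, something on each side of it, and optionally a dot in the domain).

```python
def looks_like_email(address: str, require_dot: bool = False) -> bool:
    """Throw away obviously bad addresses; this does NOT prove deliverability."""
    # Split on the last '@' so a local part containing '@' still gets through.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    # Optional stricter rule: insist on a dot in the domain portion.
    if require_dot and "." not in domain:
        return False
    return True


if __name__ == "__main__":
    for candidate in ["user@example.com", "user@com", "no-at-sign", "@example.com"]:
        print(candidate, looks_like_email(candidate), looks_like_email(candidate, require_dot=True))
```

Anything that passes still needs a confirmation email before it can be treated as a working address.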

[-] NegativeLookBehind@lemmy.world 1 points 5 months ago

Cool story bro, go argue with the IETF

[-] frezik@midwest.social 2 points 5 months ago

Why? Do you think nested comments are a good idea?

[-] nickwitha_k@lemmy.sdf.org 1 points 5 months ago* (last edited 5 months ago)

Deleted by user.

[-] exu@feditown.com 4 points 5 months ago

Where did you get that "RFC standard" regex? It doesn't allow domain names with one component, as per RFC5321.

Neither does it allow spaces in a quoted string, as per RFC5322.

This, 👋@✉️.gg, is already a working email address in most clients and if RFC6532 ever gets accepted, it would be officially recognized as such.

My point isn't to make your regex look bad, just that it doesn't validate or invalidate an email properly. Nothing stops me from giving you an invalid but syntactically correct email after all.
You have to send an email anyways to verify, so the most you can check is the presence of one @ symbol.
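The two rejections mentioned in this comment can be checked with a short Python sketch. It assumes the regex quoted at the top of the thread is used exactly as posted, with Python's `re` module and no case-insensitive flag; the name `THREAD_REGEX` is made up for illustration.

```python
import re

# The regex from the top of the thread, split into pieces only for readability.
THREAD_REGEX = re.compile(
    r"(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r'|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@'
    r"(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
    r"|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:"
    r"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])"
)


def matches(address: str) -> bool:
    """True if the whole string matches the thread's regex."""
    return THREAD_REGEX.fullmatch(address) is not None


if __name__ == "__main__":
    print(matches("user@example.com"))    # ordinary address
    print(matches("user@com"))            # single-component domain
    print(matches('"a b"@example.com'))   # unescaped space in a quoted string
```

The domain branch requires at least one dot unless the address uses a bracketed literal, and the quoted-string character class skips `\x20`, so a space only gets through when it is backslash-escaped.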

What about "user@not_domain"? It validates but isn't valid - there's no domain part, the @ is quoted

[-] exu@feditown.com 4 points 5 months ago

That's not something you can determine using a regex.

"user@com" for example could be a perfectly working email.

The right way is to send a verification email in every case.

[-] FreshLight@sh.itjust.works 2 points 5 months ago

Does this regex include

"very.(),:;<>[]\".VERY.\"very@\\ \"very\".unusual"@strange.example.com

[-] palordrolap@fedia.io 10 points 5 months ago

That \\. part doesn't look right, but what do I know. Apparently control codes are valid elsewhere, so a literal backslash followed by any character, even a space or a newline, might actually be valid there.

"Yeah, my e-mail address is abc, carriage return, three backspaces and a terminal bell at example dot com. ... What do you mean your mail program doesn't support it?"

this post was submitted on 14 Feb 2025
903 points (98.4% liked)

Programmer Humor