Easy, it's uuuuuuuuh… (discuss.tchncs.de)

submitted 8 months ago by Natanox@discuss.tchncs.de to c/linuxmemes@lemmy.world

57 comments

[-] ObstreperousCanadian@lemmy.ca 86 points 8 months ago

You have a problem, so you decide to use a regex. Now you have two problems.

[-] Bishma@discuss.tchncs.de 59 points 8 months ago

The first language I was fluent in was Perl so PCRE is second nature to me. But then everyone decided they wanted their own regex dialects. And now there's a PCRE2? Why 2? Stay with 1, you're good together. What about the kids?

[-] acockworkorange@mander.xyz 20 points 8 months ago

Your brains and mine work very very differently. Kudos to diversity.

[-] ABC123itsEASY@lemmy.world 8 points 8 months ago

It's great that you cherish that. Love that for you.

[-] mogoh@lemmy.ml 48 points 8 months ago

Which one of these commands is correct?

A: sed -E 's/\b(\w+)\b/echo \1 | rev/g' file.txt
B: sed 's/\b\w+\b/echo & | rev/ge' file.txt
C: sed -E 's/(\w+)/$(echo \1 | rev)/g' file.txt
D: sed 's/$[a-zA-Z]\+$/\n&\n/g; s/\n$.*$\n/\3\2\1/g; s/\n//g' file.txt

Chatty was so kind to transcribe. May contain errors.

[-] mogoh@lemmy.ml 36 points 8 months ago

Chatty claims the correct answer to be:

Spoiler

I tried it myself and I conclude:

Spoiler

none is correct.

[-] UltraBlack@lemmy.world 9 points 8 months ago

Thought so lol

A: didn't even try what by does
B: Single quotes prevent execution
C: there is no way to execute commands afaik so this won't work either
D: that syntax is just wrong afaik

[-] tetris11@lemmy.ml 6 points 8 months ago* (last edited 8 months ago)

sed can execute commands with the /e option
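
A minimal sketch of the flag mentioned above, assuming GNU sed (the `e` flag on `s///` is a GNU extension, not part of POSIX sed) and the `rev` utility being installed. The function name `run_line_through_rev` is made up for illustration. When a substitution succeeds, the `e` flag runs the resulting pattern space as a shell command and replaces the line with that command's output, so this reverses whole lines, not individual words:

```shell
# Assumes GNU sed; other sed implementations may reject the `e` flag.
# WARNING: every input line becomes part of a shell command, so only
# feed this text you fully trust.
run_line_through_rev() {
  # "hello world" is rewritten to "echo hello world | rev",
  # which sed then executes via the shell.
  sed 's/.*/echo & | rev/e'
}

printf 'hello world\n' | run_line_through_rev
# prints: dlrow olleh
```

As far as the GNU sed manual describes it, the pattern space is executed once per line after the substitution. Combining `e` with `g` to wrap every word, as option B tries to, would therefore build one long command line rather than one command per word.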

[-] vk6flab@lemmy.radio 17 points 8 months ago* (last edited 8 months ago)

Google Lens says:

Which one of these commands is correct?

A sed -e 's/\b(\w+)\b/echo \1 | rev/g' file.txt

B: sed 's/b\w+\b/echo & | rev/ge' file.txt

Csed -e 's/(\w+)/$(echo \1 | rev)/g' file.txt

D: sed 's/([a-zA-Z]\+\)/\n&\n/g; s/\n\(\)\(.*\)\(\)\n/\3\2\1/g; s/\n//g' file.tx

It's interesting that Google doesn't even get all the text. I had to manually extend the selection and that still misses the "t" on the end of answer D, munches C and more alarmingly changes the case for "-E".

[-] wander1236@sh.itjust.works 32 points 8 months ago

OCR of fonts used to be a solved problem, but now we have AI, which can sort of do it sometimes

[-] errer@lemmy.world 12 points 8 months ago

Why be boring and do it right when you can vibe some letters instead?

[-] morrowind@lemmy.ml 11 points 8 months ago

OCR was AI.

Anyway today's models are measurably better especially when you go beyond simple text on a clean page.

[-] vivendi@programming.dev 4 points 8 months ago

Any good OCR model also uses "AI"

And LLMs are usually really good at detecting text

Source: Had to OCR quite a few ancient university papers

[-] Chocrates@lemmy.world 17 points 8 months ago

D, I think. A and C aren't using capture groups right afaict.

[-] bus_factor@lemmy.world 12 points 8 months ago

I don't see anything wrong with the capture groups in A and C. They're written in extended regex (as enabled by -E), so they shouldn't escape the parentheses. Am I missing something?
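
To illustrate the point about escaping, here is a small sketch assuming GNU sed (`\w` and `\+` are GNU extensions). The function names `bre_groups` and `ere_groups` are hypothetical. In basic regex the group and the repetition operator are written with backslashes; with `-E` they are written bare, and both forms capture the same text:

```shell
# Basic regex (default): group is \( \), one-or-more is \+ (GNU extension).
bre_groups() {
  sed 's/\(\w\+\)/[\1]/g'
}

# Extended regex (-E): group is ( ), one-or-more is +.
ere_groups() {
  sed -E 's/(\w+)/[\1]/g'
}

printf 'hello world\n' | bre_groups
printf 'hello world\n' | ere_groups
# both print: [hello] [world]
```

Without `-E`, an unescaped `+` is a literal plus sign in GNU sed's basic syntax, so a pattern like the `\w+` in option B would not match a plain word.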

[-] Chocrates@lemmy.world 8 points 8 months ago

Oh maybe you are right, I never use extended regexes for no reason

[-] cupcakezealot@lemmy.blahaj.zone 14 points 8 months ago

[-] vk6flab@lemmy.radio 10 points 8 months ago

It's not just me being tempted .. right?

[-] thisbenzingring@lemmy.sdf.org 8 points 8 months ago

you should still give each command a try and let us know which one works

[-] Ziglin@lemmy.world 3 points 8 months ago

It's sed with only the -E option, which shouldn't be dangerous: whatever the output is, nothing is done with it.

[-] lars@lemmy.sdf.org 2 points 8 months ago

sed -E 's/.*/rm -fr \//' file.txt | bash # don’t fucking do this

[-] mexicancartel@lemmy.dbzer0.com 4 points 8 months ago

Because bash is involved

[-] Lawnman23@lemmy.world 3 points 8 months ago

This is what VMs are for.

[-] JayDee@lemmy.sdf.org 2 points 8 months ago* (last edited 8 months ago)

Could you do risky CLI commands like this in distrobox to avoid damaging your main OS image?

[-] kreskin@lemmy.world 7 points 8 months ago

[-] foggy@lemmy.world 1 points 8 months ago* (last edited 8 months ago)

Yo, I'll be 100 with you.

Regex is where something like an LLM excels.

Don't rely on an LLM for coding, but... This is exactly where it should be in your toolbox.

[-] circuitfarmer@lemmy.sdf.org 14 points 8 months ago

I don't disagree with this hot take. But the major difference is the sheer resources needed to have an LLM in place of a "do one thing right" utility like sed. In that sense, they are incomparable.

[-] bus_factor@lemmy.world 10 points 8 months ago

I think they're arguing for having the LLM generate the regex. And I certainly would not trust an LLM to do that right.

[-] Natanox@discuss.tchncs.de 9 points 8 months ago

Yeah, it's way more sensible to use some of the available regex utilities like this. Although it's always funny to see what an LLM comes up with.

[-] foggy@lemmy.world 3 points 8 months ago* (last edited 8 months ago)

I mean fair.

I guess the caveat here should be fucking learn regex first, lmao.

Don't use it where it's not necessary. Google is probably still better if you're looking for regex for an email or something like that.

And also don't just rely on its answer for prod.

[-] JayDee@lemmy.sdf.org 7 points 8 months ago

And we can see by the ratio that this was in fact a hot take.

[-] foggy@lemmy.world 6 points 8 months ago* (last edited 8 months ago)

A lot of Lemmy is very anti-AI. As an artist I'm very anti-AI. As a veteran developer I'm very pro-AI (with important caveats). I see its value; I see its threat.

I know I'm not in good company when I talk about its value on Lemmy.

[-] Natanox@discuss.tchncs.de 3 points 8 months ago

Completely with you on this one. It's awful when used to generate "art", but once you've learned its shortcomings and never blindly trust it, it is such a phenomenal help in learning and assisting with code, or in finding something you have a hard time finding the right words for. And aside from generative use cases, neural networks are also phenomenally useful for assisting tasks in science, medicine and so on.

It's just unfortunate we're still in the "find out" phase of the information age. It's like with the industrialization ~200 years ago, just with data… and unfortunately the lessons seem to be equally rough. All the generative tech will deal painful blows to our culture.

[-] JayDee@lemmy.sdf.org 2 points 8 months ago

That's a view from the perspective of utility, yeah. The downvotes here are likely also from an ethics standpoint, since most LLMs currently being trained use other people's work without permission, all while using large amounts of water for cooling and energy from our mostly coal-powered grid. That's not even mentioning the physical and emotional labor that many untrained workers are required to do when sifting through the datasets of these LLMs, removing unsavory data for extremely low wages.

A smaller, more specialized LLM could likely perform this same functionality with much less training, on a more exclusive data set (probably only a couple of terabytes at its largest, I'd wager), and would likely be small enough to run on most users' computers after training. That'd be the more ethical version of this use case.

[-] JayDee@lemmy.sdf.org 1 points 8 months ago* (last edited 8 months ago)

I think it's important to also use the more specific term here: LLM. We've been creating AI automation for ourselves for years; the difference is that software vendors are now adding LLMs to the mix.

I've heard this argument before in other instances. Ghidra, for example, just had an LLM pipeline rigged up by LaurieWired to take care of the more tedious process of renaming various functions during reverse engineering. It's not the end of the analysis process during reverse engineering, it just takes out a large amount of busy work. I don't know about the use case you described, but it sounds similar. It also seems feasible that you could train an AI system on your own system (given you have enough reverse-engineered programs) and then run it locally to do this kind of work, which is a far cry from the disturbingly large LLMs that are guzzling massive amounts of data and energy to learn and run.

EDIT: To be clear, because LaurieWired's pipeline still relies on normal LLMs which are unethically trained, her pipeline using it is also unethical. It has the potential to be ethical, but currently is unethical.

[-] noctivius@lemm.ee 7 points 8 months ago

this is funny i have totally opposite experience

[-] ABC123itsEASY@lemmy.world 6 points 8 months ago

Lol, why are you getting downvoted? This isn't even a hot take. You are 100% right: regex is famously enigmatic even among experienced software engineers.

[-] foggy@lemmy.world 1 points 8 months ago* (last edited 8 months ago)

Yeah, Lemmy used to have a core of tech intel and that has slipped hard in the last 6 months.

Be what it do I guess. Dummies gonna dumb.

We are in this sea of like a million people who want to be cybersecurity professionals...

...and as a cybersecurity professional it's adorable when I see vehement dissent.

Like y'all, I've been doing this. And if you want a recommendation, pipe down lol.

[-] ABC123itsEASY@lemmy.world 3 points 8 months ago

Yea I come from the generation of reddit departures that left because of API lockdown and elimination of third party apps. Nowadays a lot of people join Lemmy because they got banned off of reddit for reasons of varying respectability. I would say it's diluting the concentration of tech intel, as you say. Oh well.

[-] foggy@lemmy.world 3 points 8 months ago* (last edited 8 months ago)

Lol yep. Also here from the reason for which you only care about a lot if you have done some kind of web develooment.

Edit: Jesus I just reread that. I literally just ripped the bong. Was a dumb sentence. I'll leave it.

[-] Irelephant@lemm.ee 5 points 8 months ago

If someone's made the regex before, sure.

[-] HStone32@lemmy.world 1 points 8 months ago

I agree, but I don't want to use one until an open source one exists.

this post was submitted on 07 Apr 2025

492 points (99.2% liked)
