[-] INeedMana@piefed.zip 69 points 3 days ago

I can’t report because I haven’t validated them yet… I’m not going to send [the Linux kernel maintainers] potential slop

That's worth pointing out IMO

[-] codeinabox@programming.dev 15 points 3 days ago

Though that quote is followed by this, which indicates at least five of those vulnerabilities were real:

I searched the Linux kernel and found a total of five Linux vulnerabilities so far that Nicholas either fixed directly or reported to the Linux kernel maintainers, some as recently as last week:

[-] entwine@programming.dev 15 points 2 days ago

I wonder how true that is. The author of this blog post seems to just be taking this guy's word for it. Did Anthropic actually confirm the bug exists by trying to trigger it on real systems, or are they assuming it's real because it looks plausible? The report claims you can do it with two cooperating NFS clients, so did they actually do that, or are they just assuming it'll work?

[-] Aatube@lemmy.dbzer0.com 1 point 2 hours ago* (last edited 2 hours ago)

Those are five bugs the kernel maintainers have reviewed and decided to patch (the links are to the commits), not just five bug reports. I think that leans towards "they tested it", or at least "they successfully proved the logic to themselves".

[-] TehPers@beehaw.org 60 points 3 days ago

My favorite kind of graphs are ones where an entire axis is unlabeled:

bugs found vs LLM model

You see this a lot with marketing graphs. They say nothing, but they're designed to convince you that the graphs mean something.

Anyway, it's neat they found and fixed, supposedly, some real bugs. I'm curious how many fake reports they had to sift through to find any real ones.

[-] illusionist@lemmy.zip 10 points 3 days ago* (last edited 3 days ago)

He writes that there were 100, and that he has not yet sifted through them.

But I am wondering the same: how often did he have to run the same command, and how long did it take to find the right prompt?

[-] ATS1312@lemmy.dbzer0.com 43 points 3 days ago

Real vulnerability or hallucinated?

[-] RushLana@lemmy.blahaj.zone 38 points 3 days ago

Given Nicholas Carlini's work at Anthropic, I would wait for another person to confirm this.

The research method is just pointing at the code file by file and asking an LLM whether any vulnerability exists. It reminds me of the person who bugged the ffmpeg devs with vulnerability reports for a niche, non-enabled codec decoder.

[-] far_university1990@reddthat.com 23 points 3 days ago

reminds me of the person who bugged the ffmpeg devs with vulnerability reports for a niche, non-enabled codec decoder.

That was google.

https://itsfoss.com/news/ffmpeg-google-fiasco/

[-] luciole@beehaw.org 3 points 2 days ago

This would be meaningful if the findings were not produced by the corp trying to sell you the product being hyped. Big tech has a history of "faking it till you make it" and I can't help but doubt that this is really just Claude Code mostly autonomously finding issues.

[-] chocrates@piefed.world 4 points 2 days ago

I'm really scared about what AI is going to do to the world, but I think it's here to stay.
Hopefully it's actually finding real bugs.

[-] lambalicious@lemmy.sdf.org -4 points 3 days ago

If I ever received a vuln report from an AI, or other such glorified spreadsheet, I would promptly dismiss it and then wait for a human to discover the bug organically on their own before considering that proof of its actual existence.

[-] Pika@sh.itjust.works 12 points 2 days ago* (last edited 2 days ago)

If the bug is actually legitimate and was verified, I don't think it's a good idea to just wait until someone actually experiences it.

Of course, this depends on the severity of the bug as well. In the case of this article, he was refusing to submit anything until he had actually verified it, but he defo was using the AI as an origin of discovery.

I would prefer those types of reports over blanket AI vulnerability reports that aren't proven. Discrediting a valid bug because it was not human-generated may lighten your workload, but it comes at the cost of your software's security and reliability.

I agree I would throw out reports that are AI-driven and not proven, but if someone did an actual PoC and demonstrated real risk, I wouldn't care whether it was originally AI or not. I would just triage it based on severity like normal.

[-] FauxLiving@lemmy.world 3 points 2 days ago

Letting your users get hacked just to own the AIs is certainly a strategy.

this post was submitted on 03 Apr 2026
29 points (73.0% liked)

Linux


A community for everything relating to the GNU/Linux operating system (except the memes!)
