[-] duncesplayed@lemmy.one 2 points 1 year ago

(I think you're arguing from an ethical standpoint whereas OP was arguing legally, but anyway....)

> Theoretically, someone would be able to ask an A.I. to recite an entire book for them

No, that shouldn't happen. If an AI were ever able to recite back its training data verbatim, that AI would be overfitting. It happens by accident sometimes early on in development when your training data is too small and your model is too big, but it's an error, and is something to be avoided and corrected.

The whole point of training is to reach a state where the model can't recite back any of its training data. To get there, the AI is forced to sort of generalize and abstract (sorry for anthropomorphizing) its training data. That's the only way it can generate something new, which is the whole point of the endeavour.

Long story short, if an AI could recite back an entire book, it would by definition be overfitting rather than learning, and it wouldn't resemble any of the popular LLMs we have now, like ChatGPT. (But you may see snippets and pastiches and watermarks show up.)
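You can see this memorization failure mode with even a toy model. Here's a minimal sketch in plain Python, using a character-level Markov chain as a stand-in for an LLM (the corpus string is a placeholder): with training data this small, every context has exactly one continuation, so "generation" is just recitation.

```python
import random
from collections import defaultdict

def train(text, order=8):
    """Character-level Markov model: map each context to its continuations."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed, order=8, length=300):
    out = seed
    while len(out) < length:
        choices = model.get(out[-order:])
        if not choices:
            break
        out += random.choice(choices)
    return out

corpus = "Call me Ishmael. Some years ago, never mind how long precisely..."
model = train(corpus)
sample = generate(model, corpus[:8])
# Every 8-character context in this tiny corpus has exactly one continuation,
# so the "model" recites its training data verbatim:
print(sample in corpus)  # True -- pure memorization, i.e. overfitting
```

Scale the corpus up and contexts start having many possible continuations; the output becomes novel recombination instead of recitation, which is the generalization described above.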

[-] duncesplayed@lemmy.one 2 points 1 year ago* (last edited 1 year ago)

PGP itself is a bit of a mess.

For one thing, there's really only one major/popular implementation of it these days, which is GPG. The codebase is arcane. Pretty major security vulnerabilities pop up constantly. It doesn't have stable funding. Several years ago the entire project almost collapsed when the world discovered it had been kept alive for years by a single maintainer who had neither the time nor the money to do the job. The situation is a little bit better now, but not much.

(For this reason, people are starting to use age instead of gpg: the codebase is much smaller and cleaner, it forces safe defaults, and it doesn't seem to have the same security problems.)

But the bigger problem that was never properly solved with PGP is key distribution. How do you get somebody's key in the first place? Some people put their keys on their own personal (https) webpage, which is fine, but that's not a solution for everyone, and doesn't scale very well. Okay, so you might use a key server, but that has privacy implications (your identity is essentially public to the world) and centralizes everything down to a handful of small "trusted" key servers (since there would be no way to trust key servers in a decentralized way). We should probably just have email servers themselves serve keys somehow, but nobody's put that into the email standard protocols.
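To make the privacy point concrete, here's a minimal sketch using the third-party python-gnupg bindings (the email address and keyserver are placeholders). The point is that anyone on the internet can run this same query:

```python
import gnupg

gpg = gnupg.GPG()
keyserver = "hkps://keyserver.ubuntu.com"

# Anyone can run this search -- which is exactly the privacy problem:
# uploading your key publishes your name and email address to the world.
results = gpg.search_keys("alice@example.com", keyserver)
for key in results:
    print(key["keyid"], key["uids"])

# Importing the key still proves nothing about who actually controls it;
# you'd have to verify the fingerprint out of band.
if results:
    gpg.recv_keys(keyserver, results[0]["keyid"])
```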

The fact that keys expire amplifies all the problems with key distribution, and encourages people to do really unsafe things with keys, like blindly trusting them. You can sign other people's keys to vouch for them (the web of trust), but that doesn't scale very well either.

The key distribution problem is something that things like Signal have "solved" with things like phone number verification, but there's really no clear way to solve it on something totally distributed like email.

[-] duncesplayed@lemmy.one 2 points 1 year ago

It was, but it was (and still is) a Unix tool. I believe POSIX still requires that more be provided (even if it's secretly just less).

The original Unix more could only go forwards. Someone wanted to make something like more that could also go backwards, so he called it less as a joke (because "less" is a "backwards more"). In the 40 years since, everyone's realized that less is much better than the original more, so nobody uses the original any more.

(MS-DOS took the idea of "more" before "less" caught on.)

[-] duncesplayed@lemmy.one 2 points 1 year ago

It doesn't change the larger point that GNU is way bigger than Linux, though. There are a tonne of things that are larger than Linux, and GNU is one of them.

[-] duncesplayed@lemmy.one 2 points 1 year ago

Unfortunately, toner gives you cancer. (If anyone sits close to a laser printer, know that ultrafine particle levels fall back to background levels within 1-2 minutes after printing. Maybe take an extra-long poop when printing.) For someone who doesn't print very often, though, the cancer risk is probably not very large.

[-] duncesplayed@lemmy.one 2 points 1 year ago

Interestingly, someone who hates systemd would have written exactly the same blog post with exactly the same reasons. "Systemd is incredibly versatile and most people, including myself, are unaware of its full potential" could very well be the verbatim slogan of the anti-systemd faction.

[-] duncesplayed@lemmy.one 2 points 1 year ago

Yes, I am strongly opposed to it.

Whether Musk receives public ridicule has absolutely zero impact on Musk, and a huge impact on the rest of us. It ruins the community to obsess over someone for no reason.

[-] duncesplayed@lemmy.one 2 points 1 year ago* (last edited 1 year ago)

In Forth, you can do things like, say, redefine the number 0 to be computed by a function, and all code compiled after that point will silently pick up the new definition (words shadow numbers, so after `: 0 1 ;`, every later `0` pushes 1). Why would you do that? I've never found a legitimate use for it, which is why I hate Forth (and Lisp, for similar reasons). I like static analysis, and I like it when the language prevents me from doing something silly, but I can understand why some people like the elegance and the power rush of one of the god-like languages like Lisp.
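A rough analogue in Python, just to illustrate the hazard (this is not real Forth, where even numeric literals can be shadowed by word definitions): rebinding a name silently changes every caller, and no static check warns you.

```python
def zero():
    return 0

def balance(deposits):
    # Looks up 'zero' at call time, so it sees whatever the name
    # is currently bound to.
    return zero() + sum(deposits)

print(balance([10, 20]))   # 30

zero = lambda: 100          # "redefine 0" at runtime
print(balance([10, 20]))   # 130 -- same code, new behaviour, no warning
```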

[-] duncesplayed@lemmy.one 2 points 1 year ago* (last edited 1 year ago)

BitWarden+PiHole+NextCloud+Wireguard combined will add up to maybe 100MB of RAM or so.

Where it gets tricky, especially with something like NextCloud, is that the performance you see will depend tremendously on what kind of hard drives you have and how much of the data can be cached by the OS. If you have 4GB of RAM, then 3.5GB-ish of that can be used as cache for NextCloud (and whatever else you have that uses considerable storage). If your NextCloud storage is tiny (3.5GB or less), the OS can keep all of it in cache, and you'll see lightning-fast performance. If you have larger storage (and are actually accessing a lot of different files), NextCloud will actually have to touch disk, and if you're on a mechanical (spinning rust) hard drive, you will definitely see the occasional 1-second lag when that happens.

And then if you have something like Immich on top of that....

And then if you have transmission on top of that....

Anything that uses considerable filesystem space will be fighting over your OS's filesystem cache. So it's impossible to say how much RAM is enough: 512MB could be more than enough, and 1TB might not be. It depends on how you're using it and how tolerant you are of cache misses.
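If you want to watch that fight in practice, here's a quick Linux-only sketch that reads /proc/meminfo to see how much of your RAM is currently acting as page cache:

```python
def meminfo():
    """Parse /proc/meminfo into a dict of values in kB."""
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":")
            info[key] = int(value.split()[0])
    return info

m = meminfo()
print(f"Total RAM:  {m['MemTotal'] / 1024:.0f} MiB")
print(f"Page cache: {m['Cached'] / 1024:.0f} MiB")
print(f"Truly free: {m['MemFree'] / 1024:.0f} MiB")
# A large 'Cached' value is good: repeated file reads are served from RAM
# instead of touching the disk.
```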

Mostly you won't have to think about CPU. Most things (like NextCloud) would be using like <0.1% CPU. But there are some exceptions.

Notably, Wireguard (or anything that requires encryption, like an HTTPS server) will have CPU usage that depends on your throughput. Wireguard, in particular, has historically been a heavy CPU user once you get up to like 1Gbit/s. I don't have any recent benchmarks, but if you're expecting to use Wireguard beyond 1Gbit/s, you may need to look at your CPU.
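A back-of-the-envelope way to gauge that ceiling is to time WireGuard's cipher (ChaCha20-Poly1305) in userspace, here via the third-party cryptography package. The in-kernel implementation is considerably faster, so treat this as a rough lower bound for one core:

```python
import os
import time
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

aead = ChaCha20Poly1305(ChaCha20Poly1305.generate_key())
packet = os.urandom(1420)  # roughly one full-size WireGuard payload

n = 50_000
start = time.perf_counter()
for i in range(n):
    # Counter nonce, as WireGuard itself uses; never reuse a nonce per key.
    aead.encrypt(i.to_bytes(12, "little"), packet, None)
elapsed = time.perf_counter() - start

gbps = n * len(packet) * 8 / elapsed / 1e9
print(f"~{gbps:.2f} Gbit/s of AEAD throughput on one core")
```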

[-] duncesplayed@lemmy.one 2 points 1 year ago

To the best of my knowledge, most of the limitations are around allocation. NTFS doesn't allow for extent-based allocation, delayed allocation, uninitialized allocation, etc. It only has one allocation mode, which is the traditional block-at-a-time (actually "cluster"-at-a-time, though NTFS's clusters are roughly block-sized compared to other filesystems), which is now thought to be slightly less than ideal in terms of allocation performance and fragmentation.
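As a point of reference, extent-based filesystems on Linux (ext4, XFS, btrfs) expose uninitialized preallocation directly from userspace; a minimal sketch (the path and size are arbitrary):

```python
import os

path = "/tmp/prealloc.bin"  # placeholder path
fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
try:
    # On an extent-based filesystem this reserves a 1 GiB extent nearly
    # instantly, marked "uninitialized" -- no zeroes are written to disk.
    os.posix_fallocate(fd, 0, 1 << 30)
finally:
    os.close(fd)
print(os.path.getsize(path), "bytes reserved")
```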

And...speaking of fragmentation, I believe NTFS still can't do online defragmentation??? I can't see anything that contradicts this, but it's possible I'm out of date.

There are other small differences. NTFS has unnecessary filename restrictions, like prohibiting `"` and `?` and things. But that's typically less important.

[-] duncesplayed@lemmy.one 2 points 1 year ago

It's predominantly the first one. They have made a few unique design decisions, but it's a fairly conservative, "boring" RISC design. The only remarkable thing I can think of in the core ISA is that there are no condition-code flags (no NZVC bits), so you have to combine the comparison and the branch into one instruction (e.g. `blt a0, a1, label` compares two registers and branches in a single step, instead of a flag-setting compare followed by a conditional jump). But that's not exactly unprecedented (MIPS did something similar).

In the ISA extensions, there is still some instability and disagreement about the best design for some parts. Just the fact that RISC-V is going to have both packed-SIMD and vector instructions is a bit unique, but probably won't make a huge difference.

But overall it's a fairly boring RISC design that is free and open, with no licensing hoops to jump through, and that's the most interesting bit.

[-] duncesplayed@lemmy.one 2 points 1 year ago* (last edited 1 year ago)

There are a few projects here and there that do it, like this one (haven't tried it). Projects like that are generally written for the purposes of a particular academic paper and might not work very well for the general case, but there's no harm in trying one out.

The "random ass websites" you find that do a good job of it probably have put more work into training a good general-purpose model. If you want, you could try doing that yourself, but it would require a fair bit of work.
