From the grifters, to the chanlord fascists, to the pedophiles, to the people whose only crime is just being too cringe, the community is a toxic morass that is completely unsalvageable. And the creepiest part is how it's all just beneath the surface, hidden in plain sight. For example, if you browse checkpoints on civitai you'll inevitably run across galleries where one image shows off how well the checkpoint can do porn, and the next is a SFW picture showing off how well it can generate children, with the creator saying something like "remember to add words like loli and child to the negative prompt when generating NSFW pictures just in case" in the description.
Then there's this absurd gem from a wildcard collection that I poked through to learn how wildcards work (turns out it's literally just a list of options broken up with newlines; and dynamic ones are the same but can be nested until no wildcards or dynamic prompt syntax remains). But before you click this spoiler, I want you to imagine what race.txt contains, really think about what an AI guy would put in there, then see how close you were:
british
czech
european
french
german
hungarian
icelandic
irish
italian
jewish
polish
portuguese
russian
[russian:japanese:0.3]
spanish
swedish
ukrainian
welsh
That's right, it's 100% weird euro brainworms splitting hairs between flavors of european, and one context switching prompt that switches from russian to japanese 30% of the way through generation to make sure that the one non-white entry starts off white. Not sure why he even bothers, since most of the checkpoints are so overtrained on white women that they will always spit out extremely pale figures regardless of what the prompt says.
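For anyone who hasn't poked at wildcards themselves, here's a rough Python sketch of the mechanics described above. It is not the code of any actual extension. The `__name__` token style, the function names, and the rule that a value below 1 in `[from:to:when]` means a fraction of the total steps are all assumptions made for illustration.

```python
import random
import re

# Assumed token style: __name__ refers to the options loaded from name.txt.
WILDCARD = re.compile(r"__([A-Za-z0-9-]+)__")


def parse_wildcard_file(text):
    """A wildcard file is just one option per line; blank lines are skipped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def expand(prompt, wildcards, rng, max_passes=10):
    """Replace wildcard tokens with random options until none remain.

    Options may themselves contain tokens (nesting), so substitution is
    repeated; max_passes guards against wildcards that refer to each other.
    """
    passes = 0
    while WILDCARD.search(prompt):
        if passes >= max_passes:
            raise ValueError("wildcards nested too deeply (possible cycle)")
        prompt = WILDCARD.sub(
            lambda m: rng.choice(wildcards[m.group(1)]), prompt
        )
        passes += 1
    return prompt


def switch_step(when, total_steps):
    """Step at which [from:to:when] swaps 'from' for 'to'.

    Assumption: a value below 1 is a fraction of the total steps,
    anything else is an absolute step number.
    """
    return int(when * total_steps) if when < 1 else int(when)


if __name__ == "__main__":
    wildcards = {
        "race": parse_wildcard_file("british\nczech\n\nwelsh\n"),
        "person": ["a __race__ woman", "a __race__ man"],
    }
    rng = random.Random()
    print(expand("portrait of __person__, oil painting", wildcards, rng))
    print(switch_step(0.5, 20))
```

Under those assumptions, an entry like `[russian:japanese:0.3]` just means the sampler sees "russian" for roughly the first 30% of the steps and "japanese" for the rest, which is why the overall composition gets locked in before the swap.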
Second conclusion: stable diffusion itself is actually a pretty fun toy, as long as you ignore the community. Fighting with it to make it not suck is an engaging challenge, and hitting the generate button to see if you've succeeded is like pulling a slot machine lever. Learning how to control this inscrutable eldritch machine is indisputably fun, despite everything around it.
Third conclusion: stable diffusion is fucking terrifying, and at this point is actually good at what it does with modern checkpoints. SDXL is obviously a step up, but even SD1.5 has been refined to the point that it's starting to lose the obvious tells as long as it's used right. The state it's in now is absurdly different from where it was six months ago and almost unrecognizable from where it was a year ago.
Fourth conclusion: stable diffusion is a horrifyingly addictive skinner box that mainlines psychic damage directly into your brain. It's an infinite gacha machine that you pay for with electricity and time instead of microtransactions. It's like introverted doomscrolling. It's so captivating that it's consumed almost every waking moment of my life for the past week, and I've only escaped by sequestering it onto a linux partition and, while trying to optimize it, breaking my stable diffusion install on windows in a way that would take a conscious effort to fix.
Fifth conclusion/summary: stable diffusion is a cognitohazard being actively shaped by the worst people alive, and there's no solution in sight. There was some faint hope that Nightshade could slow it down, but so far the buzz around that seems to be that it actually improves the models because its concept poisoning introduces noise that prevents overtraining while still helping to refine it, but then that's coming from the stable diffusion community so that's unreliable info at best.
Still, the fact that something open source and completely uncontrollable has become as good as stable diffusion already is, and that there's every indication it will only continue to be refined and improved on, is almost a relief compared to the alternative of it being exactly the same but also the private and fully enclosed property of corporations run by the literal worst people alive. I really can't help but take some solace in the fact that open models are competing effectively with the proprietary ones, and may even win out. I sure as hell don't want to see those OpenAI ghouls come out on top, because even if most of the stable diffusion community is irredeemably awful, at least some of it is just sort of cringe.
This is a wild tangent, but for some reason that idea reminds me of the novelization of Myst of all things. A plot point around the whole "creating worlds by describing them in detail" thing involves the protagonist obsessively describing every minor detail of the setting, then being scolded by his father for not being minimalist and focusing exclusively on the functional parts like "there's air" and "this place is useful and also not on fire or made of poison or some shit like that." The father erases the added lines, yielding worlds that are shitty and don't work right.
For all that it's a rather on-the-nose allegory for writing and scene setting in general, it's eerily similar to how stacking the right added details in a prompt can massively impact the entire image, including unrelated parts, in stable diffusion. Left without them, it just sort of fuzzily makes a generic average that might be ok, if generic, or it might make a limb fold back in on itself, disappear behind a narrow object, and reappear somewhere else entirely like it's a fucking Looney Tunes gag. But painstakingly describing the color and texture of the literal dirt on the ground in the picture can somehow impact and fix the detail and perspective of figures in the scene, like it's trying to make everything match that intricacy and so doesn't fall into the weird impossible contortion and melting zones.
Myst mentioned :D