The RL is so good that Grok changed its personality just by changing a small part of its system prompt.
Llama 3.3 was good, though. For multimodal, Llama 4 also uses the Llama 3.2 approach, where image and text go into a single model instead of using CLIP or SigLIP.
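Roughly, the single-model (early fusion) idea looks like this. Just a conceptual PyTorch sketch with made-up dims, not the actual Llama 4 code:

```python
import torch
import torch.nn as nn

class EarlyFusionInput(nn.Module):
    """Sketch: project image patches into the same embedding space as
    text tokens, then feed one combined sequence to one transformer.
    (Hypothetical dimensions; not the real Llama 4 implementation.)"""
    def __init__(self, vocab_size=32000, d_model=4096, patch_dim=14 * 14 * 3):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)  # text tokens
        self.patch_proj = nn.Linear(patch_dim, d_model)   # image patches

    def forward(self, token_ids, image_patches):
        text = self.tok_emb(token_ids)        # (B, T_text, d_model)
        img = self.patch_proj(image_patches)  # (B, T_img, d_model)
        # One sequence, one model -- no separate CLIP/SigLIP encoder
        return torch.cat([img, text], dim=1)
```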
They've got the whole Twitter database. It's kinda the same with Gemini. But somehow Meta isn't catching up; maybe their Llama 4 architecture isn't that stable to train.
There's a new project sharing fine-tuned ModernBERT models for various tasks. Here's the org: https://huggingface.co/adaptive-classifier
It changed after Grok 3
Lots of developers chose to write in CUDA, as ROCm support back then was a mess.
No, you can run SD- and Flux-based models inside KoboldCpp. You can try it out using the original KoboldCpp in Google Colab. It loads GGUF models. Related discussion on Reddit: https://www.reddit.com/r/StableDiffusion/comments/1gsdygl/koboldcpp_now_supports_generating_images_locally/
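If I remember right, KoboldCpp exposes an A1111-style image endpoint once an image model is loaded, so from Python it's roughly this (port, flag name, and endpoint are from memory, double-check your version's docs):

```python
import base64
import requests

# Assumes a local KoboldCpp instance started with an image model loaded
# (e.g. via its --sdmodel flag) and listening on the default port 5001.
resp = requests.post(
    "http://localhost:5001/sdapi/v1/txt2img",
    json={"prompt": "a watercolor fox", "steps": 20,
          "width": 512, "height": 512},
)
resp.raise_for_status()
# A1111-compatible responses return base64-encoded images
image_b64 = resp.json()["images"][0]
with open("out.png", "wb") as f:
    f.write(base64.b64decode(image_b64))
```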
Edit: Sorry, I kinda missed the point; maybe I was sleepy when writing that comment. Yeah, I agree that LLMs need a lot of memory to run, which is one of their downsides. I remember someone doing a comparison showing that an API with token-based pricing is cheaper than running it locally. But running image generation locally is cheaper than an API with step+megapixel pricing.
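That comparison is basically amortization math. Toy version, all numbers are made-up placeholders:

```python
# Back-of-envelope, hypothetical numbers -- plug in your own.
api_price_per_mtok = 0.50        # $/1M tokens (token-based API pricing)
tokens_per_month = 20_000_000    # monthly usage
api_cost = api_price_per_mtok * tokens_per_month / 1_000_000

gpu_price = 1600                 # one-off GPU purchase ($)
months_amortized = 36            # spread over 3 years
power_cost_per_month = 15        # electricity ($)
local_cost = gpu_price / months_amortized + power_cost_per_month

print(f"API:   ${api_cost:.2f}/month")    # $10.00/month
print(f"Local: ${local_cost:.2f}/month")  # ~$59.44/month
```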
Skywork downfall
There is a koboldcpp-rocm fork; KoboldCpp itself has basic image generation. https://github.com/YellowRoseCx/koboldcpp-rocm
So something like