Among other improvements, the new defaults set --flux_guidance_value=1, which removes the need for CFG nodes at inference, reduces generation time, and slightly improves the image quality of LoRAs.
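To see why skipping CFG at inference saves time, here is a minimal sketch of the standard classifier-free guidance formula. It is not SimpleTuner's or any UI's code; the function name and the toy numbers are made up for illustration. Real CFG needs two model predictions per step (unconditional and conditional), and at a scale of 1 the result is just the conditional prediction, so the second pass can be dropped.

```python
def cfg_combine(uncond, cond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy per-element "predictions" (illustrative values; real models return tensors).
uncond = [0.0, 1.0, -2.0]
cond = [0.5, 0.5, 0.5]

guided = cfg_combine(uncond, cond, scale=3.5)  # needs both predictions
plain = cfg_combine(uncond, cond, scale=1.0)   # collapses to the conditional one

print(guided)
print(plain == cond)
```

With scale 1 the unconditional term cancels out, which is the sense in which a LoRA that works at that setting would not need a separate CFG pass.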

Changelog: https://github.com/bghira/SimpleTuner/releases/tag/v0.9.8.1

Sample LoRA: https://huggingface.co/ptx0/flux-dreambooth-lora-r16-dev-cfg1/blob/main/pytorch_lora_weights.safetensors

Pose picker extension? (poptalk.scrubbles.tech)

I think this was here a while ago, but I lost the post. If I remember correctly, there was an extension where someone had built up a repository of poses, and then in the UI you could go through them and pick out the one you wanted for your image. Does anyone remember that? I have ControlNet and can do poses, but finding them is a bear right now.


From The Hugging Face Model Card:

Not Ready

This is a WIP and not ready for use. This is an early testing version for research and development. You may know what this is and how to use it, if so, feel free, but it will change as I continue to develop it. I plan to do many updates to it frequently. So you may want to set a revision if you intend to use it anyway.
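As a rough illustration of what "setting a revision" can look like, the sketch below builds a direct-download URL on the Hugging Face Hub that points at one fixed revision instead of the moving `main` branch. The repo id, filename, and commit hash are hypothetical placeholders, not the actual model; the helper name is also made up. Hub client libraries generally expose the same idea through a `revision` argument.

```python
from urllib.parse import quote


def pinned_file_url(repo_id, filename, revision="main"):
    """Build a Hub download URL for one file at a fixed revision.

    The revision can be a branch, tag, or commit hash; it is URL-encoded
    so that refs containing slashes still form a single path segment.
    """
    return (
        "https://huggingface.co/"
        f"{repo_id}/resolve/{quote(revision, safe='')}/{filename}"
    )


# Hypothetical repo, file, and commit hash, for illustration only.
url = pinned_file_url(
    "example-user/example-model", "model.safetensors", revision="0123abc"
)
print(url)
```

Pinning to a commit hash means later pushes to the repository will not silently change what gets downloaded.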

What is this?

FLUX.1-schnell is an amazing distilled model with an Apache 2.0 license. However, it is not finetunable. LoRAs, IP adapters, ControlNets, etc., cannot be trained on it because it is distilled. The goal of this project is to finetune a non-distilled version of it that can be used as a training base to train adapters for FLUX.1-schnell.

Current Issues

Since we are breaking the distillation, this model will need many steps and guidance to produce good results. Currently, this model, like the schnell version, does not have guidance embeddings. Because of this (and possibly other factors), images generated with this model will not look great. However, this hopefully will not affect training, since guidance is not used during training. The things trained on this model are intended to be used on the schnell version anyway. I am working on training guidance embeddings for it, but hopefully it will work as a training base without them.


Quoted From Reddit:

Release: https://github.com/bghira/SimpleTuner/releases/tag/v0.9.8

It's here! Runs on 24G cards using Quanto's 8-bit quantisation, or down to 13G with a 2-bit base model for the truly terrifying potato LoRA of your dreams!

If you're after accuracy, a 40G card will do Just Fine, with 80G cards being somewhat of a sweet spot for larger training efforts.
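For a sense of where these numbers come from, here is a back-of-the-envelope sketch of the memory taken by the base model's weights alone at different precisions. It assumes the 12B parameter count quoted for Flux elsewhere on this page; the function name is made up, and the figures ignore text encoders, activations, LoRA weights, and optimizer state, so they are lower bounds rather than real VRAM requirements.

```python
def weight_gb(n_params, bits_per_param):
    """Size of the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 12e9  # assumption: the 12B figure quoted for Flux on this page

for label, bits in [("bf16", 16), ("8-bit", 8), ("2-bit", 2)]:
    print(f"{label}: {weight_gb(N_PARAMS, bits):.0f} GB")
```

Under that assumption, bf16 weights alone would fill a 24G card, while 8-bit weights leave roughly half of it for everything else. The gap between the 2-bit weight size and the quoted 13G is presumably taken up by the parts this estimate leaves out.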

What you get:

  • LoRA, full tuning (but probably just don't do that)
  • Documentation to get you started fast
  • Probably better for just square crop training for now - might artifact for weird resolutions
  • Quantised base model unlocks the ability to safely use Adafactor, Prodigy, and other neat optimisers as a consolation prize for losing access to full bf16 training (AdamWBF16 just won't work with Quanto)

not a fine-tune, but, Flux-fast

frequently observed questions

  • 10k images isn't a requirement for training, that's just a healthy amount of regularisation data to have.

  • Regularisation data with text in it is needed to retain text while tuning Flux. It's sensitive to forgetting.

  • You can finetune either dev or schnell, and you probably don't even need special training dynamics for schnell. It seems to work just fine, but at lower quality than dev, because the base model is lower quality.

  • Yes, multiple 4090s or 3090s can be used. No, it's probably not a good idea to try splitting the model across them - stick with quantising and LoRAs.

thank you

You all had a really good response to my work, as well as respect for the limitations of the progress at that point and optimism about what can happen next.

I'm not sure whether we can really "improve" this state-of-the-art model - probably merely being able to change it without ruining it is good enough for me.

further work, help needed

If any of you would like to take on any of the items in this issue, we can implement them into SimpleTuner next and unlock another level of fine-tuning efficiency: https://github.com/huggingface/peft/issues/1935

The principal improvement for Flux here will be the ability to train quantised LoKr models, where even the weights of the LoRA itself become quantised in addition to the base model.

submitted 3 months ago* (last edited 3 months ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

GitHub user bghira discovered that both Schnell and Dev are distilled from Black Forest Labs' Pro model and probably won't be traditionally tuneable.

submitted 3 months ago* (last edited 3 months ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Visit the site to move the slider and select the other options from the drop-down menu.

submitted 3 months ago* (last edited 3 months ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Flux is an open source 12B parameter text-to-image model by Black Forest Labs—the original team behind Stable Diffusion. Needs 24GB of VRAM to run.

FLUX.1 [dev]: The base model, open-sourced with a non-commercial license for the community to build on top of. https://huggingface.co/black-forest-labs/FLUX.1-dev

FLUX.1 [schnell]: A distilled version of the base model that operates up to 10 times faster. Apache 2 Licensed. https://huggingface.co/black-forest-labs/FLUX.1-schnell

Blog: https://blackforestlabs.ai/announcing-black-forest-labs/

GitHub: https://github.com/black-forest-labs/flux

Stable Diffusion


Discuss matters related to our favourite AI Art generation technology
