Running SimpleTuner/Flux on a Linux instance

Special thanks to the official SimpleTuner Flux quickstart (https://github.com/bghira/SimpleTuner/blob/main/documentation/quickstart/FLUX.md), on which much of this tutorial is based. A reminder that only the basics are covered here.

Go to https://dashboard.tensordock.com/deploy_preconfig and select the 1x H100 configuration. Adding an SSH key is recommended:

Deploy and then SSH into the instance with the provided command. Once inside, you may see the following:

Run "sudo reboot". This will restart the VM. Wait a couple of minutes and connect again. Once you are back in, run "sudo apt update" and then "sudo apt upgrade", in that order. You might come across this screen:

Use your Tab key to navigate to "<Ok>" and press Enter to continue. Now check your Python version:

python --version

You should see Python 3.10 or 3.11. Next, run the following command:

sudo apt -y install nvidia-cuda-toolkit libgl1-mesa-glx

If libgl1-mesa-glx is not found (it has been replaced on newer Ubuntu releases), run this instead:

sudo apt -y install nvidia-cuda-toolkit libgl1-mesa-dri
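Before moving on, you can sanity-check both prerequisites with a short script. This is just a sketch; it assumes python3 is on your PATH and that nvcc lands there after the toolkit install:

```shell
# Sanity-check the prerequisites installed above.
py_ver=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
case "$py_ver" in
  3.10|3.11) echo "Python $py_ver: supported" ;;
  *)         echo "Python $py_ver: create the venv with python3.11 explicitly" ;;
esac

if command -v nvcc >/dev/null 2>&1; then
  cuda_status="found"
  nvcc --version | tail -n 1
else
  cuda_status="missing"
  echo "nvcc is not on PATH - the toolkit install may need a re-login to take effect"
fi
```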

With all this set up, now we can clone the SimpleTuner repository and set up a virtual environment. Run the following commands:

git clone --branch=release https://github.com/bghira/SimpleTuner.git

cd SimpleTuner

# if python --version showed 3.11, you can also just use the 'python' command here.
python3.11 -m venv .venv

source .venv/bin/activate

pip install -U poetry pip

poetry install --no-root

Flux needs a newer build of diffusers than the one Poetry installs, so replace it with the current development version from GitHub:

pip uninstall -y diffusers

pip install git+https://github.com/huggingface/diffusers
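You can confirm which diffusers build is now active. A small check, assuming you are still inside the activated venv:

```shell
# Print the installed diffusers version, without failing if the import breaks.
if python3 -c 'import diffusers' >/dev/null 2>&1; then
  diffusers_ver=$(python3 -c 'import diffusers; print(diffusers.__version__)')
else
  diffusers_ver="not-importable"
fi
echo "diffusers: $diffusers_ver"
```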

Now that we have our dependencies, we can set up SimpleTuner itself through its interactive configuration script. The repository ships configure.py in its root; if your checkout does not have it, copy it from https://github.com/bghira/SimpleTuner/blob/main/configure.py. Run it with

python configure.py

It will ask you a series of questions. All of the configuration-related questions have default answers; use the defaults where possible. Afterwards, config/config.env should look like this:

RESUME_CHECKPOINT='latest'
DATALOADER_CONFIG='config/multidatabackend.json'
ASPECT_BUCKET_ROUNDING='2'
TRAINING_SEED='42'
USE_EMA='false'
USE_XFORMERS='false'
MINIMUM_RESOLUTION='0'
OUTPUT_DIR='output/models'
USE_DORA='false'
USE_BITFIT='false'
PUSH_TO_HUB='true'
PUSH_CHECKPOINTS='true'
MAX_NUM_STEPS='10000'
NUM_EPOCHS='0'
CHECKPOINTING_STEPS='500'
CHECKPOINTING_LIMIT='5'
HUB_MODEL_NAME='simpletuner-lora'
TRACKER_PROJECT_NAME='lora-training'
TRACKER_RUN_NAME='$(date +%s)'
DEBUG_EXTRA_ARGS='--report_to=wandb'
MODEL_TYPE='lora'
MODEL_NAME='black-forest-labs/FLUX.1-dev'
FLUX='true'
KOLORS='false'
STABLE_DIFFUSION_3='false'
STABLE_DIFFUSION_LEGACY='false'
FLUX_LORA_TARGET='all+ffs'
TRAIN_BATCH_SIZE='1'
USE_GRADIENT_CHECKPOINTING='true'
GRADIENT_ACCUMULATION_STEPS='2'
CAPTION_DROPOUT_PROBABILITY='0.1'
RESOLUTION_TYPE='area'
RESOLUTION='1024'
VALIDATION_SEED='42'
VALIDATION_STEPS='500'
VALIDATION_RESOLUTION='1024x1024'
VALIDATION_GUIDANCE='7.5'
VALIDATION_GUIDANCE_RESCALE='0.0'
VALIDATION_NUM_INFERENCE_STEPS='20'
VALIDATION_PROMPT='A photo-realistic image of a cat'
ALLOW_TF32='false'
MIXED_PRECISION='bf16'
OPTIMIZER='adamw_bf16'
LEARNING_RATE='8e-5'
LR_SCHEDULE='polynomial'
LR_WARMUP_STEPS='100'
ACCELERATE_EXTRA_ARGS=''
TRAINING_NUM_PROCESSES='1'
TRAINING_NUM_MACHINES='1'
VALIDATION_TORCH_COMPILE='false'
TRAINER_DYNAMO_BACKEND='no'
TRAINER_EXTRA_ARGS='--lora_rank=64 --lr_end=1e-8 --gradient_precision=unmodified --compress_disk_cache'
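A couple of the values above combine into quantities worth knowing before you start: the effective batch size and the number of checkpoints the run will write. A quick arithmetic sketch using the numbers from this config:

```shell
# Derived quantities from the config above.
TRAIN_BATCH_SIZE=1
GRADIENT_ACCUMULATION_STEPS=2
MAX_NUM_STEPS=10000
CHECKPOINTING_STEPS=500
CHECKPOINTING_LIMIT=5

# Gradients accumulate over 2 micro-batches before each optimizer step.
effective_batch=$((TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS))
# A checkpoint is written every 500 steps over the full 10000-step run.
total_checkpoints=$((MAX_NUM_STEPS / CHECKPOINTING_STEPS))

echo "effective batch size: $effective_batch"
echo "checkpoints written:  $total_checkpoints (only $CHECKPOINTING_LIMIT newest kept on disk)"
```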

After that we can download the dataset we will be using. First navigate to your output directory. If you used the defaults in the configuration step, it will be:

output/models

Create a file called multidatabackend.json (also place a copy in SimpleTuner/config). Note that instance_data_dir must point to the directory of your dataset, so it will differ if you chose a custom output directory:

[
    {
        "id": "pseudo-camera-10k-flux",
        "type": "local",
        "crop": true,
        "crop_aspect": "square",
        "crop_style": "center",
        "resolution": 512,
        "minimum_image_size": 512,
        "maximum_image_size": 512,
        "target_downsample_size": 512,
        "resolution_type": "pixel",
        "cache_dir_vae": "cache/vae/flux/pseudo-camera-10k",
        "instance_data_dir": "/home/user/SimpleTuner/output/models/datasets/pseudo-camera-10k",
        "disabled": false,
        "skip_file_discovery": "",
        "caption_strategy": "filename",
        "metadata_backend": "json"
    },
    {
        "id": "text-embeds",
        "type": "local",
        "dataset_type": "text_embeds",
        "default": true,
        "cache_dir": "cache/text/flux/pseudo-camera-10k",
        "disabled": false,
        "write_batch_size": 128
    }
]
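Before training, it is worth confirming the file parses cleanly, since a stray comma here will only surface as an error at startup. A small check, assuming the copy lives at config/multidatabackend.json relative to the SimpleTuner directory:

```shell
# Validate the dataloader config with Python's built-in JSON tool.
CFG=config/multidatabackend.json
if python3 -m json.tool "$CFG" >/dev/null 2>&1; then
  result="OK: $CFG parses as JSON"
else
  result="ERROR: $CFG is missing or not valid JSON"
fi
echo "$result"
```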

Now, while you are still in your output directory, run the following commands to download the dataset:

sudo apt -y install git-lfs
git lfs install
mkdir -p datasets
pushd datasets
    git clone https://huggingface.co/datasets/ptx0/pseudo-camera-10k
popd
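Because the dataloader above uses caption_strategy "filename", SimpleTuner derives each image's caption from its file name: roughly, the extension is stripped and underscores read as spaces (a simplification; see the SimpleTuner docs for the exact rules). A toy illustration with a hypothetical file name:

```shell
# Hypothetical file name -> caption under the "filename" strategy (a sketch,
# not SimpleTuner's exact implementation).
fname="a_photo_of_a_vintage_camera.png"
caption="${fname%.*}"                          # drop the extension
caption=$(printf '%s' "$caption" | tr '_' ' ') # underscores to spaces
echo "caption: $caption"
```

In practice this means the dataset's images should be named after the text you want the model to associate with them.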

Now go back to the main SimpleTuner directory and run this command:

bash train.sh
