Dreambooth on Stable Diffusion

This is an implementation of Google's Dreambooth with Stable Diffusion. The original Dreambooth is based on the Imagen text-to-image model; however, neither the model nor the pre-trained weights of Imagen are available. To enable people to fine-tune a text-to-image model with a few examples, I implemented the idea of Dreambooth on Stable Diffusion.

This code repository is based on that of Textual Inversion. Note that Textual Inversion only optimizes the word embedding, while Dreambooth fine-tunes the whole diffusion model. The implementation makes minimal changes over the official codebase of Textual Inversion. In fact, due to laziness, some components of Textual Inversion, such as the embedding manager, are not deleted, although they will never be used here.

Update: I just found a way to reduce GPU memory usage a bit. Remember that this code is based on Textual Inversion, and TI's codebase has a line that disables gradient checkpointing in a hard-coded way. This is because in TI the UNet is not optimized. In Dreambooth, however, we do optimize the UNet, so we can turn the gradient checkpointing trick back on, as in the original Stable Diffusion repo. Gradient checkpointing now defaults to True in the config.
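For reference, in LDM-style configs this switch is the use_checkpoint flag on the UNet. The excerpt below is only a sketch of how that part of configs/stable-diffusion/v1-finetune_unfrozen.yaml plausibly looks; the surrounding keys are assumptions based on the standard Stable Diffusion v1 config layout, not something this post spells out.

    # Assumed layout, following the standard latent-diffusion v1 configs.
    model:
      params:
        unet_config:
          target: ldm.modules.diffusionmodules.openaimodel.UNetModel
          params:
            use_checkpoint: True  # re-enabled here, since Dreambooth trains the UNet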

Usage

Preparation

First set up the ldm environment following the instructions from the Textual Inversion repo, or from the original Stable Diffusion repo.

To fine-tune a Stable Diffusion model, you need to obtain the pre-trained Stable Diffusion weights following their instructions. The weights can be downloaded from HuggingFace. You can decide which version of the checkpoint to use, but I use sd-v1-4-full-ema.ckpt.

We also need to create a set of images for regularization, as the fine-tuning algorithm of Dreambooth requires them; details of the algorithm can be found in the paper. Note that in the original paper the regularization images seem to be generated on the fly, whereas here I generate a set of regularization images before training. The text prompt for generating regularization images can be "photo of a <class>", where <class> is a word that describes the class of your object, such as dog. One way to pre-generate them is sketched below.
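Here is a sketch of how such regularization images could be pre-generated with the sampling script from this repo. Only the script name appears in this post; every flag below (sample count, guidance scale, DDIM steps, checkpoint path) is an assumption, so adjust it to your setup.

    # Sketch: pre-generate regularization images from the base checkpoint.
    # All flags are assumptions; only the script itself is named in the post.
    python scripts/stable_txt2img.py \
        --n_samples 8 \
        --n_iter 1 \
        --scale 10.0 \
        --ddim_steps 50 \
        --ckpt /path/to/original/stable-diffusion/sd-v1-4-full-ema.ckpt \
        --prompt "photo of a dog"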

Training

Training is launched through main.py:

    python main.py --base configs/stable-diffusion/v1-finetune_unfrozen.yaml \
        --actual_resume /path/to/original/stable-diffusion/sd-v1-4-full-ema.ckpt \
        --reg_data_root /root/to/regularization/images

Detailed configuration can be found in configs/stable-diffusion/v1-finetune_unfrozen.yaml. In particular, the default learning rate is 1.0e-6, as I found the 1.0e-5 used in the Dreambooth paper leads to poor editability. The parameter reg_weight corresponds to the weight of regularization in the Dreambooth paper, and the default is set to 1.0.

Dreambooth requires a placeholder word, called the identifier, as in the paper. This identifier needs to be a relatively rare token in the vocabulary. The original paper approaches this by using a rare word from the T5-XXL tokenizer. For simplicity, here I just use a random word, sks, and hard-code it. If you want to change that, simply make a change in the corresponding file.

Training will run for 800 steps, and two checkpoints will be saved under logs/<job name>/checkpoints, one at 500 steps and one at the final step. Typically the one at 500 steps works well enough. I train the model using two A6000 GPUs and it takes ~15 minutes.
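The flags above are the ones preserved in this copy of the post; a complete run presumably also needs to point at the training images and the class word. The sketch below fills those in, but -t, -n, --gpus, --data_root and --class_word are assumptions about the repo's command-line interface rather than flags confirmed by this post.

    # Sketch of a fuller invocation; the last five flags are assumptions.
    python main.py --base configs/stable-diffusion/v1-finetune_unfrozen.yaml \
        --actual_resume /path/to/original/stable-diffusion/sd-v1-4-full-ema.ckpt \
        --reg_data_root /root/to/regularization/images \
        -t \
        -n my_dreambooth_run \
        --gpus 0,1 \
        --data_root /root/to/training/images \
        --class_word dog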

Generation

After training, personalized samples can be obtained by running the command

    python scripts/stable_txt2img.py --ddim_eta 0.0

with arguments pointing at the trained checkpoint and a prompt containing the identifier; a fuller sketch follows below.
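A sketch of the full sampling call, under the same caveat: only --ddim_eta 0.0 appears in this post, and the remaining flags, the checkpoint path, and the prompt format are assumptions (the sks identifier and the class word follow the training section above).

    # Sketch: sample from the fine-tuned checkpoint.
    # Everything except --ddim_eta is an assumption about the interface.
    python scripts/stable_txt2img.py --ddim_eta 0.0 \
        --n_samples 8 \
        --n_iter 1 \
        --scale 10.0 \
        --ddim_steps 100 \
        --ckpt logs/my_dreambooth_run/checkpoints/last.ckpt \
        --prompt "photo of a sks dog"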
