Refactoring dalle-pytorch and taming-transformers for TPU VM with PyTorch Lightning
Install the dependencies:
pip install -r requirements.txt
Place any image dataset in an ImageNet-style directory structure (at least one class subfolder) so that it can be loaded with PyTorch's ImageFolder.
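As a quick sanity check, the snippet below (a minimal sketch; the path, resolution, and transforms are hypothetical and not part of this repo) verifies that a dataset in this layout loads with torchvision's ImageFolder:

# Minimal sketch: confirm the dataset follows the ImageFolder layout
# (one subfolder per class). Path and transform settings here are hypothetical.
from torchvision import transforms
from torchvision.datasets import ImageFolder

transform = transforms.Compose([
    transforms.Resize(256),       # assumed resolution; match your training config
    transforms.CenterCrop(256),
    transforms.ToTensor(),
])

dataset = ImageFolder("path/to/train_dir", transform=transform)  # hypothetical path
print(len(dataset), "images across", len(dataset.classes), "classes")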
You can easily test train_vae.py with randomly generated fake data:
python train_vae.py --use_tpus --fake_data
For actual training, provide specific directories for --train_dir, --val_dir, and --log_dir:
python train_vae.py --use_tpus --train_dir [training_set] --val_dir [val_set] --log_dir [where to save results]
Once the VAE is trained, train DALL-E on top of it by passing the checkpoint via --vae_path:
python train_dalle.py --use_tpus --train_dir [training_set] --val_dir [val_set] --log_dir [where to save results] --vae_path [pretrained vae] --bpe_path [pretrained bpe (optional)]
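As a concrete illustration (all paths and the checkpoint filename below are hypothetical; the actual checkpoint name depends on how the run is configured), a full two-stage run might look like:

python train_vae.py --use_tpus --train_dir /data/images/train --val_dir /data/images/val --log_dir ./results/vae
python train_dalle.py --use_tpus --train_dir /data/images/train --val_dir /data/images/val --log_dir ./results/dalle --vae_path ./results/vae/last.ckpt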
Citations

@misc{oord2018neural,
    title={Neural Discrete Representation Learning},
    author={Aaron van den Oord and Oriol Vinyals and Koray Kavukcuoglu},
    year={2018},
    eprint={1711.00937},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

@misc{razavi2019generating,
    title={Generating Diverse High-Fidelity Images with VQ-VAE-2},
    author={Ali Razavi and Aaron van den Oord and Oriol Vinyals},
    year={2019},
    eprint={1906.00446},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

@misc{esser2020taming,
    title={Taming Transformers for High-Resolution Image Synthesis},
    author={Patrick Esser and Robin Rombach and Björn Ommer},
    year={2020},
    eprint={2012.09841},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

@misc{ramesh2021zeroshot,
    title={Zero-Shot Text-to-Image Generation},
    author={Aditya Ramesh and Mikhail Pavlov and Gabriel Goh and Scott Gray and Chelsea Voss and Alec Radford and Mark Chen and Ilya Sutskever},
    year={2021},
    eprint={2102.12092},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}