[CVPR'24 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
Hao Ouyang*, Qiuyu Wang*, Yuxi Xiao*, Qingyan Bai, Juntao Zhang, Kecheng Zheng, Xiaowei Zhou,
Qifeng Chen†, Yujun Shen† (*equal contribution, †corresponding author)
CVPR 2024 Highlight
The codebase is tested on
To use the video visualizer, please install ffmpeg via
sudo apt-get install ffmpeg
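Before running anything that relies on the video visualizer, it can help to confirm that ffmpeg is actually reachable. The snippet below is a minimal sketch using only the Python standard library; the function name `ffmpeg_available` is hypothetical and not part of this codebase.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on PATH."""
    # shutil.which returns the full path of the executable, or None.
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found; install it before using the video visualizer")
```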
For additional Python libraries, please install with
pip install -r requirements.txt
Our code also depends on tiny-cuda-nn. See this repository for PyTorch extension install instructions.
We have provided some videos here for a quick test. Please download and unzip the data, then put it in the root directory. More videos can be downloaded here.
We segment video sequences using SAM-Track. Once you obtain the mask files, place them in the folder all_sequences/{YOUR_SEQUENCE_NAME}/{YOUR_SEQUENCE_NAME}_masks. Next, execute the following command:
cd data_preprocessing
python preproc_mask.py
We extract optical flows of video sequences using RAFT. To get started, please follow the instructions provided here to download their pretrained model. Once downloaded, place the model in the data_preprocessing/RAFT/models folder. After that, you can execute the following command:
cd data_preprocessing/RAFT
./run_raft.sh
Remember to update the sequence name and root directory in both data_preprocessing/preproc_mask.py and data_preprocessing/RAFT/run_raft.sh accordingly.
After obtaining the files, please organize your own data as follows:
CoDeF
│
└─── all_sequences
    │
    └─── NAME1
           └─ NAME1
           └─ NAME1_masks_0 (optional)
           └─ NAME1_masks_1 (optional)
           └─ NAME1_flow (optional)
           └─ NAME1_flow_confidence (optional)
    │
    └─── NAME2
           └─ NAME2
           └─ NAME2_masks_0 (optional)
           └─ NAME2_masks_1 (optional)
           └─ NAME2_flow (optional)
           └─ NAME2_flow_confidence (optional)
    │
    └─── ...
You can download checkpoints pre-trained on the provided videos from the links below:
Sequence Name | Config | Download | OpenXLab |
---|---|---|---|
beauty_0 | configs/beauty_0/base.yaml | Google drive link | |
beauty_1 | configs/beauty_1/base.yaml | Google drive link | |
white_smoke | configs/white_smoke/base.yaml | Google drive link | |
lemon_hit | configs/lemon_hit/base.yaml | Google drive link | |
scene_0 | configs/scene_0/base.yaml | Google drive link | |
Then organize the files as follows:
CoDeF
│
└─── ckpts/all_sequences
    │
    └─── NAME1
    │   │
    │   └─── EXP_NAME (base)
    │       │
    │       └─── NAME1.ckpt
    │
    └─── NAME2
    │   │
    │   └─── EXP_NAME (base)
    │       │
    │       └─── NAME2.ckpt
    │
    └─── ...
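The checkpoint layout above maps a sequence name and experiment name to a single file path. As an illustration only, the hypothetical helper below (not part of this codebase) builds that path so a download can be verified before testing.

```python
from pathlib import Path


def checkpoint_path(root: str, name: str, exp_name: str = "base") -> Path:
    """Build the checkpoint path implied by the layout above.

    `root` is the CoDeF root directory; `exp_name` defaults to "base",
    matching the provided configs.
    """
    return Path(root) / "ckpts" / "all_sequences" / name / exp_name / f"{name}.ckpt"


if __name__ == "__main__":
    path = checkpoint_path(".", "beauty_0")
    print(path, "exists" if path.is_file() else "is missing")
```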
To train a model on a video sequence, run
./scripts/train_multi.sh
where
GPU: Decide which GPU to train on;
NAME: Name of the video sequence;
EXP_NAME: Name of the experiment;
ROOT_DIRECTORY: Directory of the input video sequence;
MODEL_SAVE_PATH: Path to save the checkpoints;
LOG_SAVE_PATH: Path to save the logs;
MASK_DIRECTORY: Directory of the preprocessed masks (optional);
FLOW_DIRECTORY: Directory of the preprocessed optical flows (optional).
Please check the configuration files in configs/; you can always add your own model config.
To test a model, run
./scripts/test_multi.sh
After running the script, the reconstructed videos can be found in results/all_sequences/{NAME}/{EXP_NAME}, along with the canonical image.
After obtaining the canonical image from the step above, use your preferred text prompts to transfer it using ControlNet. Once you have the transferred canonical image, place it in all_sequences/${NAME}/${EXP_NAME}_control (i.e. CANONICAL_DIR in scripts/test_canonical.sh).
Then run
./scripts/test_canonical.sh
The transferred results can be seen in results/all_sequences/{NAME}/{EXP_NAME}_transformed.
Note: The canonical_wh option in the configuration file should be set with caution, usually a little larger than img_wh, as it determines the field of view of the canonical image.
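As a rough illustration of "a little larger", one could derive a starting value for canonical_wh by enlarging img_wh by a fixed margin. The 25% margin and the helper name `suggest_canonical_wh` below are assumptions made for this sketch, not values recommended by the authors; the right setting depends on how much the content moves in the video.

```python
def suggest_canonical_wh(img_wh, margin=0.25):
    """Return a [width, height] pair enlarged by `margin` on both axes.

    `margin` is a hypothetical default chosen for illustration only.
    """
    width, height = img_wh
    return [round(width * (1 + margin)), round(height * (1 + margin))]


if __name__ == "__main__":
    # A 960x540 input gives a canonical size of 1200x675 with the default margin.
    print(suggest_canonical_wh((960, 540)))
```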
@article{ouyang2023codef,
title={CoDeF: Content Deformation Fields for Temporally Consistent Video Processing},
author={Hao Ouyang and Qiuyu Wang and Yuxi Xiao and Qingyan Bai and Juntao Zhang and Kecheng Zheng and Xiaowei Zhou and Qifeng Chen and Yujun Shen},
journal={arXiv preprint arXiv:2308.07926},
year={2023}
}
We thank camenduru for providing the colab demo.