A Modular Library for Off-Policy Reinforcement Learning with a focus on SafeRL and distributed computing
The code supports the SafetyGymnasium environment set to give a starting point for developing SafeRL solutions. The distributed setting is implemented via the pika library and will be improved in the near future.
Environments Support
| DMControl Suite | SafetyGymnasium | Gymnasium |
| --- | --- | --- |
Algorithms
| DDPG | TD3 | SAC | TQC |
| --- | --- | --- | --- |
The project supports uv for package management and ruff for formatting checks. To install it with uv in a virtualenv:
uv venv
source .venv/bin/activate
uv sync
To work with SafetyGymnasium, install it manually:
git clone https://github.com/PKU-Alignment/safety-gymnasium
cd safety-gymnasium && uv pip install -e .
To run tests locally:
uv pip install pytest
uv run pytest tests/functional
All training is configured via Python config files located in the configs folder. To make your own configuration, edit the code there or create a similar file. During training, all the code is copied to the logs folder to ensure full experimental reproducibility.
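As a rough illustration of the config-as-code idea, the sketch below shows what a minimal config script might look like. Only the `--env` flag and the `walker-walk` value come from the commands in this README; the `--seed` flag, the hyperparameter names and values, and the helpers `parse_args`, `make_config`, and `snapshot_code` are hypothetical and are not taken from the library.

```python
import argparse
import shutil
from pathlib import Path
from typing import Any, Dict, List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse command-line flags; --env mirrors the README, --seed is assumed."""
    parser = argparse.ArgumentParser(description="Hypothetical DDPG config")
    parser.add_argument("--env", type=str, default="walker-walk")
    parser.add_argument("--seed", type=int, default=0)  # assumed flag
    return parser.parse_args(argv)


def make_config(args: argparse.Namespace) -> Dict[str, Any]:
    """Collect run settings in one place (values are illustrative only)."""
    return {
        "env": args.env,
        "seed": args.seed,
        "gamma": 0.99,
        "batch_size": 128,
    }


def snapshot_code(src_dir: Path, log_dir: Path) -> Path:
    """Copy the source tree into the run's log folder for reproducibility."""
    dst = log_dir / "code"
    shutil.copytree(
        src_dir, dst, ignore=shutil.ignore_patterns("__pycache__", "*.pyc")
    )
    return dst
```

Keeping the configuration in plain Python means a new experiment is just a copy of an existing file with edited values, and the snapshot step stores the exact code next to the results it produced.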
To run DDPG in a single process:
python configs/ddpg.py --env walker-walk
To run distributed DDPG, first start RabbitMQ:
docker run -it --rm --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3.12-management
Then run training:
python configs/distrib_ddpg.py --env walker-walk
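To give an idea of how transitions could travel between processes through RabbitMQ, here is a minimal sketch using pika's blocking API. It is not the library's actual implementation: the queue name, the pickle-based encoding, and all function names are assumptions made for illustration. The port matches the one exposed by the docker command above.

```python
import pickle
from typing import Any, Dict, Optional

QUEUE_NAME = "transitions"  # hypothetical queue name, not taken from the library


def encode_transition(transition: Dict[str, Any]) -> bytes:
    """Serialize one transition for transport through the broker."""
    return pickle.dumps(transition)


def decode_transition(body: bytes) -> Dict[str, Any]:
    """Inverse of encode_transition."""
    return pickle.loads(body)


def open_channel(host: str = "localhost", port: int = 5672):
    """Connect to RabbitMQ and declare the queue used by both sides."""
    import pika  # imported lazily so the helpers above work without pika

    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host=host, port=port)
    )
    channel = connection.channel()
    channel.queue_declare(queue=QUEUE_NAME)
    return connection, channel


def publish_transition(channel, transition: Dict[str, Any]) -> None:
    """Actor side: push one transition to the default exchange."""
    channel.basic_publish(
        exchange="", routing_key=QUEUE_NAME, body=encode_transition(transition)
    )


def poll_transition(channel) -> Optional[Dict[str, Any]]:
    """Learner side: fetch one transition, or None if the queue is empty."""
    method, _properties, body = channel.basic_get(queue=QUEUE_NAME, auto_ack=True)
    if method is None:
        return None
    return decode_transition(body)
```

In this sketch an actor process would call `open_channel()` once and then `publish_transition` after every environment step, while the learner would call `poll_transition` in a loop to fill its replay buffer. Pickle is used only for brevity and should not be used with untrusted peers.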
Results for single process DDPG and TQC:
OPRL
@inproceedings{kuznetsov2024safer,
title={Safer Reinforcement Learning by Going Off-policy: a Benchmark},
author={Igor Kuznetsov},
booktitle={ICML 2024 Next Generation of AI Safety Workshop},
year={2024},
url={https://openreview.net/forum?id=pAmTC9EdGq}
}
SafetyGymnasium
@inproceedings{ji2023safety,
title={Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark},
author={Jiaming Ji and Borong Zhang and Jiayi Zhou and Xuehai Pan and Weidong Huang and Ruiyang Sun and Yiran Geng and Yifan Zhong and Josef Dai and Yaodong Yang},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year={2023},
url={https://openreview.net/forum?id=WZmlxIuIGR}
}