Distributed Computing for AI Made Simple
Join Fiber users email list [email protected]
This project is experimental and the APIs are not considered stable.
Fiber is a Python distributed computing library for modern computer clusters.
It was originally developed to power large-scale parallel scientific computation projects such as POET, and it has since been used to power similar projects within Uber.
pip install fiber
Check here for details.
To use Fiber, import it in your code. It works very similarly to multiprocessing.
import fiber
if __name__ == '__main__':
    fiber.Process(target=print, args=('Hello, Fiber!',)).start()
Note that the if __name__ == '__main__': guard is necessary because Fiber uses the spawn method to start new processes. Check here for details.
Let’s take a look at a more complex example:
import fiber
import random

@fiber.meta(cpu=1)
def inside(p):
    x, y = random.random(), random.random()
    return x * x + y * y < 1

def main():
    NUM_SAMPLES = int(1e6)
    pool = fiber.Pool(processes=4)
    count = sum(pool.map(inside, range(0, NUM_SAMPLES)))
    print("Pi is roughly {}".format(4.0 * count / NUM_SAMPLES))

if __name__ == '__main__':
    main()
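The arithmetic behind this example can be checked without a cluster. The following is a minimal single-process sketch that uses only the Python standard library; `estimate_pi` and its `seed` parameter are hypothetical names introduced here for illustration and are not part of Fiber.

```python
import random


def estimate_pi(num_samples, seed=None):
    """Estimate pi by sampling points in the unit square (single process)."""
    # Hypothetical helper: a private generator keeps runs reproducible.
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        # Count the point if it falls inside the quarter circle of radius 1.
        if x * x + y * y < 1:
            hits += 1
    # (quarter-circle area) / (unit-square area) = pi / 4
    return 4.0 * hits / num_samples


if __name__ == "__main__":
    print("Pi is roughly {}".format(estimate_pi(100_000, seed=0)))
```

If this version gives a plausible value, any discrepancy in the distributed run is more likely to come from the cluster setup than from the sampling logic.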
Fiber implements most of multiprocessing’s API, including Process, SimpleQueue, Pool, Pipe, and Manager, and it adds its own extensions to that API to make it easy to compose large-scale distributed applications. For the detailed API guide, check out here.
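One way to take advantage of the shared API is to write code against the Pool interface and choose the backend at the call site. The sketch below is an illustration under assumptions: `sum_of_squares` and `square` are hypothetical helpers, and the standard library's `ThreadPool` stands in as the backend so the example runs without Fiber installed. Passing `fiber.Pool` instead is assumed to behave the same way, based on the API compatibility described above, but that is not verified here.

```python
from multiprocessing.pool import ThreadPool


def square(x):
    """Top-level function so process-based pools can locate it by name."""
    return x * x


def sum_of_squares(pool_cls, values, workers=4):
    """Sum the squares of `values` using any Pool-like class.

    `pool_cls` is assumed to accept `processes=` and to provide
    `map`, `close`, and `join`, as multiprocessing's Pool does.
    """
    pool = pool_cls(processes=workers)
    try:
        return sum(pool.map(square, values))
    finally:
        # Release the workers even if map() raises.
        pool.close()
        pool.join()


if __name__ == "__main__":
    # Standard-library stand-in; swap in fiber.Pool on a configured cluster.
    print(sum_of_squares(ThreadPool, range(10)))  # prints 285
```

Keeping the worker function at module level matters for process-based pools, which need to import it by name in the worker.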
Fiber also has native support for computer clusters. To run the above example on Kubernetes, Fiber provides a convenient command-line tool to manage the workflow.
This assumes that you have a working Docker environment locally, that you have finished configuring the Google Cloud SDK, and that both gcloud and kubectl are available locally. Start by writing a Dockerfile that describes the running environment. An example Dockerfile looks like this:
# example.docker
FROM python:3.6-buster
ADD examples/pi_estimation.py /root/pi_estimation.py
RUN pip install fiber
Build an image and launch your job:
fiber run -a python3 /root/pi_estimation.py
This command looks for a local Dockerfile, builds a Docker image, and pushes it to your Google Container Registry. It then launches the main job, which contains your code, and runs the command python3 /root/pi_estimation.py inside that job. Once the main job is running, it starts 4 additional jobs on the cluster, each of which is a Pool worker.
We are interested in supporting other cluster management systems such as Slurm. If you want to contribute, please let us know.
Check here for details.
The documentation, including method/API references, can be found here.
Install test dependencies. You’ll also need to make sure docker is available on the testing machine.
$ pip install -e .[test]
Run tests
$ make test
Please read our code of conduct before you contribute! You can find details for submitting pull requests in the CONTRIBUTING.md file. Issue template.
We document versions and changes in our changelog - see the CHANGELOG.md file for details.
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
@misc{zhi2020fiber,
    title={Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods},
    author={Jiale Zhi and Rui Wang and Jeff Clune and Kenneth O. Stanley},
    year={2020},
    eprint={2003.11164},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}