A tool for programmatically verifying database backups
Schrödinger’s Backup: “The condition of any backup is unknown until a restore is attempted”
So you’re backing up your databases, but are you regularly checking that the backups are actually useable? Uphold will help you automatically test them by downloading the backup, decompressing, loading and then running programmatic tests against it that you define to make sure they really have what you need.
This project is very new and subsequently very beta so contributions and pulls are very much welcomed. We have a TODO file with things that we know about that would be awesome if worked on.
In order to make the processes are repeatable as possible all the code and databases are run inside single process Docker containers. There are currently three types of container, the ui
, the tester
and the database itself. Each triggers the next…
uphold-ui
\
-> uphold-tester
\
-> engine-container
This way each time the process is run, the containers are fresh and new, they hold no state. So each time the database is imported into a cold database.
The output of each process run is a log and a state file and these are stored in /var/log/uphold
by default. The UI reads these files to display the state of the runs occurring, no other state is stored in the system.
/var/log/uphold
/var/log/uphold/1453489253_my_db_backup.log
/var/log/uphold/1453489253_my_db_backup_ok
This is the output of a backup run for ‘my_db_backup’ that was started at 1453489253
unix epoch time. The log file contains the full output of the run, and the state file is an empty file, it’s name shows the status of the run…
ok
Backup was declared good, was transported, loaded and tested successfullyok_no_test
Backup was successfully transported and loaded into the DB, but there were no tests to runbad_transport
Transport failedbad_engine
Container did not open it’s port in a timely mannerbad_tests
At least one of the programmatic tests failedbad
An error occurred either in transport or loading into the db engineLogs are not automatically rotated or removed, it is left up to you to decide how long you want to keep them. Once they become compressed, they will disappear from the UI. The same goes for the exited Docker containers of ‘uphold-tester’, they are left on the system incase you wish to inspect them. The database containers however are wiped after they are used.
Most of the installation goes around configuring the tool, you must create the following directory structure on the machine you want to run Uphold on…
/etc/uphold/
/etc/uphold/conf.d/
/etc/uphold/engines/
/etc/uphold/transports/
/etc/uphold/tests/
/var/log/uphold
Create a global config in /etc/uphold/uphold.yml
(even if you leave it empty), the settings inside are…
log_level
(default: DEBUG
)
INFO
, but not recommendedconfig_path
(default: /etc/uphold
)
docker_log_path
(default: /var/log/uphold
)
docker_url
(default: unix:///var/run/docker.sock
)
tcp://example.com:5422
instead (untested)docker_container
(default: forward3d/uphold-tester
)
docker_tag
(default: latest
)
docker_mounts
(default: none
)
local
transport, the folders they exist in need to be mounted into the container. You can specify them here as a YAML array of directories. They will be mounted at the same location inside the containerui_datetime
(default: %F %T %Z
)
If you change the global config you will need to restart the UI docker container, as some settings are only read at launch time.
uphold.yml
Examplelog_level: DEBUG
config_path: /etc/uphold
docker_log_path: /var/log/uphold
docker_url: unix:///var/run/docker.sock
docker_container: forward3d/uphold-tester
docker_tag: latest
docker_mounts:
- /var/my_backups
- /var/my_other_backups
/etc/uphold/conf.d
ExampleEach config is in YAML format, and is constructed of a transport, an engine and tests. In your /etc/uphold/conf.d
directory simply create as many YAML files as you need, one per backup. Configs in this directory are re-read, so you don’t need to restart the UI container if you add new ones.
enabled: true
name: s3-mongo
engine:
type: mongodb
settings:
timeout: 10
database: your_db_name
transport:
type: s3
settings:
region: us-west-2
access_key_id: your-access-key-id
secret_access_key: your-secret-access-key
bucket: your-backups
path: mongodb/systemx/{date}
filename: mongodb.tar
date_format: '%Y.%m.%d'
date_offset: 0
folder_within: mongodb/databases/MongoDB
tests:
- test_structure.rb
- test_data_integrity.rb
enabled
true
or false
, allows you to disable a config if needs bename
See the sections below for how to configure Engines, Transports and Tests.
Transports are how you retrieve the backup file itself. They are also responsible for decompressing the file, the code supports nested compression (compressed files within compressed files). Currently implemented transports are…
Custom transports can also be loaded at runtime if they are placed in /etc/uphold/transports
. If you need extra rubygems installed you will need to create a new Dockerfile with the base set to uphold-tester
and then override the Gemfile and re-bundle. Then adjust your uphold.yml
to use your new container.
Transports all inherit these generic parameters…
path
{date}
, eg. /var/backups/2016-01-21
would be /var/backups/{date}
filename
{date}
, eg. mongodb-2016-01-21.tar
would be mongodb-{date}.tar
date_format
(default: %Y-%m-%d
)
date_offset
(default: 0
)
Date.today
and then subtracts this number, so for checking a backup that exists for yesterday, you would enter 1
folder_within
s3
)The S3 transport allows you to pull your backup files from a bucket in S3. It has it’s own extra settings…
region
us-west-2
)access_key_id
secret_access_key
Paths do not need to be complete with S3, as it provides globbing capability. So if you had a path like this…
my-service-backups/mongodb/2016.01.21.00.36.03/mongodb.tar
Theres no realistic way for us to re-create that date, so you would do this instead…
path: my-service-backups/mongodb/{date}
filename: mongodb.tar
date_format: '%Y.%m.%d'
As the path
is sent to the S3 API as a prefix, it will match all folders, the code then picks the first one it matches correctly. So be aware that not being specific enough with the date_format
could cause the wrong backup to be tested.
transport:
type: s3
settings:
region: us-west-2
access_key_id: your-access-key-id
secret_access_key: your-secret-access-key
bucket: your-backups
path: mongodb/systemx/{date}
filename: mongodb.tar
date_format: '%Y.%m.%d'
date_offset: 0
folder_within: mongodb/databases/MongoDB
local
)The local transport allows you to pull your backup files from the same machine that is running the Docker container. Be aware, since this code runs within a container you will need to add the volume that contains the backup when starting up. We auto-mount /var/uphold
to the same place within the container to reduce confusion.
It has no extra parameters and only uses the generic ones, filename
, path
and folder_within
transport:
type: local
settings:
path: /var/uphold/mongodb
filename: mongodb.tar
folder_within: mongodb/databases/MongoDB
Engines are used to load the backup that was retrieved by the transport into the database. Databases are started inside fresh docker containers each time so no installation is required. Currently supported databases are…
Custom engines can also be loaded at runtime if they are placed in /etc/uphold/engines
Engines all inherit these generic parameters, but are usually significantly easier to configure when compared to transports…
type
mongodb
)database
port
docker_image
docker_tag
timeout
(default: 10
)
mongodb
)Unless you need to change any of the defaults, a standard configuration for MongoDB will look quite small.
engine:
type: mongodb
settings:
database: your_db_name
mongodb
Engine Exampleengine:
type: mongodb
settings:
database: your_db_name
docker_image: mongo
docker_tag: 3.2.1
port: 27017
mysql
)engine:
type: mysql
settings:
database: your_database_name
sql_file: your_sql_file.sql
mysql
Engine Exampleengine:
type: mysql
settings:
database: your_database_name
docker_image: mariadb
docker_tag: 5.5.42
port: 3306
sql_file: MySQL.sql
postgresql
)engine:
type: postgresql
settings:
database: your_database_name
sql_file: your_sql_file.sql
postgresql
Engine ExampleThe database
also becomes your username for when you run the tests.
engine:
type: postgresql
settings:
database: your_database_name
docker_image: postgres
docker_tag: 9.5.0
port: 5432
sql_file: PostgreSQL.sql
Tests are the final step in configuration. They are how you validate that the data contained within your backup is really what you want, and that your backup is operating correctly. Tests are written in Ruby using Minitest, this gives you the most flexibility in writing tests programmatically as it supports both Unit & Spec tests. To configure a test you simply provide an array of ruby files you want to run…
tests:
- test_structure.rb
- test_data_integrity.rb
Tests should be placed within the /etc/uphold/tests
directory, all files inside will be volume mounted into the container so if you need extra files they are available to you.
We need to establish a connection to the database, and the values will not be known in advance. So they will be provided to you by environmental variables UPHOLD_IP
, UPHOLD_PORT
and UPHOLD_DB
. You must use these when connecting to your database.
require 'minitest/autorun'
require 'mongoid'
class TestClients < Minitest::Test
Mongo::Logger.logger.level = Logger::FATAL
@@mongo = Mongo::Client.new("mongodb://#{ENV['UPHOLD_IP']}:#{ENV['UPHOLD_PORT']}/#{ENV['UPHOLD_DB']}")
def test_that_we_can_talk_to_mongo
assert_equal 1, @@mongo.collections.count
end
end
Obviously this is just a simple test, but you can write any number of tests you like. All must pass in order for the backup to be considered ‘good’.
Once you have finished your configuration, to get the system running you only need to start the Docker container called ‘uphold-ui’.
docker pull forward3d/uphold-ui:latest
docker run \
--rm \
-p 8079:8079 \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /etc/uphold:/etc/uphold \
-v /var/log/uphold:/var/log/uphold \
forward3d/uphold-ui:latest
docker.sock
from wherever it resides.Once the container is live you can browse to it, see all previous available runs for a backup and the states the ended in. You can manually start a backup test from here if you want to.
No option to schedule backup runs exists at present. Until one exists you can use the API to trigger backup runs to start. This way you can schedule however you like, crontab, notifier or any other service capable of sending a POST.
GET /api/1.0/backups/config-name-here
This will return all the available backup runs for the config name provided in JSON format…
[
{
"epoch": 1453921377,
"state": "ok",
"filename": "1453921377_s3-mongo.log"
},
{
"epoch": 1453909916,
"state": "ok",
"filename": "1453909916_s3-mongo.log"
}
]
GET /api/1.0/backups/config-name-here/latest
This will return a plain text string of the state of the last backup run for the config name provided. If no runs were available, it will return none
POST /api/1.0/backup
You must pass the name of the config you want to trigger in a form field called name
. It will then start the run and return 200
. An example of how to trigger a backup run for the config named s3-mongo
…
curl --data "name=s3-mongo" http://ip.of.your.container/api/1.0/backup
To aid with development there is a helper script in the root directory called build_and_run
and build_and_inspect
which will build or inspect the Dockerfile and then run it using some default options. Since otherwise testing is a bit of a nightmare when trying to talk to containers on your local machine. Various folders from within the project directory will be auto-mounted into the container…
dev/uphold.yml
-> /etc/uphold/uphold.yml
dev/conf.d
-> /etc/uphold/conf.d
dev/tests
-> /etc/uphold/tests
dev/custom/engines
-> /etc/uphold/engines
dev/custom/transports
-> /etc/uphold/transports
dev/blobs
-> /var/backups
Remember to place a uphold.yml
config of your own in the dev/config
directory.