Easy CLI tool for making zero downtime schema changes and backfills in PostgreSQL
pg-online-schema-change (pg-osc
) is a tool for making schema changes (any ALTER
statements) in Postgres tables with minimal locks, thus helping achieve zero downtime schema changes against production workloads.
pg-osc
uses the concept of shadow table to perform schema changes. At a high level, it creates a shadow table that looks structurally the same as the primary table, performs the schema change on the shadow table, copies contents from the primary table to the shadow table and swaps the table names in the end while preserving all changes to the primary table using triggers (via audit table).
pg-osc
is inspired by the design and workings of tools like pg_repack
and pt-online-schema-change
(MySQL). Read more below on how does it work, prominent features, the caveats and examples
Add this line to your application’s Gemfile:
gem "pg_online_schema_change"
And then execute:
$ bundle install
Or install it yourself as:
$ gem install pg_online_schema_change
This will include all dependencies accordingly as well. Make sure the following requirements are satisfied.
Or via Docker:
docker pull shayonj/pg-osc:latest
https://hub.docker.com/r/shayonj/pg-osc
TRIGGER
and/or a SUPERUSER
pg-online-schema-change help perform
Usage:
pg-online-schema-change perform -a, --alter-statement=ALTER_STATEMENT -d, --dbname=DBNAME -h, --host=HOST -p, --port=N -s, --schema=SCHEMA -u, --username=USERNAME
Options:
-a, --alter-statement=ALTER_STATEMENT # The ALTER statement to perform the schema change
-s, --schema=SCHEMA # The schema in which the table is
# Default: public
-d, --dbname=DBNAME # Name of the database
-h, --host=HOST # Server host where the Database is located
-u, --username=USERNAME # Username for the Database
-p, --port=N # Port for the Database
# Default: 5432
-w, [--password=PASSWORD] # DEPRECATED: Password for the Database. Please pass PGPASSWORD environment variable instead.
-v, [--verbose], [--no-verbose] # Emit logs in debug mode
-f, [--drop], [--no-drop] # Drop the original table in the end after the swap
-k, [--kill-backends], [--no-kill-backends] # Kill other competing queries/backends when trying to acquire lock for the shadow table creation and swap. It will wait for --wait-time-for-lock duration before killing backends and try upto 3 times.
-w, [--wait-time-for-lock=N] # Time to wait before killing backends to acquire lock and/or retrying upto 3 times. It will kill backends if --kill-backends is true, otherwise try upto 3 times and exit if it cannot acquire a lock.
# Default: 10
-c, [--copy-statement=COPY_STATEMENT] # Takes a .sql file location where you can provide a custom query to be played (ex: backfills) when pgosc copies data from the primary to the shadow table. More examples in README.
-b, [--pull-batch-count=N] # Number of rows to be replayed on each iteration after copy. This can be tuned for faster catch up and swap. Best used with delta-count.
# Default: 1000
-e, [--delta-count=N] # Indicates how many rows should be remaining before a swap should be performed. This can be tuned for faster catch up and swap, especially on highly volume tables. Best used with pull-batch-count.
# Default: 20
-o, [--skip-foreign-key-validation], [--no-skip-foreign-key-validation] # Skip foreign key validation after swap. You shouldn't need this unless you have a very specific use case, like manually validating foreign key constraints after swap.
Usage:
pg-online-schema-change --version, -v
print the version
pg-osc
supports when a column is being added, dropped or renamed with no data loss.pg-osc
acquires minimal locks throughout the process (read more below on the caveats).More about the preliminary load test figures here
export PGPASSWORD=""
pg-online-schema-change perform \
--alter-statement 'ALTER TABLE books RENAME COLUMN email TO new_email' \
--dbname "postgres" \
--host "localhost" \
--username "jamesbond" \
export PGPASSWORD=""
pg-online-schema-change perform \
--alter-statement 'ALTER TABLE books ADD COLUMN "purchased" BOOLEAN DEFAULT FALSE; ALTER TABLE books RENAME COLUMN email TO new_email;' \
--dbname "postgres" \
--host "localhost" \
--username "jamesbond" \
--drop
If the operation is being performed on a busy table, you can use pg-osc
’s kill-backend
functionality to kill other backends that may be competing with the pg-osc
operation to acquire a lock for a brief while. The ACCESS EXCLUSIVE
lock acquired by pg-osc
is only held for a brief while and released after. You can tune how long pg-osc
should wait before killing other backends (or if at all pg-osc
should kill backends in the first place).
export PGPASSWORD=""
pg-online-schema-change perform \
--alter-statement 'ALTER TABLE books ADD COLUMN "purchased" BOOLEAN DEFAULT FALSE;' \
--dbname "postgres" \
--host "localhost" \
--username "jamesbond" \
--wait-time-for-lock 5 \
--kill-backends \
--drop
If you have a table with high write volume, the default replay iteration may not suffice. That is - you may see that pg-osc
is replaying 1000 rows (pull-batch-count
) in one go from the audit table. pg-osc
also waits until the remaining row count (delta-count
) in audit table is 20 before making the swap. You can tune these values to be higher for faster catch up on these kind of workloads.
export PGPASSWORD=""
pg-online-schema-change perform \
--alter-statement 'ALTER TABLE books ADD COLUMN "purchased" BOOLEAN DEFAULT FALSE;' \
--dbname "postgres" \
--host "localhost" \
--username "jamesbond" \
--pull-batch-count 2000
--delta-count 500
--wait-time-for-lock 5 \
--kill-backends \
--drop
When inserting data into the shadow table, instead of just copying all columns and rows from the primary table, you can pass in a custom sql file to perform the copy and do any additional work. For instance - backfilling certain columns. By providing the copy-statement
, pg-osc
will instead play the query to perform the copy operation.
IMPORTANT NOTES:
ALTER
statement can change the table’s structure, so proceed with caution.%{shadow_table}
as that will be replaced with the destination of the shadow table.-- file: /src/query.sql
INSERT INTO %{shadow_table}(foo, bar, baz, rental_id, tenant_id)
SELECT a.foo,a.bar,a.baz,a.rental_id,r.tenant_id AS tenant_id
FROM ONLY examples a
LEFT OUTER JOIN rentals r
ON a.rental_id = r.id
pg-online-schema-change perform \
--alter-statement 'ALTER TABLE books ADD COLUMN "tenant_id" VARCHAR;' \
--dbname "postgres" \
--host "localhost" \
--username "jamesbond" \
--copy-statement "/src/query.sql" \
--drop
docker run --network host -it --rm shayonj/pg-osc:latest \
pg-online-schema-change perform \
--alter-statement 'ALTER TABLE books ADD COLUMN "purchased" BOOLEAN DEFAULT FALSE; ALTER TABLE books RENAME COLUMN email TO new_email;' \
--dbname "postgres" \
--host "localhost" \
--username "jamesbond" \
--drop
pg-osc
will raise an exception
pg-osc
will acquire ACCESS EXCLUSIVE
lock on the parent table twice during the operation.
kill-backends
is passed, it will attempt to terminate any competing operations during both times.pg-osc
doesn’t kill any other DDLs being performed. It’s best to not run any DDLs against the parent table during the operation.NOT VALID
. A follow on VALIDATE CONSTRAINT
is run.
NOT VALID
.ACCESS EXCLUSIVE
lock to add a trigger on the parent table (for inserts, updates, deletes) to the audit table.ACCESS EXCLUSIVE
lock against the parent table within a transaction and:
NOT VALID
.ANALYZE
on the new table.NOT VALID
.\curl -sSL https://get.rvm.io | bash
rvm install 3.3.0
rvm use 3.3.0
docker compose up
bundle exec rspec
to run the tests.bin/console
for an interactive prompt that will allow you to experiment.To install this gem onto your local machine, run bundle exec rake install
.
docker compose up
pgbench --initialize -s 10 --foreign-keys --host localhost -U jamesbond -d postgres
pgbench -T 60000 -c 5 --host localhost -U jamesbond -d postgres
bundle exec bin/pg-online-schema-change perform -a 'ALTER TABLE pgbench_accounts ALTER COLUMN aid TYPE BIGINT' -d "postgres" -h "localhost" -u "jamesbond" -w "password"
version.rb
./scripts/release.sh 0.2.0
CHANGELOG.md
Bug reports and pull requests are welcome on GitHub at https://github.com/shayonj/pg-osc.