Early locking option to parallel backup

From: Lucas <lucas75(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Early locking option to parallel backup
Date: 2017-11-05 10:17:47
Message-ID: CAEWGB6-KN-vLn__asyMfqoN84SWxb3P=Vzy7iK6LUd0F6wLHew@mail.gmail.com
Lists: pgsql-hackers

Hello people,

I am sending a patch to improve parallel backup on larger databases.

THE PROBLEM

pg_dump was taking more than 24 hours to complete on one of my databases, so
I began to research alternatives. Parallel backup reduced the backup time to
a little less than an hour, but it failed almost every time because of
concurrent queries that took exclusive locks. It is difficult to
guarantee that my applications will not issue statements such as DROP TABLE,
ALTER TABLE, TRUNCATE TABLE, CREATE INDEX or DROP INDEX for an hour, and I
would prefer not to build control mechanisms to that end if I can work
around it.

THE SOLUTION

The patch adds a "--lock-early" option that makes pg_dump take
shared locks on all tables in the backup TOC when each parallel worker starts.
That way, the backup has a very small chance of failing, and when it does
fail, it happens in the first few seconds of the backup job. My backup scripts
(not included here) are aware of that and retry the backup in case of failure.
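
To illustrate the idea, here is a minimal sketch in Python. It is not the code from the patch and not the author's backup script. It assumes that the "shared locks" are ACCESS SHARE locks requested with NOWAIT, so that a conflict makes the worker fail at once instead of waiting, and that a failed backup is reported through a non-zero exit status. The function names build_lock_statements and run_with_retry, and the retry parameters, are hypothetical.

```python
import subprocess
import time


def build_lock_statements(tables):
    """Return one LOCK TABLE statement per (schema, table) pair.

    Assumption: ACCESS SHARE mode with NOWAIT, so a conflicting lock makes
    the statement fail immediately instead of queueing behind it.
    """
    def quote(ident):
        # Standard SQL identifier quoting: double embedded quotes, then wrap.
        return '"' + ident.replace('"', '""') + '"'

    return [
        f"LOCK TABLE {quote(schema)}.{quote(table)} IN ACCESS SHARE MODE NOWAIT"
        for schema, table in tables
    ]


def run_with_retry(command, attempts=3, delay_seconds=5.0, runner=subprocess.call):
    """Run `command` until it exits with status 0 or the attempts run out.

    Returns the number of the attempt that succeeded. `runner` is injectable
    so the retry logic can be exercised without starting a real backup.
    """
    for attempt in range(1, attempts + 1):
        if runner(command) == 0:
            return attempt
        if attempt < attempts:
            # With early locking a failure is expected within seconds,
            # so a short pause before the next try is cheap.
            time.sleep(delay_seconds)
    raise RuntimeError(f"backup failed after {attempts} attempts")
```

A wrapper script could then call something like run_with_retry(["pg_dump", "--format=directory", "--jobs=8", "--lock-early", "--file=/backup/mydb", "mydb"]). Note that --lock-early exists only with this patch applied, and the path and database name are placeholders.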

TESTS

I have been using this technique in production for over a year now and it is
working well for me.

Lucas

Attachment                  Content-Type     Size
pg_dump-lock-early.patch    text/x-patch     4.1 KB
test.zip                    application/zip  868 bytes
