Preventing deadlock on parallel backup

From: Lucas <lucas75(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Preventing deadlock on parallel backup
Date: 2016-09-08 18:41:11
Message-ID: CAEWGB6_tbkq6UpmHBjZu6rTEC+p0OLzCuWn-SeOv_ENizadx9Q@mail.gmail.com
Lists: pgsql-hackers

People,

I made a small modification in pg_dump to prevent parallel backup failures
due to exclusive lock requests made by other tasks.

The modification makes each parallel backup worker take shared locks at
the very beginning of the job. That way, any other job that attempts to
acquire exclusive locks will wait for the backup to finish.

In my case, each server was taking a day to complete the backup; now, with
parallel backup, one takes 3 hours and the others less than an hour.

The code below is not very elegant, but it works for me. My wishlist for
the backup is:

1) replace the plpgsql with C code that reads the backup TOC and assembles
the lock commands;
2) add a timeout to the locks;
3) broadcast the end of the copy to every worker in order to release the
locks as early as possible;
4) create a monitor thread that prioritizes a copy job based on an
exclusive lock acquired;
5) grant the lock to other connections of the same distributed transaction
if it is already held by any connection of that transaction. Is there some
side effect of that which I am not seeing?

1 to 4 are within my capabilities and I may do them in the future. 5 is too
advanced for me, and I do not dare to mess with something so fundamental
right now.

Is anyone else working on this?

In parallel.c, in void RunWorker(...), add:

    PQExpBuffer query;
    PGresult   *res;

    query = createPQExpBuffer();

    /*
     * Take an ACCESS SHARE lock on every user table.  Use
     * appendPQExpBufferStr rather than appendPQExpBuffer: the latter is
     * printf-style and would treat the '%' in RAISE as format directives.
     */
    appendPQExpBufferStr(query,
        "do language 'plpgsql' $$"
        " declare "
        " x record;"
        " begin"
        " for x in select * from pg_tables where schemaname not in"
        " ('pg_catalog','information_schema') loop"
        " raise info 'lock table %.%', x.schemaname, x.tablename;"
        " execute 'LOCK TABLE '"
        "||quote_ident(x.schemaname)||'.'||quote_ident(x.tablename)||"
        "' IN ACCESS SHARE MODE NOWAIT';"
        " end loop;"
        "end"
        "$$");

    res = PQexec(AH->connection, query->data);

    if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
        exit_horribly(modulename,
                      "Could not lock the tables to begin the work\n\n");
    PQclear(res);
    destroyPQExpBuffer(query);
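
As a rough illustration of wishlist item 1, the lock commands could be
assembled in C instead of plpgsql. The sketch below is not pg_dump code: the
names build_lock_command and append_quoted_ident are hypothetical, it uses
only the C standard library, and it always quotes identifiers (quote_ident
quotes only when needed). A real patch would presumably walk the TOC entries
and reuse pg_dump's own quoting helper, fmtId.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: append src to dst wrapped in double quotes,
 * doubling any embedded double quote (SQL identifier quoting).
 * Returns a pointer just past the last character written.
 */
static char *
append_quoted_ident(char *dst, const char *src)
{
    *dst++ = '"';
    for (; *src; src++)
    {
        if (*src == '"')
            *dst++ = '"';
        *dst++ = *src;
    }
    *dst++ = '"';
    return dst;
}

/*
 * Hypothetical helper: build
 *   LOCK TABLE "schema"."table" IN ACCESS SHARE MODE NOWAIT;
 * Returns a malloc'd string the caller must free, or NULL if out of memory.
 */
char *
build_lock_command(const char *schema, const char *table)
{
    static const char prefix[] = "LOCK TABLE ";
    static const char suffix[] = " IN ACCESS SHARE MODE NOWAIT;";
    size_t      len;
    char       *buf;
    char       *p;

    assert(schema != NULL && table != NULL);

    /* Worst case: every character is a quote and gets doubled. */
    len = strlen(prefix)
        + 2 * strlen(schema) + 2    /* quoted schema */
        + 1                         /* the dot */
        + 2 * strlen(table) + 2     /* quoted table */
        + strlen(suffix) + 1;       /* suffix and terminating NUL */

    buf = malloc(len);
    if (buf == NULL)
        return NULL;

    p = buf;
    memcpy(p, prefix, strlen(prefix));
    p += strlen(prefix);
    p = append_quoted_ident(p, schema);
    *p++ = '.';
    p = append_quoted_ident(p, table);
    memcpy(p, suffix, strlen(suffix) + 1);  /* copies the NUL too */

    return buf;
}
```

Each worker would call this once per table it is going to dump and send the
result with PQexec, so that only the tables in the backup are locked rather
than every table listed in pg_tables.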
