Delay locking partitions during query execution

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Delay locking partitions during query execution
Date: 2018-12-03 11:42:31
Message-ID: CAKJS1f_kfRQ3ZpjQyHC7=PK9vrhxiHBQFZ+hc0JCwwnRKkF3hg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Over on [1] I'm proposing to delay locking partitions of a partitioned
table that's the target of an INSERT or UPDATE command until we first
route a tuple to the partition. Currently, we go and lock all
partitions, even if we just insert a single tuple to a single
partition. The patch in [1] improves the situation when there are many
partitions and only a few tuples to route to just to a few partitions.

Over here and along similar lines to the above, but this time I'd like
to take this even further and change things so we don't lock *any*
partitions during AcquireExecutorLocks() and instead just lock them
when we first access them with ExecGetRangeTableRelation(). This
improves the situation when many partitions get run-time pruned, as
we'll never bother locking those at all since we'll never call
ExecGetRangeTableRelation() on them. We'll of course still lock the
partitioned table so that plan invalidation works correctly.

This does make the locking order less well defined, but I'm already
proposing similar in [1] and over there I've mentioned that I can't
quite see any huge issues with doing that. We already don't lock all
partitions inside AcquireExecutorLocks() during INSERT with VALUES
anyway.

The attached patch implements this delayed locking. A quick benchmark
shows a pretty good performance improvement when there are a large
number of partitions and run-time pruning prunes almost all of them.

Setup:
create table hashp (a int) partition by hash(a);
select 'create table hashp'|| x::text || ' partition of hashp for
values with (modulus 10000, remainder ' || x ::text || ');' from
generate_Series(0,9999) x;
\gexec
insert into hashp select generate_Series(1,1000000)

bench.sql:
\set p_a 13315
select * from hashp where a = :p_a;

Master: 10000 parts

$ pgbench -n -f bench.sql -M prepared -T 60 postgres
tps = 108.882749 (excluding connections establishing)
tps = 108.245437 (excluding connections establishing)

delaylock: 10000 parts

$ pgbench -n -f bench.sql -M prepared -T 60 postgres
tps = 1068.289505 (excluding connections establishing)
tps = 1092.797015 (excluding connections establishing)

More could be done to make this quite a bit faster again, but that
mostly requires the range table coming directly from the planner as an
array and marking which array elements require locking with a
Bitmapset. This'll save having to loop over the entire large array
that mostly does not need anything locked. Similar can be done for
ExecCheckRTPerms(), but that's also for another day. With those
changes and some further tweaks done on the Append/MergeAppend code,
tps is about 22k on my machine, which is just slightly slower than
with an equivalent non-partitioned table.

I'll add the attached patch to the January commitfest

[1] https://www.postgresql.org/message-id/CAKJS1f-%3DFnMqmQP6qitkD%2BxEddxw22ySLP-0xFk3JAqUX2yfMw%40mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
v1-0001-Allow-lock-acquisitions-for-partitions-to-be-dela.patch application/octet-stream 12.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ildus Kurbangaliev 2018-12-03 12:43:32 Re: [HACKERS] Custom compression methods
Previous Message Dave Cramer 2018-12-03 11:38:43 Re: Reviving the "Stopping logical replication protocol" patch from Vladimir Gordichuk