Re: Delay locking partitions during query execution

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Delay locking partitions during query execution
Date: 2019-01-04 00:53:18
Message-ID: CAKJS1f_iVpfwYwKj3QrJxHBomDifY4_b6-yo89wMy=n9D6skyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 4 Jan 2019 at 13:01, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> On 1/3/19 11:57 PM, David Rowley wrote:
> > You'll know you're getting a generic plan when you see "Filter (a =
> > $1)" and see "Subplans Removed: 9999" below the Append.
> >
>
> Indeed, with prepared statements I now see some improvements:
>
> partitions 0 100 1000 10000
> --------------------------------------------
> master 19 1590 2090 128
> patched 18 1780 6820 1130
>
> So, that's nice. I wonder why the throughput drops so fast between 1k
> and 10k partitions, but I'll look into that later.

Those look strange. Why is it so slow with the non-partitioned case?
I'd have expected that to be the fastest result.

> Does this mean this optimization can only ever work with prepared
> statements, or can it be made to work with regular plans too?

That's a good question. I confirm it's only any use of run-time
pruning occurs during init plan. For this patch to be any use, you'll
need to see a "Subplans Removed: <N>" in the query's EXPLAIN ANALYZE
output. If you don't see this then all Append/MergeAppend subplans
were initialised and the relation lock would have been obtained
regardless of if delaylock is set for the relation. The effect of the
patch here would just have been to obtain the lock during the first
call to ExecGetRangeTableRelation() for that relation instead of
during AcquireExecutorLocks(). There may actually be a tiny overhead
in this case since AcquireExecutorLocks() must skip the delaylock
relations, but they'll get locked later anyway. I doubt you could
measure that though.

When run-time pruning is able to prune partitions before execution
starts then the optimisation is useful since AcquireExecutorLocks()
won't obtain the lock and ExecGetRangeTableRelation() won't be called
for all pruned partition's rels as we don't bother to init the
Append/MergeAppend subplan for those.

I'm a little unsure if there are any cases where this type of run-time
pruning can occur when PREPAREd statements are not in use. Initplan
parameters can't prune before executor run since we need to run the
executor to obtain the values of those. Likewise for evaluation of
volatile functions. So I think run-time pruning before initplan is
only ever going to happen for PARAM_EXTERN type parameters, i.e. with
PREPAREd statements (REF: analyze_partkey_exprs() partprune.c).
Without PREPAREd statements, if the planner itself was unable to prune
the partitions it would already have obtained the lock during
planning, so AcquireExecutorLocks(), in this case, would bump into the
local lock hash table entry and forego trying to obtain the lock
itself. That's not free, but it's significantly faster than obtaining
a lock.

Or in short... it only good for prepared statements where the
statement's parameters allow for run-time pruning. However, that's a
pretty large case since the planner is still very slow at planning for
large numbers of partitions, meaning it's common (or at least it will
be) for people to use PREPAREd statement and plan_cache_mode =
force_generic_plan;

> >> Furthermore, I've repeatedly ran into this issue:
> >>
> >> test=# \d hashp
> >> ERROR: unrecognized token: "false"
> >> LINE 2: ...catalog.array_to_string(array(select rolname from pg_catalog...
> >> ^
> >> I have no idea why it breaks like this, and it's somewhat random (i.e.
> >> not readily reproducible). But I've only ever seen it with this patch
> >> applied.
> >
> > You'll probably need to initdb with the patch applied as there's a new
> > field in RangeTblEntry. If there's a serialised one of these stored in
> > the in the catalogue somewhere then the new read function will have
> > issues reading the old serialised format.
> >
>
> D'oh! That explains it, because switching from/to patched binaries might
> have easily been triggering the error. I've checked that there are no
> changes to catalogs, but it did not occur to me adding a new RTE field
> could have such consequences ...

schema-wise, no changes, but data-wise, there are changes.

$ pg_dump --schema=pg_catalog --data-only postgres | grep ":rellockmode" | wc -l
121

All of which are inside the pg_rewrite table:

$ pg_dump --schema=pg_catalog --data-only --table=pg_rewrite postgres
| grep ":rellockmode" | wc -l
121

I just used ":rellockmode" here as it's a field that exists in RangeTblEntry.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mithun Cy 2019-01-04 02:53:00 Re: WIP: Avoid creation of the free space map for small tables
Previous Message Tom Lane 2019-01-04 00:49:17 Re: inconsistency and inefficiency in setup_conversion()