Re: speeding up planning with partitions

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Imai Yoshikazu <yoshikazu_i443(at)live(dot)jp>, "jesper(dot)pedersen(at)redhat(dot)com" <jesper(dot)pedersen(at)redhat(dot)com>, "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>, "'Amit Langote'" <amitlangote09(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speeding up planning with partitions
Date: 2019-03-22 02:17:18
Message-ID: d04c7027-2df8-bbd5-9a8f-9806e3ab3270@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jesper, Imai-san,

Thanks for testing and reporting your findings.

On 2019/03/21 23:10, Imai Yoshikazu wrote:
> On 2019/03/20 23:25, Jesper Pedersen wrote:> Hi,
> > My tests - using hash partitions - shows that the extra time is spent in
> > make_partition_pruneinfo() for the relid_subplan_map variable.
> >
> > 64 partitions: make_partition_pruneinfo() 2.52%
> > 8192 partitions: make_partition_pruneinfo() 5.43%
> >
> > TPS goes down ~8% between the two. The test setup is similar to the
> above.
> >
> > Given that Tom is planning to change the List implementation [1] in 13 I
> > think using the palloc0 structure is ok for 12 before trying other
> > implementation options.

Hmm, relid_subplan_map's size should be constant (number of partitions
scanned) even as the actual partition count grows.

However, looking into make_partitionedrel_pruneinfo(), it seems that it's
unconditionally allocating 3 arrays that all have nparts elements:

subplan_map = (int *) palloc0(nparts * sizeof(int));
subpart_map = (int *) palloc0(nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(int));

So, that part has got to cost more as the partition count grows.

This is the code for runtime pruning, which is not exercised in our tests,
so it might seem odd that we're spending any time here at all. I've been
told that we have to perform at least *some* work here if only to conclude
that runtime pruning is not needed and it seems that above allocations
occur before making that conclusion. Maybe, that's salvageable by
rearranging this code a bit. David may be in a better position to help
with that.

> Thanks for testing.
> Yeah, after code looking, I think bottleneck seems be lurking in another
> place where this patch doesn't change. I also think the patch is ok as
> it is for 12, and this problem will be fixed in 13.

If this drop in performance can be attributed to the fact that having too
many partitions starts stressing other parts of the backend like stats
collector, lock manager, etc. as time passes, then I agree that we'll have
to tackle them later.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2019-03-22 02:38:16 Re: jsonpath
Previous Message Andrey Borodin 2019-03-22 02:15:59 Re: Special role for subscriptions