RE: speeding up planning with partitions

From: "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>
To: 'David Rowley' <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
Subject: RE: speeding up planning with partitions
Date: 2019-03-15 01:38:22
Message-ID: 0F97FA9ABBDBE54F91744A9B37151A5129E65D@g01jpexmbkw24
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, David

On Thu, Mar 14, 2019 at 9:04 AM, David Rowley wrote:
> On Thu, 14 Mar 2019 at 21:35, Imai, Yoshikazu
> <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com> wrote:
> > 0007:
> > * This changes some processes using "for loop" to using
> "while(bms_next_member())" which speeds up processing when we scan few
> partitions in one statement, but when we scan a lot of partitions in one
> statement, its performance will likely degraded. I measured the
> performance of both cases.
> > I executed select statement to the table which has 4096 partitions.
> >
> > [scanning 1 partition]
> > Without 0007 : 3,450 TPS
> > With 0007 : 3,723 TPS
> >
> > [scanning 4096 partitions]
> > Without 0007 : 10.8 TPS
> > With 0007 : 10.5 TPS
> >
> > In the above result, performance degrades 3% in case of scanning 4096
> partitions compared before and after applying 0007 patch. I think when
> scanning a lot of tables, executor time would be also longer, so the
> increasement of planner time would be relatively smaller than it. So we
> might not have to care this performance degradation.
>
> I think it's better to focus on the fewer partitions case due to the fact
> that execution initialisation time and actual execution are likely to
> take much longer when more partitions are scanned. I did some work on
> run-time pruning to tune it for this case. Tom did make a similar argument
> in [1] and I explained my reasoning in [2].

Thanks for quoting these threads.
Actually, I recalled this argument, so I tested this just to make sure.

> bms_next_member has gotten a good performance boost since then and the
> cases are not exactly the same since the old version the loop in run-time
> pruning checked bms_is_member(), but the fact is, we did end up tuning
> for the few partitions case in the end.

Wow, I didn't know that.

> However, it would be good to see the performance results for
> plan+execution time of say a table with 4k parts looking up a single
> indexed value. You could have two columns, one that's the partition key
> which allows the pruning to take place, and one that's not and results
> in scanning all partitions. I'll be surprised if you even notice the
> difference between with and without 0007 with the latter case.

So I tested for checking the performance for plan+execution time.

[set up table and indexes]
create table rt (a int, b int) partition by range (a);
\o /dev/null
select 'create table rt' || x::text || ' partition of rt for values from (' ||
(x)::text || ') to (' || (x+1)::text || ');' from generate_series(1, 4096) x;
\gexec
\o

create index b_idx on rt (b);

insert into rt select a, b from generate_series(1, 4096) a, generate_series(1, 1000) b;

[select_indexed_values.sql]
\set b random(1, 1000)
select count(*) from rt where b = :b;

[pgbench]
pgbench -n -f select_indexed_values.sql -T 60 postgres

[results]
Without 0007: 3.18 TPS (3.25, 3.13, 3.15)
With 0007: 3.21 TPS (3.21, 3.23, 3.18)

From the results, we didn't see the performance degradation in this case. Actually, the performance increased 1% before and after applying 0007, but it would be just an measurement error.
So, generally, we can think the performance difference of bms_next_member and for loop can be absorbed by other processing(execution initialisation and actual execution) when scanning many partitions.

--
Yoshikazu Imai

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tsunakawa, Takayuki 2019-03-15 02:08:34 RE: Timeout parameters
Previous Message Kyotaro HORIGUCHI 2019-03-15 00:51:57 Re: Index Skip Scan