RE: speeding up planning with partitions

From: "Kato, Sho" <kato-sho(at)jp(dot)fujitsu(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: speeding up planning with partitions
Date: 2018-09-04 05:45:24
Message-ID: 25C1C6B2E7BE044889E4FE8643A58BA963AF7962@G01JPEXMBKW03
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Amit. Thank you for your reply.

> I didn't load any data into tables when running the tests, because these patches are meant for reducing planner latency. More specifically, they're addressed to fix the current planner behavior that its latency increases with increasing number of partitions, with focus on the common case where only a single partition will need to be scanned by a given query.

Thank you for telling me details. It is very helpful.

> No need to try to measure the INSERT latency, because its planning latency is not affected by the number of partitions. Moreover, I'd rather suggest you take out the INSERT statement from the benchmark for now, because its execution time does vary unfavorably with increasing number of partitions.

Thank you for your advice.

> In fact, just measure non-prepared tps and latency, because we're only interested in planning time here.

I got it.

> Hmm, since 0001 is meant to improve update planning time, it seems odd that you'd get poorer results compared to base source. But, it seems base source is actually master + the patch to improve the execution time of update, so maybe that patch is playing a part here, although I'm not sure why even that's making this much of a difference.
> I suggest that you use un-patched master as base source, that is, leave out any patches to improve execution time.

> By the way, as you may have noticed, I posted a version 2 of the patches on this thread. If you apply them, you will be be able to see almost same TPS for any number of partitions with master + 0001 + 0002 + 0003.

> I guess if you had applied the latest version of "Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch" (which is v8 posted at [1]) on top of master + 0001 + 0002 + 0003, you'd get better performance too.

Thank you. I will try out these cases.

Thanks,

--
Sho Kato
-----Original Message-----
From: Amit Langote [mailto:Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp]
Sent: Monday, September 3, 2018 2:12 PM
To: Kato, Sho/加藤 翔 <kato-sho(at)jp(dot)fujitsu(dot)com>; Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speeding up planning with partitions

Thank you Kato-san for testing.

On 2018/08/31 19:48, Kato, Sho wrote:
> Hi, Amit
>
> Great!
> With the total number of records being 6400, I benchmarked while increasing the number of partitions from 100 to 6400.
> Applying three all patches, 20% performance improved for 100 partitions.
>
> I have the same number of records for each partition, do you do the same?

I didn't load any data into tables when running the tests, because these patches are meant for reducing planner latency. More specifically, they're addressed to fix the current planner behavior that its latency increases with increasing number of partitions, with focus on the common case where only a single partition will need to be scanned by a given query.

I'd try to avoid using a benchmark whose results is affected by anything other than the planning latency. It will be a good idea if you leave the tables empty and just vary the number of partitions and nothing else.

Also, we're interested in planning latency, so using just SELECT and UPDATE in your benchmark script will be enough, because their planning time is affected by the number of partitions. No need to try to measure the INSERT latency, because its planning latency is not affected by the number of partitions. Moreover, I'd rather suggest you take out the INSERT statement from the benchmark for now, because its execution time does vary unfavorably with increasing number of partitions. Sure, there are other patches to address that, but it's better not to mix the patches to avoid confusion.

> Also, in my case, performance was better when not prepare.

Patches in this thread do nothing for the execution, so, there is no need to compare prepared vs non-prepared. In fact, just measure non-prepared tps and latency, because we're only interested in planning time here.

> I think these patches do not improve execute case, so we need faster runtime pruning patch[1], right?

We already have run-time pruning code (that is the code in the patch you
linked) committed into the tree in PG 11, so the master tree also has it.
But since we're not interested in execution time, no need to worry about run-time pruning.

> Details of measurement conditions and results are as follows.
> - base source
> master(@777e6ddf17) + Speeding up Insert v8 patch[1]
>
> - table definition(e.g. 100 partition)
> create table test.accounts(aid serial, abalance int)
> partition by range(aid);
> create table test.accounts_history(id serial, aid int, delta int,
> mtime timestamp without time zone)
> partition by range(aid);
>
> - command option
> pgbench -d testdb -f benchmark.pgbench -T 180 -r -n -M prepare
> pgbench -d testdb -f benchmark.pgbench -T 180 -r -n
>
> -results
> base source no prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+------------+----------------+----------------+----------------
> 100 | 662.414805 | 0.357 | 0.265 | 0.421
> 200 | 494.478431 | 0.439 | 0.349 | 0.579
> 400 | 307.982089 | 0.651 | 0.558 | 0.927
> 800 | 191.360676 | 0.979 | 0.876 | 1.548
> 1600 | 75.344947 | 2.253 | 2.003 | 4.301
> 3200 | 30.643902 | 5.716 | 4.955 | 10.118
> 6400 | 16.726056 | 12.512 | 8.582 | 18.054
>
> 0001 no prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+------------+----------------+----------------+----------------
> 100 | 429.816329 | 0.811 | 0.75 | 0.365
> 200 | 275.211531 | 1.333 | 1.248 | 0.501
> 400 | 160.499833 | 2.384 | 2.252 | 0.754
> 800 | 79.387776 | 4.935 | 4.698 | 1.468
> 1600 | 24.787377 | 16.593 | 15.954 | 4.302
> 3200 | 9.846421 | 42.96 | 42.139 | 8.848
> 6400 | 4.919772 | 87.43 | 83.109 | 16.56

Hmm, since 0001 is meant to improve update planning time, it seems odd that you'd get poorer results compared to base source. But, it seems base source is actually master + the patch to improve the execution time of update, so maybe that patch is playing a part here, although I'm not sure why even that's making this much of a difference.

I suggest that you use un-patched master as base source, that is, leave out any patches to improve execution time.

[ ... ]

> 0001 + 0002 no prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+------------+----------------+----------------+----------------
> 100 | 682.53091 | 0.388 | 0.35 | 0.35
> 200 | 469.906601 | 0.543 | 0.496 | 0.51
> 400 | 321.915349 | 0.78 | 0.721 | 0.752
> 800 | 201.620975 | 1.246 | 1.156 | 1.236
> 1600 | 94.438204 | 2.612 | 2.335 | 2.745
> 3200 | 38.292922 | 6.657 | 5.579 | 6.808
> 6400 | 21.48462 | 11.989 | 10.104 | 12.601
>

[ ... ]

>
> 0001 + 0002 + 0003 no prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+------------+----------------+----------------+----------------
> 100 | 798.962029 | 0.304 | 0.267 | 0.339
> 200 | 577.893396 | 0.384 | 0.346 | 0.487
> 400 | 426.542177 | 0.472 | 0.435 | 0.717
> 800 | 288.616213 | 0.63 | 0.591 | 1.162
> 1600 | 154.247034 | 1.056 | 0.987 | 2.384
> 3200 | 59.711446 | 2.416 | 2.233 | 6.514
> 6400 | 37.109761 | 3.387 | 3.099 | 11.762
>

[ ... ]

By the way, as you may have noticed, I posted a version 2 of the patches on this thread. If you apply them, you will be be able to see almost same TPS for any number of partitions with master + 0001 + 0002 + 0003.

> Although it may not be related to this, when measured with pg11 beta2, somehow the performance was better.
>
> 11beta2 + v1-0001-Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch[3] prepared
> part_num | tps_ex | update_latency | select_latency | insert_latency
> ----------+-------------+----------------+----------------+----------------
> 100 | 756.07228 | 0.942 | 0.091 | 0.123

I guess if you had applied the latest version of "Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch" (which is v8 posted at [1]) on top of master + 0001 + 0002 + 0003, you'd get better performance too. But, as mentioned above, we're interested in measuring planning latency, not execution latency, so we should leave out any patches that are meant toward improving execution latency to avoid confusion.

Thanks again.

Regards,
Amit

[1]
https://www.postgresql.org/message-id/CAKJS1f9T_32Xpb-p8cWwo5ezSfVhXviUW8QTWncP8ksPHDRK8g%40mail.gmail.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-09-04 06:19:59 Re: pointless check in RelationBuildPartitionDesc
Previous Message Dilip Kumar 2018-09-04 05:19:17 Re: pg_verify_checksums failure with hash indexes