Re: hyrax vs. RelationBuildPartitionDesc

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: hyrax vs. RelationBuildPartitionDesc
Date: 2019-04-17 09:58:46
Message-ID: 10a73786-37c3-c9a0-84de-de551064f739@lab.ntt.co.jp
Lists: pgsql-hackers

On 2019/04/15 4:29, Tom Lane wrote:
> I think that what we ought to do for v12 is have PartitionDirectory
> copy the data, and then in v13 work on creating real reference-count
> infrastructure that would allow eliminating the copy steps with full
> safety. The $64 question is whether that really would cause unacceptable
> performance problems. To look into that, I made the attached WIP patches.
> (These are functionally complete, but I didn't bother for instance with
> removing the hunk that 898e5e329 added to relcache.c, and the comments
> need work, etc.) The first one just changes the PartitionDirectory
> code to do that, and then the second one micro-optimizes
> partition_bounds_copy() to make it somewhat less expensive, mostly by
> collapsing lots of small palloc's into one big one.

Thanks for the patches. The one that micro-optimizes partition_bounds_copy()
looks good in any case.

> What I get for test cases like [1] is
>
> single-partition SELECT, hash partitioning:
>
>    N     tps, HEAD    tps, patch
>    2  11426.243754  11448.615193
>    8  11254.833267  11374.278861
>   32  11288.329114  11371.942425
>  128  11222.329256  11185.845258
>  512  11001.177137  10572.917288
> 1024  10612.456470   9834.172965
> 4096   8819.110195   7021.864625
> 8192   7372.611355   5276.130161
>
> single-partition SELECT, range partitioning:
>
>    N     tps, HEAD    tps, patch
>    2  11037.855338  11153.595860
>    8  11085.218022  11019.132341
>   32  10994.348207  10935.719951
>  128  10884.417324  10532.685237
>  512  10635.583411   9578.108915
> 1024  10407.286414   8689.585136
> 4096   8361.463829   5139.084405
> 8192   7075.880701   3442.542768
>
> Now certainly these numbers suggest that avoiding the copy could be worth
> our trouble, but these results are still several orders of magnitude
> better than where we were two weeks ago [2]. Plus, this is an extreme
> case that's not really representative of real-world usage, since the test
> tables have neither indexes nor any data.

I tested the copyPartitionDesc() patch, and here are the results for
single-partition SELECT using hash partitioning, with an index on the
queried column and N * 1000 rows inserted into the parent table before the
test. I've confirmed that the plan is always an Index Scan on the selected
partition (in PG 11, it's under Append, but in HEAD there's no Append due to
8edd0e794).

   N    tps, HEAD   tps, patch   tps, PG 11
   2  3093.443043  3039.804101  2928.777570
   8  3024.545820  3064.333027  2372.738622
  32  3029.580531  3032.755266  1417.706212
 128  3019.359793  3032.726006   567.099745
 512  2948.639216  2986.987862    98.710664
1024  2971.629939  2882.233026    41.720955
4096  2680.703000  1937.988908     7.035816
8192  2599.120308  2069.271274     3.635512

So, the TPS degrades by 14% when going from 128 partitions to 8192
partitions on HEAD, whereas it degrades by 31% with the patch.

Here are the numbers with no indexes defined on the tables, and no data
inserted.

   N    tps, HEAD   tps, patch   tps, PG 11
   2  3498.247862  3463.695950  3110.314290
   8  3524.430780  3445.165206  2741.340770
  32  3476.427781  3427.400879  1645.602269
 128  3427.121901  3430.385433   651.586373
 512  3394.907072  3335.842183   182.201349
1024  3454.050819  3274.266762    67.942075
4096  3201.266380  2845.974556    12.320716
8192  2955.850804  2413.723443     6.151703

Here, the TPS degrades by 13% when going from 128 partitions to 8192
partitions on HEAD, whereas it degrades by 29% with the patch.
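
For anyone who wants to re-derive the degradation percentages, here is a
minimal Python sketch. It assumes "degradation" means the relative drop in
TPS between the N = 128 and N = 8192 rows; the function name and the labels
are only illustrative, and the TPS values are copied from the two tables
above.

```python
def degradation_pct(tps_from, tps_to):
    """Relative drop in TPS, in percent, going from tps_from to tps_to."""
    return (tps_from - tps_to) / tps_from * 100.0


# (tps at N = 128, tps at N = 8192), copied from the tables above.
# The labels are illustrative names, not anything from the patches.
results = {
    "index + data, HEAD": (3019.359793, 2599.120308),
    "index + data, patch": (3032.726006, 2069.271274),
    "no index, no data, HEAD": (3427.121901, 2955.850804),
    "no index, no data, patch": (3430.385433, 2413.723443),
}

for label, (tps_128, tps_8192) in results.items():
    print(f"{label}: {degradation_pct(tps_128, tps_8192):.1f}%")
```

The printed values should land within about a percentage point of the
figures quoted in the text, depending on how one rounds.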

So, the degradation caused by copying the bounds is almost the same in both
cases. Even in the more realistic test with indexes and data, planning
takes up a growing share of the total time relative to executing the plan
as the partition count grows, because the PartitionBoundInfo that the
planner now copies grows bigger.

Thanks,
Amit
