Avoid choose invalid number of partitions (src/backend/executor/nodeAgg.c)

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Avoid choose invalid number of partitions (src/backend/executor/nodeAgg.c)
Date: 2021-06-29 14:32:44
Message-ID: CAEudQAovPCy677Mi=bn0yctggNr=XaUtQWdbLwxqyLWTKq12aQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Not per Coverity.

hash_choose_num_partitions function has issues.
There are at least two path calls made with used_bits = 0.
See at hashagg_spill_init.

To confirm, I run this code on cpp.sh:
int main()
{
int npartitions = 0;
int used_bits = 0;
int partition_bits = 0;
int i;

for(i = 0; i <= 32; i++) {
/* make sure that we don't exhaust the hash bits */
if (partition_bits + used_bits >= 32)
partition_bits = 32 - used_bits;
npartitions = 1L << partition_bits;
printf("used_bits=%d\n", used_bits);
printf("partition_bits=%d\n", partition_bits);
printf("npartitions=%d\n\n", npartitions);
partition_bits++;
}
}

Whose output would be:
used_bits=0
partition_bits=0
npartitions=1

used_bits=0
partition_bits=1
npartitions=2

used_bits=0
partition_bits=2
npartitions=4

used_bits=0
partition_bits=3
npartitions=8

used_bits=0
partition_bits=4
npartitions=16

used_bits=0
partition_bits=5
npartitions=32

used_bits=0
partition_bits=6
npartitions=64

used_bits=0
partition_bits=7
npartitions=128

used_bits=0
partition_bits=8
npartitions=256

used_bits=0
partition_bits=9
npartitions=512

used_bits=0
partition_bits=10
npartitions=1024

used_bits=0
partition_bits=11
npartitions=2048

used_bits=0
partition_bits=12
npartitions=4096

used_bits=0
partition_bits=13
npartitions=8192

used_bits=0
partition_bits=14
npartitions=16384

used_bits=0
partition_bits=15
npartitions=32768

used_bits=0
partition_bits=16
npartitions=65536

used_bits=0
partition_bits=17
npartitions=131072

used_bits=0
partition_bits=18
npartitions=262144

used_bits=0
partition_bits=19
npartitions=524288

used_bits=0
partition_bits=20
npartitions=1048576

used_bits=0
partition_bits=21
npartitions=2097152

used_bits=0
partition_bits=22
npartitions=4194304

used_bits=0
partition_bits=23
npartitions=8388608

used_bits=0
partition_bits=24
npartitions=16777216

used_bits=0
partition_bits=25
npartitions=33554432

used_bits=0
partition_bits=26
npartitions=67108864

used_bits=0
partition_bits=27
npartitions=134217728

used_bits=0
partition_bits=28
npartitions=268435456

used_bits=0
partition_bits=29
npartitions=536870912

used_bits=0
partition_bits=30
npartitions=1073741824

used_bits=0
partition_bits=31
npartitions=-2147483648

used_bits=0
partition_bits=32
npartitions=0

With partition_bits > 24, is very problematic, but with 31 and 32, it
becomes a bug.
With npartitions = -2147483648 and 0, the function hashagg_spill_init,
will generate an operation that is undefined according to the rules of C.
spill->mask = (npartitions - 1) << spill->shift;

On Windows 64 bits (HEAD) fails with partition_prune:
parallel group (11 tests): reloptions hash_part partition_info explain
compression resultcache indexing partition_join partition_aggregate
partition_prune tuplesort
partition_join ... ok 3495 ms
partition_prune ... FAILED 4926 ms

diff -w -U3
C:/dll/postgres/postgres_head/src/test/regress/expected/partition_prune.out
C:/dll/postgres/postgres_head/src/test/regress/results/partition_prune.out
---
C:/dll/postgres/postgres_head/src/test/regress/expected/partition_prune.out
2021-06-23 11:11:26.489575100 -0300
+++
C:/dll/postgres/postgres_head/src/test/regress/results/partition_prune.out
2021-06-29 10:54:43.103775700 -0300
@@ -2660,7 +2660,7 @@
--------------------------------------------------------------------------
Nested Loop (actual rows=3 loops=1)
-> Seq Scan on tbl1 (actual rows=5 loops=1)
- -> Append (actual rows=1 loops=5)
+ -> Append (actual rows=0 loops=5)
-> Index Scan using tprt1_idx on tprt_1 (never executed)
Index Cond: (col1 = tbl1.col1)
-> Index Scan using tprt2_idx on tprt_2 (actual rows=1 loops=2)

With patch attached:
parallel group (11 tests): partition_info hash_part resultcache reloptions
explain compression indexing partition_aggregate partition_join tuplesort
partition_prune
partition_join ... ok 3013 ms
partition_prune ... ok 3959 ms

regards,
Ranier Vilela

Attachment Content-Type Size
avoid_choose_invalid_number_of_partitions.patch application/octet-stream 1.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexey Kondratov 2021-06-29 14:34:49 Re: Supply restore_command to pg_rewind via CLI argument
Previous Message Boris Kolpackov 2021-06-29 14:14:46 Re: Pipeline mode and PQpipelineSync()