From: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: DSA overflow in hash join
Date: 2025-07-27 17:24:19
Message-ID: 44cf2071-495e-4133-a32a-5dad330d1d1d@garret.ru
Lists: pgsql-hackers
I am still trying to understand the cause of the DSA overflow in hash join.
In addition to the two suspicious places where the number of buckets is
doubled without a check for overflow (nodeHash.c:1668 and nodeHash.c:3290),
there is one more place where the number of batches is multiplied by
`EstimateParallelHashJoinBatch(hashtable)`, which is
sizeof(ParallelHashJoinBatch) + (sizeof(SharedTuplestore) +
sizeof(SharedTuplestoreParticipant) * participants) * 2
i.e. 480 bytes!
But when we calculate the maximal number of batches, we limit it by the
maximal number of pointers (8 bytes each):
max_pointers = hash_table_bytes / sizeof(HashJoinTuple);
max_pointers = Min(max_pointers, MaxAllocSize / sizeof(HashJoinTuple));
/* If max_pointers isn't a power of 2, must round it down to one */
max_pointers = pg_prevpower2_size_t(max_pointers);
/* Also ensure we avoid integer overflow in nbatch and nbuckets */
/* (this step is redundant given the current value of MaxAllocSize) */
max_pointers = Min(max_pointers, INT_MAX / 2 + 1);
dbuckets = ceil(ntuples / NTUP_PER_BUCKET);
dbuckets = Min(dbuckets, max_pointers);
nbuckets = (int) dbuckets;
But as we can see, the multiplier here is 480 bytes, not 8 bytes, so the
max_pointers cap does not bound the size of the parallel batch array.