Re: Removing "long int"-related limit on hash table sizes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Removing "long int"-related limit on hash table sizes
Date: 2021-07-25 16:28:04
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2021-07-23 17:15:24 -0400, Tom Lane wrote:
>> That's because they spill to disk where they did not before. The easy
>> answer of "raise hash_mem_multiplier" doesn't help, because on Windows
>> the product of work_mem and hash_mem_multiplier is clamped to 2GB,
>> thanks to the ancient decision to do a lot of memory-space-related
>> calculations in "long int", which is only 32 bits on Win64.

> We really ought to just remove every single use of long.

I have no objection to that as a long-term goal. But I'm not volunteering
to do all the work, and in any case it wouldn't be a back-patchable fix.
I feel that we do need to do something about this performance regression
in v13.

> Hm. I wonder if we would avoid some overflow dangers on 32bit systems if
> we made get_hash_memory_limit() and the relevant variables 64 bit,
> rather than 32bit / size_t. E.g.

No, I don't like that. Using size_t for memory-size variables is good
discipline. Moreover, I'm not convinced that even with 64-bit ints,
overflow would be impossible in all the places I fixed here. They're
multiplying several potentially very large values (one of which
is a float). I think this is just plain sloppy coding, independently
of which bit-width you choose to be sloppy in.

>> + skew_mcvs = skew_mcvs * SKEW_HASH_MEM_PERCENT / 100;

> I always have to think about the evaluation order of things like this
> (it's left to right for these), so I'd prefer parens around the
> multiplication. I did wonder briefly whether the SKEW_HASH_MEM_PERCENT /
> 100 just evaluates to 0...

OK, will do. I see your point, because I'd sort of instinctively
wanted to write that as
skew_mcvs *= SKEW_HASH_MEM_PERCENT / 100;
which of course would not work.

Thanks for looking at the code.

regards, tom lane

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2021-07-25 16:36:37 Re: Planning counters in pg_stat_statements (using pgss_store)
Previous Message Yura Sokolov 2021-07-25 16:07:18 Re: [PoC] Improve dead tuple storage for lazy vacuum