Re: Bad COMPACT_ALLOC_CHUNK size in tsearch/spell.c?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bad COMPACT_ALLOC_CHUNK size in tsearch/spell.c?
Date: 2011-04-26 17:41:04
Message-ID: 9785.1303839664@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Tue, Apr 26, 2011 at 11:46 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> After a bit longer, the reasoning came back to me ...

> right -- spell.c was deliberately trying to influence allocator
> behavior. Should it really do that though without direct
> participation of some form? (like, exposing the allocation chunk size
> macro?) It seems a bit dodgy as it stands...

Yeah, it's at the very least inadequately documented, because after
further poking around I remembered that there's a reason why the comment
says allocChunkLimit and not ALLOC_CHUNK_LIMIT. Namely, that the
dictionary context that this stuff is being built in uses
ALLOCSET_SMALL_MAXSIZE which is only 8K, causing the active value of
allocChunkLimit to be less than that, which means the code actually *is*
operating as designed. The chunks it gets will be separate malloc
blocks.

This means the relevant space-wastage danger is not the one I explained
earlier, but something else entirely. There isn't a risk from wasting
leftover space in large malloc requests, because the small maxBlockSize
specified for dictionary contexts prevents that. But suppose for
example that spell.c were using 4K instead of 8K here. Because of
per-chunk overhead, those requests would consume just over half of each
malloc block made by aset.c, and it would never enlarge them because of
the small maxBlockSize. So 4K requests would result in wasting almost
half the space in each malloc block. Using a larger value guards
against that behavior.

So: code does what it's supposed to, but the comment sucks, and there's
definitely a lot of action-at-a-distance here considering that the
memory context creation with the use of ALLOCSET_SMALL_MAXSIZE is in
still a third file.

We could certainly stand to improve the comment. I'm not sure whether
there's any good way to expose these considerations more directly in
the code. Any ideas?

A different angle of attack on the issue is that aset.c's use of
exact-power-of-2 sizes for both malloc requests and the available space
in chunks is inefficient when maxBlockSize is constrained to be not much
larger than common chunk request sizes. Maybe we should try to fix that
instead of asking other code to choose request sizes to dodge the
inefficiency.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2011-04-26 17:41:27 Re: Prefered Types
Previous Message Sim Zacks 2011-04-26 17:15:50 Re: Proposal - asynchronous functions