Re: Pre-alloc ListCell's optimization

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pre-alloc ListCell's optimization
Date: 2011-05-26 17:14:03
Message-ID: 25997.1306430043@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> Basically, I added a ListCell array into the List structure and then
> added a bitmap to keep track of which positions in the array were
> filled.

Hm. I've gotten the impression from previous testing that there are an
awful lot of extremely short lists (1 or 2 elements) running around in a
typical query. (One source for those is the argument lists for
operators and functions.) I'm worried that this type of approach would
bloat the storage required in those cases to a degree that would make
the patch unattractive. ISTM the first thing we'd need to have before
we could think about this rationally is some measurements about the
frequencies of different List lengths in a typical workload.

When Neil redid the List infrastructure a few years ago, there was some
discussion of special-casing the very first ListCell, and allocating
just that cell along with the List header. That'd be sort of the
minimal version of what you've done here, and would be guaranteed to
never eat any wasted space (since a list that has a header isn't empty).
We should probably compare the behavior of that minimalistic version to
versions with different sizes of preallocated arrays.

> An alternative approach that I was already considering would be to
> just allocate ListCell's in bulk (kind of a poor-man's slab allocator, I
> believe). To do that we'd have to make the bitmap be a variable length
> array of bitmaps and then have a list of pointers to the ListCell block
> allocations. Seems like that's probably overkill for this, however.

That would be pointing in the direction of trying to save space for very
long Lists, which is a use-case that I'm not sure occurs often enough
for us to be worth spending effort on, and in any case is a distinct
issue from that of saving palloc time for very short Lists. Again, some
statistics about actual list lengths would be really nice to have ...

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2011-05-26 17:28:11 Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Previous Message Robert Haas 2011-05-26 17:13:40 Re: LOCK DATABASE