
Re: rbtree code breaks GIN's adherence to maintenance_work_mem

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: rbtree code breaks GIN's adherence to maintenance_work_mem
Date: 2010-07-31 16:12:30
Lists: pgsql-hackers

On Sat, Jul 31, 2010 at 12:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Sat, Jul 31, 2010 at 12:40 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> So, I would like somebody to show cause why that whole module shouldn't
>>> be ripped out and the code reverted to where it was in 8.4.  My
>>> recollection is that the argument for adding it was to speed things up
>>> in corner cases, but what I think it's actually going to do is slow
>>> things down in every case.
>> I've always been a bit suspicious of this code, too, even though I
>> didn't think about the memory consumption issue.  But see here:
> I did a bit of experimentation and confirmed my fears: HEAD is willing
> to eat about double the specified maintenance_work_mem.  If you cut
> back the setting so that its actual memory use is no more than 8.4's,
> it's about 33% slower on non-pathological data (I'm testing the dataset
> from Artur Dabrowski here).

That seems like a pretty serious regression.

> I'm tempted to suggest that making RBNode be a hidden struct containing
> a pointer to somebody else's datum is fundamentally the wrong way to
> go about things, because the extra void pointer is pure overhead,
> and we aren't ever going to be using these things in a context where
> memory usage isn't of concern.  If we refactored the API so that RBNode
> was intended to be the first field of some larger struct, as is done in
> dynahash tables for instance, we could eliminate the void pointer and
> the palloc inefficiency.  The added storage compared to what 8.4 used
> would be a parent link and the iteratorState/color fields, which would
> end up costing us 16 more bytes per EntryAccumulator rather than 64.
> Still not great but at least it's not a 2X penalty, and the memory
> allocation would become the caller's problem not rbtree's, so the
> problem of tracking usage would be no different from before.
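
For readers following the thread, the two layouts being compared above can be sketched roughly as below. This is only an illustration under stated assumptions: PtrNode, EmbNode, DemoEntry, demo_cmp and emb_insert are hypothetical names, not the actual rbtree.c or GIN definitions, and the insert routine is a plain unbalanced binary-tree insert standing in for the red-black logic. The point is the shape of the API: the tree node is the first field of the caller's struct, the caller does the allocation, and there is no separate void pointer per entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Layout A (hypothetical): hidden node pointing at the caller's datum.
 * Costs one extra pointer and one extra allocation per entry. */
typedef struct PtrNode
{
    struct PtrNode *left, *right, *parent;
    char        color;
    void       *data;           /* points at a separately allocated datum */
} PtrNode;

/* Layout B (hypothetical): node meant to be embedded in a larger struct. */
typedef struct EmbNode
{
    struct EmbNode *left, *right, *parent;
    char        color;
} EmbNode;

/* Caller-defined entry; the node MUST be the first field so that a
 * pointer to the node is also a pointer to the entry. */
typedef struct DemoEntry
{
    EmbNode     node;
    int         key;
    int         count;
} DemoEntry;

/* Recover the enclosing entry from its embedded node. */
static DemoEntry *
entry_from_node(EmbNode *n)
{
    assert(n != NULL);
    return (DemoEntry *) n;
}

/* Allocation is the caller's job, so the caller can account for it. */
static DemoEntry *
demo_entry_create(int key)
{
    DemoEntry  *e = malloc(sizeof(DemoEntry));

    if (e == NULL)
        return NULL;
    e->node.left = e->node.right = e->node.parent = NULL;
    e->node.color = 0;
    e->key = key;
    e->count = 1;
    return e;
}

typedef int (*emb_cmp) (const EmbNode *a, const EmbNode *b);

/* Comparator supplied by the caller; it knows the real entry type. */
static int
demo_cmp(const EmbNode *a, const EmbNode *b)
{
    const DemoEntry *ea = (const DemoEntry *) a;
    const DemoEntry *eb = (const DemoEntry *) b;

    return (ea->key > eb->key) - (ea->key < eb->key);
}

/* Link newnode into the tree, or return the existing equal node.
 * No rebalancing here: this is a stand-in, not a red-black insert. */
static EmbNode *
emb_insert(EmbNode **root, EmbNode *newnode, emb_cmp cmp)
{
    EmbNode   **link = root;
    EmbNode    *parent = NULL;

    while (*link != NULL)
    {
        int         c = cmp(newnode, *link);

        if (c == 0)
            return *link;
        parent = *link;
        link = (c < 0) ? &(*link)->left : &(*link)->right;
    }
    newnode->left = newnode->right = NULL;
    newnode->parent = parent;
    *link = newnode;
    return newnode;
}
```

With this shape the tree code never allocates anything itself, so the memory charged against maintenance_work_mem is whatever the caller allocated for its own entries. The exact per-entry byte counts quoted above depend on the real struct definitions and on palloc overhead, which this sketch does not try to reproduce.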

Even if we do that, is it still going to be too much of a performance
regression overall?

Robert Haas
The Enterprise Postgres Company
