Re: hyrax vs. RelationBuildPartitionDesc

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: hyrax vs. RelationBuildPartitionDesc
Date: 2019-03-14 22:08:08
Message-ID: 15943.1552601288@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Mar 14, 2019 at 12:36 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Hmm, I wonder why not. I suppose the answer is that
>> the leak is worse in HEAD than before, but how come?

> I'd like to know that, too, but right now I don't.

I poked at this some more, by dint of adding some memory context usage
tracking and running it under CLOBBER_CACHE_ALWAYS --- specifically,
I looked at the update.sql test script, which is one of the two that
is giving hyrax trouble. (Conveniently, that one runs successfully
by itself, without need for the rest of the regression suite.)

I found that even with 2455ab488, there still seemed to be an
unreasonable amount of MessageContext bloat during some of the queries,
on the order of 50MB in some of them. Investigating more closely,
I found that 2455ab488 has a couple of simple oversights: it's still
calling partition_bounds_create in the caller's context, allowing
whatever that builds or leaks to be a MessageContext leak. And it's
still calling get_rel_relkind() in the rd_pdcxt context, potentially
leaking a *lot* of stuff into that normally-long-lived context, since
that will result in fresh catcache (and thence potentially relcache)
loads in some cases.

But fixing that didn't seem to move the needle very much for update.sql.
With some further investigation, I identified a few other main
contributors to the bloat:

RelationBuildTriggers
RelationBuildRowSecurity
RelationBuildPartitionKey

Are you noticing a pattern yet? The real issue here is that we have
this its-okay-to-leak-in-the-callers-context mindset throughout
relcache.c, and when CLOBBER_CACHE_ALWAYS causes us to reload relcache
entries a lot, that adds up fast. The partition stuff makes this worse,
I think, primarily just because it increases the number of tables we
touch in a typical query.

What I'm thinking, therefore, is that 2455ab488 had the right idea but
didn't take it far enough. We should remove the temp-context logic it
added to RelationBuildPartitionDesc and instead put that one level up,
in RelationBuildDesc, where the same temp context can serve all of these
leak-prone sub-facilities.

Possibly it'd make sense to conditionally compile this so that we only
do it in a CLOBBER_CACHE_ALWAYS build. I'm not very sure about that,
but arguably in a normal build the overhead of making and destroying
a context would outweigh the cost of the leaked memory. The main
argument I can think of for doing it all the time is that having memory
allocation work noticeably differently in CCA builds than normal ones
seems like a recipe for masking normal-mode bugs from the CCA animals.
But that's only a guess not proven fact; it's also possible that having
two different behaviors would expose bugs we'd otherwise have a hard
time detecting, such as continuing to rely on the "temporary" data
structures longer than we should.

(This is all independent of the other questions raised on this and
nearby threads.)

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikita Glukhov 2019-03-14 22:11:58 Re: [PATCH] kNN for btree
Previous Message Tomas Vondra 2019-03-14 21:10:16 seems like a bug in pgbench -R