Re: hyrax vs. RelationBuildPartitionDesc

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: hyrax vs. RelationBuildPartitionDesc
Date: 2019-03-14 17:18:50
Message-ID: CA+TgmoYg2NY3KTUFiuaCxDkW8iXD3o3H3wdXs6m-o-dGiYHD_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 14, 2019 at 12:36 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Hmm, I wonder why not. I suppose the answer is that
> the leak is worse in HEAD than before, but how come?

I'd like to know that, too, but right now I don't.

> I followed your reference to 898e5e329, and I've got to say that
> the hunk it adds in relcache.c looks fishy as can be. The argument
> that the rd_pdcxt "will get cleaned up eventually" reads to me like
> "this leaks memory like a sieve", especially in the repeated-rebuild
> scenario which is exactly what CLOBBER_CACHE_ALWAYS would provoke.
> Probably the only thing that keeps it from being effectively a
> session-lifespan leak is that CCA will generally result in relcache
> entries being flushed entirely as soon as their refcount goes to 0.
> Still, because of that, I wouldn't think it'd move the needle very
> much on a CCA animal; so my guess is that there's something else.

I'm a little uncertain of that logic, too, but keep in mind that if
keep_partdesc is true, we're just going to throw away the newly-build
data and keep the old stuff. So I think that in order for this to
actually be a problem, you would have to have other sessions that
repeatedly alter the partitiondesc via ATTACH PARTITION, and at the
same time, the current session would have to keep on reading it. But
you also have to never add a partition while the local session is
between queries, because if you do, rebuild will be false and we'll
blow away the old entry in its entirety and any old contexts that are
hanging off of it will be destroyed as well. So it seems a little
ticklish to me to figure out a realistic scenario in which we actually
leak enough memory to matter here, but maybe you have an idea of which
I have not thought.

We could certainly do better - just refcount each PartitionDesc. Have
the relcache entry hold a refcount on the PartitionDesc and have a
PartitionDirectory hold a refcount on the PartitionDesc; then, any
time the refcount goes to 0, you can immediately destroy the
PartitionDesc. Or, simpler, arrange things so that when the
refcache's refcount goes to 0, any old PartitionDescs that are still
hanging off of the latest one get destroyed then, not later. It's
just a question of whether it's really worth the code, or whether
we're fixing imaginary bugs by adding code that might have real ones.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-03-14 17:31:31 Re: why doesn't DestroyPartitionDirectory hash_destroy?
Previous Message Tom Lane 2019-03-14 17:16:01 Re: why doesn't DestroyPartitionDirectory hash_destroy?