From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: hyrax vs. RelationBuildPartitionDesc |
Date: | 2019-05-17 19:59:29 |
Message-ID: | 20190517195929.uey6hadbgazplm3h@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2019-05-01 13:09:07 -0400, Robert Haas wrote:
> The original issue on this thread was that hyrax started running out
> of memory when it hadn't been doing so previously. That happened
> because, for complicated reasons, commit
> 898e5e3290a72d288923260143930fb32036c00c resulted in the leak being
> hit lots of times in CLOBBER_CACHE_ALWAYS builds instead of just once.
> Commits 2455ab48844c90419714e27eafd235a85de23232 and
> d3f48dfae42f9655425d1f58f396e495c7fb7812 fixed that problem.
>
> In the email at issue, Tom complains about two things. First, he
> complains about the fact that I try to arrange things so that relcache
> data remains valid for as long as required instead of just copying it.
> Second, he complains about the fact repeated ATTACH and DETACH
> PARTITION operations can produce a slow session-lifespan memory leak.
>
> I think it's reasonable to fix the second issue for v12. I am not
> sure how important it is, because (1) the leak is small, (2) it seems
> unlikely that anyone would execute enough ATTACH/DETACH PARTITION
> operations in one backend for the leak to amount to anything
> significant, and (3) if a relcache flush ever happens when you don't
> have the relation open, all of the "leaked" memory will be un-leaked.
> However, I believe that we could fix it as follows. First, invent
> rd_pdoldcxt and put the first old context there; if that pointer is
> already in use, then parent the new context under the old one.
> Second, in RelationDecrementReferenceCount, if the refcount hits 0,
> nuke rd_pdoldcxt and set the pointer back to NULL. With that change,
> you would only keep the old PartitionDesc around until the ref count
> hits 0, whereas at present, you keep the old PartitionDesc around
> until an invalidation happens while the ref count is 0.
That sounds roughly reasonably. Tom, any objections? I think it'd be
good if we fixed this soon.
> I think the first issue is not v12 material. Tom proposed to fix it
> by copying all the relevant data out of the relcache, but his own
> performance results show a pretty significant hit,
Yea, the numbers didn't look very convincing.
> and AFAICS he
> hasn't pointed to anything that's actually broken by the current
> approach. What I think should be done is actually generalize the
> approach I took in this patch, so that instead of the partition
> directory holding a reference count, the planner or executor holds
> one. Then not only could people who want information about the
> PartitionDesc avoid copying stuff from the relcache, but maybe other
> things as well. I think that would be significantly better than
> continuing to double-down on the current copy-everything approach,
> which really only works well in a world where a table can't have all
> that much metadata, which is clearly not true for PostgreSQL any more.
> I'm not sure that Tom is actually opposed to this approach -- although
> I may have misunderstood his position -- but where we disagree, I
> think, is that I see what I did in this commit as a stepping-stone
> toward a better world, and he sees it as something that should be
> killed with fire until that better world has fully arrived.
Tom, are you ok with deferring further work here for v13?
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2019-05-17 20:04:30 | Re: Passing CopyMultiInsertInfo structure to CopyMultiInsertInfoNextFreeSlot() |
Previous Message | Andres Freund | 2019-05-17 19:54:19 | Re: Pluggable Storage - Andres's take |