Re: shared memory based stat collector (was: Sharing record typmods between backends)

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: shared memory based stat collector (was: Sharing record typmods between backends)
Date: 2017-08-14 23:16:39
Message-ID: 20170814231638.x6vgnzlr7eww4bui@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas wrote:
> On Mon, Aug 14, 2017 at 1:12 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > Yeah, the problem that lwlocks aren't released is because the launcher
> > is not in a transaction at that point, so AbortCurrentTransaction()
> > doesn't release locks like it normally would. The simplest fix (in the
> > attached 0001 patch) is to add a LWLockReleaseAll() call to the jmp
> > block, though I wonder if there should be some other cleanup functions
> > called from there, or whether perhaps it'd be a better strategy to have
> > the launcher run in a transaction at all times.
>
> Well, if you're going to do aborts outside of a transaction, just
> adding an LWLockReleaseAll() isn't really sufficient. You need to
> look at something like CheckpointerMain() and figure out which of
> those push-ups are needed here as well. Probably at least
> ConditionVariableCancelSleep(), pgstat_report_wait_end(),
> AbortBufferIO(), and UnlockBuffers() -- quite possibly some of those
> other AtEOXact calls as well.

Agreed. I think a saner answer is to create a single function that does
it all, and use it in all places that need it, rather than keep adding
more copies of the same thing. Attached revised version of patch does
things that way; I think it's good cleanup. (I had to make the
ResourceOwnerRelease call be conditional on there being a resowner;
seems okay to me.)

I put the new cleanup routine in xact.c, which is not exactly the
perfect place (had to add access/xact.h to a couple of files), but the
fact that the new routine is a cut-down version of another routine in
xact.c makes it clear to me that that's where it belongs. I first put
it in bootstrap, alongside AuxiliaryProcessMain, but after a while that
seemed wrong.

> > The other problem is that there's no attempt to handle a failed DSA
> > creation/attachment. The second patch just adds a PG_TRY block that
> > sets a flag not to try the DSA calls again if the first one fails. It
> > throws a single ERROR line, then autovacuum continues without
> > workitem support.
>
> Yeah, and the other question -- which Thomas asked before you
> originally committed originally, and which I just now asked again is
> "Why in the world are you using DSA for this at all?". There are
> serious problems with that which both he and I have pointed out, and
> you haven't explained why it's a good idea at any point, AFAICT.

The main reason is that I envision that the workitem stuff will be used
for other things in the future than just brin summarization, and it
seemed a lame idea to just use a fixed-size memory area in the standard
autovacuum shared memory area. I think unbounded growth is of course
going to be bad. The current coding doesn't allow for any growth beyond
the initial fixed size, but it's easier to extend the system from the
current point rather than wholly changing shared memory usage pattern
while at it.

I thought I *had* responded to Thomas in that thread, BTW.

> Among those problems:
>
> 1. It doesn't work if dynamic_shared_memory_type=none. That's OK for
> an optional subsystem, but autovacuum is not very optional.

Autovacuum as a whole continues to work if there's no dynamic shared
memory; it's just the workitems stuff that stops working if there's no
DSA. (After fixing the bug that makes it crash in the case of
dynamic_shared_memory_type=none, of course).

> 2. It allows unbounded bloat if there's no limit on the number of work
> items and is pointless is there is since you could then just use the
> main shared memory segment.

Yeah, there are probably better strategies than just growing the memory
area every time another entry is needed. Work-items as a whole need a
lot more development from the current point.

> I really think you should respond to those concerns, not just push a
> minimal fix.

We'll just continue to develop things from the current point.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v2-0001-Release-lwlocks-in-autovacuum-launcher-error-hand.patch text/plain 8.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bossart, Nathan 2017-08-14 23:28:02 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands
Previous Message Tom Lane 2017-08-14 23:11:58 Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values