Re: shared memory based stat collector (was: Sharing record typmods between backends)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: shared memory based stat collector (was: Sharing record typmods between backends)
Date: 2017-08-14 19:09:58
Message-ID: CA+TgmoaYofBKKXBUcfgb9Rd5cKvZL7zhqHBbnUnNO+sdGpeVpA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 14, 2017 at 1:12 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> Yeah, the problem that lwlocks aren't released is because the launcher
> is not in a transaction at that point, so AbortCurrentTransaction()
> doesn't release locks like it normally would. The simplest fix (in the
> attached 0001 patch) is to add a LWLockReleaseAll() call to the jmp
> block, though I wonder if there should be some other cleanup functions
> called from there, or whether perhaps it'd be a better strategy to have
> the launcher run in a transaction at all times.

Well, if you're going to do aborts outside of a transaction, just
adding an LWLockReleaseAll() isn't really sufficient. You need to
look at something like CheckpointerMain() and figure out which of
those push-ups are needed here as well. Probably at least
ConditionVariableCancelSleep(), pgstat_report_wait_end(),
AbortBufferIO(), and UnlockBuffers() -- quite possibly some of those
other AtEOXact calls as well.

> The other problem is that there's no attempt to handle a failed DSA
> creation/attachment. The second patch just adds a PG_TRY block that
> sets a flag not to try the DSA calls again if the first one fails. It
> throws a single ERROR line, then autovacuum continues without workitem
> support.

Yeah, and the other question -- which Thomas asked before you
originally committed originally, and which I just now asked again is
"Why in the world are you using DSA for this at all?". There are
serious problems with that which both he and I have pointed out, and
you haven't explained why it's a good idea at any point, AFAICT.
Among those problems:

1. It doesn't work if dynamic_shared_memory_type=none. That's OK for
an optional subsystem, but autovacuum is not very optional.

and

2. It allows unbounded bloat if there's no limit on the number of work
items and is pointless is there is since you could then just use the
main shared memory segment.

I really think you should respond to those concerns, not just push a
minimal fix.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-08-14 19:17:17 Re: Crash report for some ICU-52 (debian8) COLLATE and work_mem values
Previous Message Andres Freund 2017-08-14 18:56:32 Re: Orphaned files in base/[oid]