Re: ENOSPC FailedAssertion("!(RefCountErrors == 0)"

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ENOSPC FailedAssertion("!(RefCountErrors == 0)"
Date: 2018-07-18 00:01:05
Message-ID: 14892.1531872065@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
>> So I said I didn't want to do extra work on this, but I am looking into
>> fixing it by having these aux process types run a ResourceOwner that can
>> be told to clean up any open buffer pins at exit.

> That turned out to be a larger can of worms than I'd anticipated, as it
> soon emerged that we'd acquired a whole lot of cargo-cult programming
> around ResourceOwners. ...
> I'm mostly pretty happy with this, but I think there are a couple of
> loose ends in logicalfuncs.c and slotfuncs.c: those are creating
> non-standalone ResourceOwners (children of whatever the active
> ResourceOwner is) and doing nothing much to clean them up. That seems
> pretty bogus.

Further investigation showed that the part of that code that was
actually needed was not the creation of a child ResourceOwner, but rather
restoration of the old CurrentResourceOwner setting after exiting the
logical decoding loop. Apparently we can come out of that with the
TopTransaction resowner being active. This still seems a bit bogus;
maybe there should be a save-and-restore happening somewhere else?
But I'm not really excited about doing more than commenting it.

Also, most of the other random creations of ResourceOwners seem to just
not be necessary at all, even with the new rule that you must have a
resowner to acquire buffer pins. So the attached revised patch just
gets rid of them, and improves some misleading/wrong comments on the
topic. It'd still be easy to use CreateAuxProcessResourceOwner in any
process where we discover we need one, but I don't see the value in adding
useless overhead.

At this point I'm leaning to just applying this in HEAD and calling it
good. The potential for assertion failures isn't relevant to production
builds, and my best guess at this point is that there isn't really any
other user-visible bug. The resowners that were being created and not
adequately cleaned up all seem to have been dead code anyway (ie they'd
never acquire any resources). We could have a problem if, say, the
bgwriter exited via the FATAL path while holding a pin, but I don't think
there's a reason for it to do that except in a database-wide shutdown,
where the consequences of a leaked pin seem pretty minimal.

Any objections? Anyone want to do further review?

regards, tom lane

Attachment Content-Type Size
use-resowner-in-all-aux-processes-2.patch text/x-diff 20.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-07-18 00:30:57 Re: Another usability issue with our TAP tests
Previous Message Tomas Vondra 2018-07-17 23:58:06 Re: [HACKERS] PATCH: multivariate histograms and MCV lists