Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: buildfarm: strange OOM failures on markhor (running CLOBBER_CACHE_RECURSIVELY)
Date: 2014-05-19 16:27:42
Message-ID: 20140519162742.GE11150@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-05-19 11:25:04 -0400, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2014-05-18 14:49:10 -0400, Tom Lane wrote:
> >> if (RelationHasReferenceCountZero(oldrel))
> >> RelationDestroyRelation(oldrel, false);
> >> else
> >> elog(WARNING, "leaking still-referenced duplicate relation");
>
> > If that happens we'd essentially have a too low reference count on the
> > entry remaining in the relcache.
>
> No, we'd have two independent entries, each with its own correct refcount.
> When the refcount on the no-longer-linked-in-the-hashtable entry goes to
> zero, it'd be leaked, same as it's always been. (The refcount presumably
> corresponds to someone holding a direct pointer to the Relation struct,
> which is what they'd use to decrement the refcount.)

The problem is that only one of these entries will get properly handled
by cache invalidation. I wonder if the correct thing wouldn't be to
return the entry already in the cache. But that'd not be trivial to do
either, without the potential to return a invalid entry :(

> > I'd consider putting an Assert() in that branch.
>
> I'm a bit afraid to do that for a condition that the system's never tested
> for at all up to now; in any case, a WARNING would be visible in production
> whereas an Assert would probably do nothing. Ifentry we see no reports of this
> WARNING for a release cycle or so, maybe asserting would be appropriate.

Fair point.

> > Perhaps it should also be only allowed for system
> > relations?
>
> One would hope those are the only ones that get opened during relcache
> load ;-)

I thought about it for a while and I wonder if that's necessarily
correct. If somebody registers a relcache invalidation callback that
could happen when invalidations are processed somewhere while rebuilding
a entry?

> Still concerned about RememberToFreeTupleDescAtEOX, but that's an
> independent issue really.

Me too and yes.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christoph Berg 2014-05-19 16:53:15 Re: 9.4 beta1 crash on Debian sid/i386
Previous Message David Johnston 2014-05-19 15:45:06 Re: 9.4 release notes