Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: hash_xlog_split_allocate_page: failed to acquire cleanup lock
Date: 2022-08-17 00:38:31
Message-ID: 20220817003831.yimffpi4uzkot3fz@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-08-16 17:02:27 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > I had that thought too, but I don't *think* it's the case. This
> > function acquires a lock on the oldest bucket page, then on the new
> > bucket page. We could deadlock if someone who holds a pin on the new
> > bucket page tries to take a content lock on the old bucket page. But
> > who would do that? The new bucket page isn't yet linked from the
> > metapage at this point, so no scan should do that. There can be no
> > concurrent writers during replay. I think that if someone else has the
> > new page pinned they probably should not be taking content locks on
> > other buffers at the same time.
>
> Agreed, the core code shouldn't do that, but somebody doing random stuff
> with pageinspect functions could probably make a query do this.
> See [1]; unless we're going to reject that bug with "don't do that",
> I'm not too comfortable with this line of reasoning.

I don't think we can defend against lwlock deadlocks caused by somebody not
following the AM's deadlock avoidance strategy. I.e. it's fine to pin and lock
pages from some AM without knowing that AM's rules, as long as you only block
while holding a pin/lock on a single page. But it is *not* ok to block waiting
for an lwlock / pin while already holding an lwlock / pin on some other
buffer. If we were concerned about this, we'd basically have to throw out many
of our multi-page operations that rely on lock-ordering logic.
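
To make the rule concrete, here is a minimal sketch in C. It is not PostgreSQL
code: pthread mutexes stand in for buffer content locks, and DemoBuffer,
order_key and the demo_* functions are hypothetical names invented for this
illustration. It shows the two safe patterns: a multi-page operation that
blocks on locks in one agreed order, and a caller that does not know the order
and therefore never blocks while holding another lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer plus its content lock (not an LWLock). */
typedef struct DemoBuffer
{
    int             order_key;      /* assumed position in the AM's lock order */
    pthread_mutex_t content_lock;
} DemoBuffer;

void
demo_buffer_init(DemoBuffer *buf, int order_key)
{
    buf->order_key = order_key;
    pthread_mutex_init(&buf->content_lock, NULL);
}

/*
 * Pattern 1: code that knows the AM's rules.  It may block on a second lock
 * while holding the first, because every such caller takes the locks in the
 * same order (lowest order_key first), so no cycle of waiters can form.
 */
void
demo_lock_pair_ordered(DemoBuffer *a, DemoBuffer *b)
{
    DemoBuffer *first = (a->order_key < b->order_key) ? a : b;
    DemoBuffer *second = (first == a) ? b : a;

    assert(a->order_key != b->order_key);
    pthread_mutex_lock(&first->content_lock);
    pthread_mutex_lock(&second->content_lock);
}

void
demo_unlock_pair(DemoBuffer *a, DemoBuffer *b)
{
    pthread_mutex_unlock(&a->content_lock);
    pthread_mutex_unlock(&b->content_lock);
}

/*
 * Pattern 2: code that does not know the AM's rules.  The caller already
 * holds "held" and wants "wanted" too.  It only *tries* the second lock and
 * never blocks on it.  On failure it releases "held" and returns false, so
 * the caller can retry without ever waiting while holding a lock.
 */
bool
demo_lock_second_without_blocking(DemoBuffer *held, DemoBuffer *wanted)
{
    if (pthread_mutex_trylock(&wanted->content_lock) == 0)
        return true;            /* both locks are now held */

    pthread_mutex_unlock(&held->content_lock);
    return false;               /* nothing is held any more */
}
```

What the rule forbids is the remaining case: calling a blocking lock function
on a second buffer, in an arbitrary order, while still holding the first.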

Greetings,

Andres Freund
