Re: hio.c does visibilitymap_pin()/IO while holding buffer lock

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, Tomas Vondra <tv(at)fuzzy(dot)cz>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: hio.c does visibilitymap_pin()/IO while holding buffer lock
Date: 2023-03-25 13:34:25
Message-ID: 0c11032a-d91b-8875-ab7b-4554776390c5@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/25/23 03:57, Andres Freund wrote:
> Hi,
>
> Starting with
>
> commit 7db0cd2145f2bce84cac92402e205e4d2b045bf2
> Author: Tomas Vondra <tomas(dot)vondra(at)postgresql(dot)org>
> Date: 2021-01-17 22:11:39 +0100
>
> Set PD_ALL_VISIBLE and visibility map bits in COPY FREEZE
>

That's a bummer :-(

> RelationGetBufferForTuple does
>
> /*
> * The page is empty, pin vmbuffer to set all_frozen bit.
> */
> if (options & HEAP_INSERT_FROZEN)
> {
> Assert(PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0);
> visibilitymap_pin(relation, BufferGetBlockNumber(buffer), vmbuffer);
> }
>
> while holding a buffer lock. visibilitymap_pin() reads pages, if vmbuffer
> doesn't already point to the right block.
>
>
> The lock ordering rules are to lock VM pages *before* locking heap pages.
>
>
> I think the reason this hasn't yet bitten us badly, is that INSERT_FROZEN
> effectively requires that the relation is access exclusive locked. There
> shouldn't be other backends locking multiple buffers in the relation (bgwriter
> / checkpointer can lock a single buffer at a time, but that's it).
>

Right. Still, it seems a bit fragile ...

>
> I see roughly two ways forward:
>
> 1) We add a comment explaining why it's safe to violate lock ordering rules in
> this one situation
>

Possible, although I feel uneasy about just documenting a broken rule.
Would be better to maintain the locking order.

> 2) Change relevant code so that we only return a valid vmbuffer if we could do
> so without blocking / IO and, obviously, skip updating the VM if we
> couldn't get the buffer.
>

I don't recall the exact details about the vm locking/pinning, but can't
we just ensure we actually follow the proper locking order? I mean, this
only deals with new pages, requested at line ~624:

buffer = ReadBufferBI(relation, P_NEW, RBM_ZERO_AND_LOCK, bistate);

Can't we ensure we actually lock the vm buffer too in ReadBufferBI,
before calling ReadBufferExtended? Or am I confused and that's not
possible for some reason?

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-03-25 15:41:47 Re: About the constant-TRUE clause in reconsider_outer_join_clauses
Previous Message Andrew Dunstan 2023-03-25 12:46:42 meson/msys2 fails with plperl/Strawberry