Re: Make pg_prewarm, autoprewarm yield for waiting DDL

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Make pg_prewarm, autoprewarm yield for waiting DDL
Date: 2026-06-25 20:44:28
Message-ID: CALj2ACWmEhd27HLkt7EKK=oNtSttu+2UOtL1T+MB0hhOSzq1Tg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Wed, Mar 25, 2026 at 2:32 PM SATYANARAYANA NARLAPURAM <
satyanarlapuram(at)gmail(dot)com> wrote:
>
> Both pg_prewarm() and the autoprewarm background worker hold
AccessShareLock on the target relation for the entire duration of
prewarming. On large tables this can take a long time, which means
> that any DDL that needs a stronger lock (TRUNCATE, DROP TABLE, ALTER
TABLE, etc.) is blocked for the full duration.
>
> VACUUM already solves this same problem during heap truncation: it
periodically calls LockHasWaitersRelation() and backs off when a
conflicting waiter is detected (see lazy_truncate_heap()).
>
> The attached patch applies the same pattern to pg_prewarm and
autoprewarm. Every 1024 blocks, each code path checks for a waiter and if
found then the lock is released so that the DDL can proceed. Patch handles
the relation truncation, drop cases by emitting an error message. If the
relation was only partially truncated, the endpoint is adjusted downward
and prewarming continues.
> When no DDL is waiting, the only overhead is one lock-table probe per
1024 blocks.

A few comments:

1/
+/*
+ * Block interval for checking conflicting lock waiters during prewarming.
+ */
+#define PREWARM_WAITER_CHECK_INTERVAL 1024

Is this a random number, or was it chosen based on something like all 1024
pages being read from storage on a buffer pool miss and the time that
takes? I mainly want to check that this interval doesn't give too wide a
window, making a DDL wait longer than necessary. It's worth quickly
checking the worst case where all 1024 pages are read from storage.

Or, why not do something like what vacuum does - check every 32 blocks
whether the elapsed time of 20ms since the last check exceeds a threshold,
keeping the number of system calls and lock table lookups to a minimum?
2/
+ /*
+ * Recalculate fork size; skip remainder if
+ * truncated.
+ */
+ if (!smgrexists(RelationGetSmgr(rel), forknum))
+ {

I want to make sure I understand the table rewrite handling. If a
concurrent rewrite (ALTER TABLE, REINDEX, VACUUM FULL) happens while we're
prewarming, we release the lock and reopen the relation by OID. The reopen
succeeds because the OID is still valid, and the fork existence check
passes because relcache now points at the new relfilenode. But the block
list we're streaming was collected from the old relfilenode - so we end up
reading the new file at the old block numbers, and never actually prewarm
the right buffers. Am I reading that correctly?

Also, the reopen-and-restart path is fairly complex to reason about. Since
prewarming is best-effort anyway, would it be simpler to just release the
lock when we see a waiter, drop the blocks already collected for that
relation from the prewarm list,
and move on to the next relation? A relation that doesn't fully get
prewarmed isn't a correctness problem. Was there a specific reason for the
more involved restart approach?
> The patch includes a TAP test (t/002_lock_yield.pl) that exercises the
TRUNCATE and DROP TABLE scenarios using injection points.

Can we separate the injection point and test into a 0002 patch?
> While developing this patch I discovered that LockHasWaiters() crashes
with a segfault when the lock in question was acquired via the fast-path
optimization, details in [1].

I agree to fix that separately. I previously reviewed and tested the patch
there and it LGTM.
--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2026-06-25 20:46:05 Re: [BUG] ECPG crash with union type
Previous Message Tom Lane 2026-06-25 20:13:31 Re: pg_stat_statements: Remove (errcode...) framing parentheses in erport(...)