Re: Adding REPACK [concurrently]

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Adding REPACK [concurrently]
Date: 2026-03-02 14:39:01
Message-ID: 67772.1772462341@localhost
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com> wrote:

> Some review comments:

Thanks again!

> ------------
>
> > attrs = palloc0_array(Datum, desc->natts);
> > isnull = palloc0_array(bool, desc->natts);
>
> It looks like there is a memory leak with those arrays.

I suppose you mean store_change(). Yes, I tried to free the individual chunks
and forgot these. The next version uses a new, per-change memory context.

> > ident_idx = RelationGetReplicaIndex(rel);
> > if (!OidIsValid(ident_idx) && OidIsValid(rel->rd_pkindex))
>
> check_repack_concurrently_requirements uses rd_pkindex as fallback.
>
> But rebuild_relation_finish_concurrent does not contain such logic:
>
> > ident_idx_old = RelationGetReplicaIndex(OldHeap);

Good point. I added a new argument to rebuild_relation_finish_concurrent() so
that the identity index is only determined once.

> ------------
>
> > >
> > > > ConditionVariablePrepareToSleep(&shared->cv);
> > > > for (;;)
> > > > {
> > > > bool initialized;
> > > >
> > > > SpinLockAcquire(&shared->mutex);
> > > > initialized = shared->initialized;
> > > > SpinLockRelease(&shared->mutex);
> > > src/backend/commands/cluster.c:3663
> > >
> > > I think we should check GetBackgroundWorkerPid for worker status, to
> > > not wait forever in case of some issue with the worker.
>
> > ConditionVariableSleep() calls CHECK_FOR_INTERRUPTS(). That should process
> > error messages from the worker.
>
> Hm, yes, and RepackWorkerShutdown will detach the queue. But
> ProcessRepackMessages does not react somehow to SHM_MQ_DETACHED - just
> ignores. Or am I missing something?

On ERROR / FATAL, RepackWorkerShutdown() should send the message before
detaching. elog.c does it via send_message_to_frontend(), due to the previous
call of pq_redirect_to_shm_mq() in RepackWorkerMain(). ProcessRepackMessages()
then only needs to care about the message, not about the worker's detaching.

> ------------
>
> > build_identity_key
> > ....
> > n = ident_idx->indnatts;
>
> Should we use indnkeyatts here?

Definitely. I missed the addition of the INCLUDE columns feature during
maintenance of pg_squeeze, and copied the bug to REPACK. Fixed.

> ------------
>
> > build_identity_key
> > ....
> > entry->sk_collation = att->attcollation;
>
> Should we use index collation (not heap) here?
> entry->sk_collation = ident_idx_rel->rd_indcollation[i];

AFAIC they should be equal, but what you propose simplifies the code a
bit. Done.

> ------------
>
> > SnapBuildInitialSnapshotForRepack
>
> What is about to add defensive checks like SnapBuildInitialSnapshot does?
>
> > if (!builder->committed.includes_all_transactions)
> > elog(ERROR, "cannot build an initial slot snapshot, not all transactions are monitored anymore");

Initially I added a header comment (XXX) to
SnapBuildInitialSnapshotForRepack() saying that some of the checks, including
this one, could be adopted. The checks were problematic in the backend
executing REPACK. However, they appear to be fine if the code is executed by
the logical decoding worker. So what I'm trying now is to add a new argument
"repack" to SnapBuildInitialSnapshot() and remove the original 0003 diff
("Move conversion of a historic to MVCC snapshot to a separate function.")
from the series altogether.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Attachment Content-Type Size
v37-0001-Add-REPACK-command.patch text/x-diff 109.3 KB
v37-0002-Refactor-index_concurrently_create_copy-for-use-with.patch text/x-diff 8.7 KB
v37-0003-Add-CONCURRENTLY-option-to-REPACK-command.patch text/plain 165.0 KB
v37-0004-Serialize-decoded-tuples-without-flattening.patch text/x-diff 20.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2026-03-02 14:44:05 Re: pg_stat_replication.*_lag sometimes shows NULL during active replication
Previous Message Madyshev Egor 2026-03-02 14:12:09 RE: Idea to enhance pgbench by more modes to generate data (multi-TXNs, UNNEST, COPY BINARY)