Re: Adding REPACK [concurrently]

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Adding REPACK [concurrently]
Date: 2026-01-20 15:39:10
Message-ID: 73630.1768923550@localhost
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com> wrote:

> > if (size >= MaxAllocSize)
> Seems like we lost that check, I think it may be executed on storing
> the data

The tuple we process in store_change was created elsewhere (I think in
reorderbuffer.c), so I wouldn't re-check the size here.

> or before "tup = (HeapTuple) palloc(HEAPTUPLESIZE + t_len);"
> in apply_concurrent_changes

It's essentially the same length that we write in store_change() so I wouldn't
bother re-checking it here. In general, I doubt the constant is
appropriate. Its meaning is much more generic than the size of memory for a
tuple and even heap_form_tuple() does not use it.

> > bool done;
> I still think it is a confusing name.

I don't. The last call of process_concurrent_changes() tells the worker "Give
me the the next batch and we are done". Your proposal "exit_after_lsn_upto"
seems to me too verbose: the worker itself is supposed to know that it has to
reach the LSN passed via another argument.

> > chgdst.file_seq = WORKER_FILE_SNAPSHOT + 1;
> I think it is better to increment it once a snapshot is received.

The 'chgdst' is only defined in rebuild_relation_finish_concurrent(), no need
to use it where the snapshot is received (in rebuild_relation()).

> And rename to last_processed/last_improrted to be aligned with
> last_exported.

While DecodingWorkerShared deals with multiple files (not all of them
necessarily available for "consumer") and therefore it makes sense to
distinguish if file is exported or not, each instance of ChangeDest is
assigned particular file and the functions using it do not care if the file is
the last in the sequence or not.

Other proposals accepted, will be reflected in the next version.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2026-01-20 15:43:48 Re: log_min_messages per backend type
Previous Message Álvaro Herrera 2026-01-20 15:07:43 Re: file_fdw: Support multi-line HEADER option.