Re: Adding REPACK [concurrently]

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net>
Subject: Re: Adding REPACK [concurrently]
Date: 2026-03-25 16:52:56
Message-ID: 50825.1774457576@localhost
Lists: pgsql-hackers

Antonin Houska <ah(at)cybertec(dot)at> wrote:

> Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> > - 0008 to 0010 are as posted by Antonin; they are unchanged, except for
> > fixes for the problems pointed out by Mihail. Antonin, I would
> > appreciate it if you want to change the "reform" bit in 0007 as
> > discussed.
>
> I've taken a look, but not sure if the tuple slots help here. In
> heapam_relation_copy_for_cluster(), both table_scan_getnextslot() and
> index_getnext_slot() call ExecStoreBufferHeapTuple() ->
> tts_buffer_heap_store_tuple(), which AFAICS do not deform the tuple. Then
> ExecFetchSlotHeapTuple() is used to retrieve the tuple, but again, the
> underlying slot (TTSOpsBufferHeapTuple) handles it by copying rather than
> deforming / forming. Thus I think the explicit "reforming" currently does not
> add any performance overhead.

Well, the deform / form steps do add some overhead of course, but these are
necessary to get rid of the values of the dropped columns. I wanted to say
that it wouldn't be cheaper with slots, because then we'd have to enforce the
deform / form steps too, although the coding would be different:

> Of course, we can still use the slots, and do the following: 1) enforce tuple
> deforming (by calling slot_getallattrs()), 2) set the dropped attributes to
> NULL, 3) use ExecStoreVirtualTuple() to store the tuple into another slot and
> 4) get the heap tuple from the other slot. Should I do that? I'm asking
> because I wasn't sure if you're concerned about performance or coding (or
> both).

> Whatever approach we take, I see two more opportunities for better
> performance:
>
> 1. Do the "reforming" only if there are some dropped columns. (AFAICS even the
> old CLUSTER / VACUUM FULL did not check this.)

I think this would need more work because CLUSTER / VACUUM FULL / REPACK do
not remove the dropped attributes from the tuple descriptor. So the
optimization would only work until the first column is dropped. All the
following runs would then do the reforming even if no other columns were
dropped since the previous run.

Perhaps we can teach REPACK to remove dropped columns from the tuple
descriptor in the future.
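
To make the limitation concrete, here is a minimal standalone sketch. It is a
toy model, not PostgreSQL code: ToyTupleDesc, ToyTuple and all toy_* functions
are hypothetical names invented for illustration, and the "reform" step is
reduced to setting the dropped attributes to NULL. The assumption it encodes
is the one stated above: the descriptor keeps its dropped-attribute flags
across runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ATTS 8

/* Hypothetical stand-in for a tuple descriptor: one "dropped" flag per attribute. */
typedef struct ToyTupleDesc
{
    int  natts;
    bool attisdropped[TOY_MAX_ATTS];
} ToyTupleDesc;

/* Hypothetical stand-in for a deformed tuple: values plus null flags. */
typedef struct ToyTuple
{
    long values[TOY_MAX_ATTS];
    bool isnull[TOY_MAX_ATTS];
} ToyTuple;

/* True if the descriptor still lists at least one dropped attribute. */
bool
toy_has_dropped_columns(const ToyTupleDesc *desc)
{
    assert(desc->natts <= TOY_MAX_ATTS);
    for (int i = 0; i < desc->natts; i++)
    {
        if (desc->attisdropped[i])
            return true;
    }
    return false;
}

/*
 * Toy "reform": set every dropped attribute to NULL.
 * Returns the number of attributes whose value actually changed.
 */
int
toy_reform_tuple(const ToyTupleDesc *desc, ToyTuple *tuple)
{
    int changed = 0;

    assert(desc->natts <= TOY_MAX_ATTS);
    for (int i = 0; i < desc->natts; i++)
    {
        if (desc->attisdropped[i] && !tuple->isnull[i])
        {
            tuple->values[i] = 0;
            tuple->isnull[i] = true;
            changed++;
        }
    }
    return changed;
}

/*
 * Process one tuple, skipping the reform step when the descriptor has no
 * dropped attributes. Returns true if the reform step had to run.
 */
bool
toy_copy_tuple(const ToyTupleDesc *desc, ToyTuple *tuple)
{
    if (!toy_has_dropped_columns(desc))
        return false;           /* fast path: nothing to clean up */
    toy_reform_tuple(desc, tuple);
    return true;
}
```

In this model, once a single attribute is flagged as dropped,
toy_copy_tuple() takes the slow path on every later run, even when
toy_reform_tuple() finds nothing left to change. That is the reason the check
alone would not help much unless the flags are also cleared from the
descriptor.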

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
