| From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
|---|---|
| To: | Antonin Houska <ah(at)cybertec(dot)at> |
| Cc: | Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net> |
| Subject: | Re: Adding REPACK [concurrently] |
| Date: | 2026-03-12 19:31:40 |
| Message-ID: | 202603121839.7ttist64mas5@alvherre.pgsql |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 2026-Mar-11, Antonin Houska wrote:
> I'm not sure it can be fixed nicely in the REPACK (CONCURRENTLY) patch. I
> think the problem is that, in the current tree, VARSIZE_ANY() is used in such
> a way that the compiler cannot check the "array bounds". The restore_tuple()
> function is special in that it uses VARSIZE_ANY() to check a variable
> allocated from the stack, so the compiler can check the size.
>
> I'm trying to fix that in a new diff 0002 - the point is that VARSIZE_ANY()
> should not need to dereference a pointer to varattrib_4b, since the size
> information is always located at the beginning of the structure. Maybe you
> have better idea.
I have no immediate ideas on this.
I offer the following rather trivial fixup diffs, which I think should
be mostly be for 0002.
Other trivial things I'd like to change, if you don't mind,
- the name pgoutput_repack sounds wrong to me. I would rather say
rpck_output, repack_output, repack_plugin, ... or something. I don't
understand where the suffix "output" comes from in the name
"pgoutput", but it sounds like arbitrary nonsense to me.
- I would rename the routines in that file to also have the name
"repack", probably as prefixes.
One thing we need and is rather not trivial, is to go through the table
AM interface rather than calling heapam.c routines directly. I'm
working on this now and will present a patch later.
Another thing I noticed while going over the code was that we seem to
spill whole tuples even for things like the old tuple of an UPDATE and
for DELETE, but unless I misunderstand how this works, these shouldn't
be necessary: we just need the replica identity so that we can locate
the tuple to operate on. Especially for tuples that contain large
toasted attributes, this is likely important.
It may make sense to use the TupleTableSlot interface rather than
HeapTuple for everything. I'm not really sure about this though.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Most hackers will be perfectly comfortable conceptualizing users as entropy
sources, so let's move on." (Nathaniel Smith)
https://mail.gnu.org/archive/html/monotone-devel/2007-01/msg00080.html
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-rewrite-infinite-loop-in-our-style.nocfbot.txt | text/plain | 871 bytes |
| 0002-rename-setup_logical_decoding-to-repack_setup_logica.nocfbot.txt | text/plain | 2.0 KB |
| 0003-remove-excess-parens-around-ereport.nocfbot.txt | text/plain | 7.4 KB |
| 0004-XLogRecPtrIsInvalid-XLogRecPtrIsValid.nocfbot.txt | text/plain | 1.8 KB |
| 0005-document-store_change-somewhat-more.nocfbot.txt | text/plain | 1.2 KB |
| 0006-use-int-instead-of-uint32-for-the-result-of-list_len.nocfbot.txt | text/plain | 2.0 KB |
| 0007-no-need-for-palloc0-here-simple-palloc-is-enough.nocfbot.txt | text/plain | 1.0 KB |
| 0008-XXX-devel-comment-heap_deform_tuple-slot_deform_tupl.nocfbot.txt | text/plain | 1.3 KB |
| 0009-reuse-existing-variable-by-overwriting-it.nocfbot.txt | text/plain | 1.2 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Corey Huinker | 2026-03-12 19:41:54 | Re: Import Statistics in postgres_fdw before resorting to sampling. |
| Previous Message | Tomas Vondra | 2026-03-12 19:27:32 | Re: Why clearing the VM doesn't require registering vm buffer in wal record |