| From: | Antonin Houska <ah(at)cybertec(dot)at> |
|---|---|
| To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
| Cc: | Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net> |
| Subject: | Re: Adding REPACK [concurrently] |
| Date: | 2026-03-13 08:11:38 |
| Message-ID: | 8349.1773389498@localhost |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> I offer the following rather trivial fixup diffs, which I think should
> be mostly be for 0002.
Thanks, just a few comments:
* 0002 - I didn't add the 'repack_' prefix because the function is static, but
realize now that it might be useful from the code browsing perspective.
* 0008 - It's possible during query execution, when you know exactly which
attributes you need to fetch from the tuple. However in store_change(), we
don't know which attributes are external (indirect) without deforming them
all.
> Other trivial things I'd like to change, if you don't mind,
> - the name pgoutput_repack sounds wrong to me. I would rather say
> rpck_output, repack_output, repack_plugin, ... or something. I don't
> understand where the suffix "output" comes from in the name
> "pgoutput", but it sounds like arbitrary nonsense to me.
No strong preference here, except that I don't like "rpck_...". How about
pgoutput/repack.c? I think I tried to put the plugin into the existing
"pgoutput" directory at some point, but don't remember why I eventually
rejected that approach.
> - I would rename the routines in that file to also have the name
> "repack", probably as prefixes.
ok
> One thing we need and is rather not trivial, is to go through the table
> AM interface rather than calling heapam.c routines directly. I'm
> working on this now and will present a patch later.
It occurred to me too, at least because copy_table_data() calls
table_relation_copy_for_cluster() rather than
heapam_relation_copy_for_cluster(). Thanks for working on it.
> Another thing I noticed while going over the code was that we seem to
> spill whole tuples even for things like the old tuple of an UPDATE and
> for DELETE, but unless I misunderstand how this works, these shouldn't
> be necessary: we just need the replica identity so that we can locate
> the tuple to operate on. Especially for tuples that contain large
> toasted attributes, this is likely important.
I think we don't need to care, as both heap_update() and heap_delete() call
ExtractReplicaIdentity(). That returns a real tuple, but it only contains the
attributes constituting the replica identity. The other ones are set to
NULL.
> It may make sense to use the TupleTableSlot interface rather than
> HeapTuple for everything. I'm not really sure about this though.
Isn't this part of the adoption of table AM? For example, table_tuple_insert()
accepts tuple slot rather than heap tuple.
--
Antonin Houska
Web: https://www.cybertec-postgresql.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jelte Fennema-Nio | 2026-03-13 08:29:26 | Re: Add "format" target to make and ninja to run pgindent and pgperltidy |
| Previous Message | Alexander Lakhin | 2026-03-13 08:00:00 | Re: Buffer locking is special (hints, checksums, AIO writes) |