Re: ALTER TABLE SET ACCESS METHOD on partitioned tables

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Soumyadeep Chakraborty <soumyadeep2007(at)gmail(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, pgsql-hackers(at)postgresql(dot)org, Ashwin Agrawal <ashwinstar(at)gmail(dot)com>, vanjared(at)vmware(dot)com
Subject: Re: ALTER TABLE SET ACCESS METHOD on partitioned tables
Date: 2024-04-17 05:31:47
Message-ID: Zh9ew9wnwDsGjl0z@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 16, 2024 at 02:19:56PM +0900, Michael Paquier wrote:
> Actually, I've come up with an idea just after hitting the send
> button: let's use an extra ALTER TABLE SET ACCESS METHOD rather than
> rely on the GUC to set the AM of the partitioned table correctly.
> This extra command should be optional, depending on
> --no-table-access-method. If a partitioned table has 0 as relam,
> let's not add this extra ALTER TABLE at all.

I have explored this idea, and while this is tempting this faces a
couple of challenges:
1) Binary upgrades would fail because the table rewrite created by
ALTER TABLE SET ACCESS METHOD for relkinds with physical storage
expects heap_create_with_catalog to have a fixed OID, but the rewrite
would require extra steps to be able to handle that, and I am not
convinced that more binary_upgrade_set_next_heap_relfilenode() is a
good idea.
2) We could limit these extra ALTER TABLE commands to be generated for
partitioned tables. This is kind of confusing as resulting dumps
would mix SET commands for default_table_access_method that would
affect tables with physical storage, while partitioned tables would
have their own extra ALTER TABLE commands. Another issue that needs
more consideration is that TocEntrys don't hold any relkind
information so pg_backup_archiver.c cannot make a difference with
tables and partitioned tables to select if SET or ALTER TABLE should
be generated.

Several designs are possible, like:
- Mix SET and ALTER TABLE commands in the dumps to set the AM, SET for
tables and matviews, ALTER TABLE for relations without storage. This
would bypass the binary upgrade problem with the fixed relid.
- Use only SET, requiring a new "default" value for
default_table_access_method that would force a partitioned table's
relam to be 0. Be stricter with the "current" table AM tracked in
pg_dump's backup archiver.
- Use only ALTER TABLE commands, with extra binary upgrade tweaks to
force relation OIDs for the second heap_create_with_catalog() done
with the rewrite to update a relation's AM.

With all that in mind, it may be better to revert 374c7a229042 and
e2395cdbe83a from HEAD and reconsider how to tackle the dump issues in
v18 or newer versions as all of the approaches I can think of lead to
more complications of their own.

Please see attached a non-polished POC that switches dumps to use
ALTER TABLE, that I've used to detect the upgrade problems.

Thoughts or comments are welcome.
--
Michael

Attachment Content-Type Size
dump-partitions.patch text/x-diff 7.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-04-17 06:10:04 Re: Typo about the SetDatatabaseHasLoginEventTriggers?
Previous Message Andy Fan 2024-04-17 05:13:34 Re: Extract numeric filed in JSONB more effectively