Re: Modest proposal to extend TableAM API for controlling cluster commands

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Modest proposal to extend TableAM API for controlling cluster commands
Date: 2022-06-16 06:23:09
Message-ID: FE4891BD-BB76-4212-B3D9-D0F55C2DDB2B@enterprisedb.com
Lists: pgsql-hackers

> On Jun 15, 2022, at 8:50 PM, David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> On Wed, Jun 15, 2022 at 8:18 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > If a simple callback like
> > relation_supports_cluster(Relation rel) is too simplistic
>
> Seems like it should be called: relation_supports_compaction[_by_removal_of_interspersed_dead_tuples]

Ok.

> Basically, if the user tells the table to make itself smaller on disk by removing dead tuples, should we support the case where the Table AM says: "Sorry, I cannot do that"?

I submit that's the only sane thing to do if the table AM already guarantees that the table will always be fully compacted. There is no justification for forcing the table contents to be copied without benefit.

> If yes, then naming the table explicitly should elicit an error. Having the table chosen implicitly should provoke a warning. For ALTER TABLE CLUSTER there should be an error: which makes the implicit CLUSTER command a non-factor.

I'm basically fine with how you would design the TAM, but I'm going to argue again that the core project should not dictate these decisions. The TAM's relation_supports_compaction() function can return true or false, or raise an error. If raising an error is the right action, the TAM can do that. If the core code makes that decision, the TAM can't override it, and that paints TAM authors into a corner.
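
As a rough illustration of that division of labor (not taken from any patch), the sketch below models a callback with three possible outcomes and a core-side dispatcher that simply obeys it. Every type and function name here is hypothetical; these are stand-ins, not the real Relation or TableAmRoutine definitions. In PostgreSQL itself the error case would be raised with ereport(ERROR) inside the callback rather than returned as a value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the callback's outcomes.  SUPPORTS_ERROR stands in
 * for the TAM raising an error from inside the callback. */
typedef enum { SUPPORTS_YES, SUPPORTS_NO, SUPPORTS_ERROR } SupportsResult;

typedef struct SketchRelation SketchRelation;

/* Hypothetical, cut-down stand-in for a table AM's callback table. */
typedef struct SketchTableAm {
    SupportsResult (*relation_supports_compaction)(const SketchRelation *rel);
} SketchTableAm;

struct SketchRelation {
    const char          *name;
    const SketchTableAm *am;
};

/* A heap-like AM: rewriting the table can reclaim space. */
static SupportsResult heap_like_supports(const SketchRelation *rel)
{
    (void) rel;
    return SUPPORTS_YES;
}

/* An AM that is always fully compacted: a rewrite would copy data for
 * no benefit, so it quietly declines. */
static SupportsResult always_compact_supports(const SketchRelation *rel)
{
    (void) rel;
    return SUPPORTS_NO;
}

/* An AM whose author prefers a hard failure over a silent skip. */
static SupportsResult strict_supports(const SketchRelation *rel)
{
    (void) rel;
    return SUPPORTS_ERROR;
}

typedef enum { ACTION_REWRITE, ACTION_SKIP, ACTION_FAIL } CoreAction;

/* Core-side dispatcher: it does not decide policy, it only follows the
 * answer the AM gave.  An AM that does not provide the callback is
 * treated like heap and rewritten. */
static CoreAction core_decide(const SketchRelation *rel)
{
    assert(rel != NULL && rel->am != NULL);

    if (rel->am->relation_supports_compaction == NULL)
        return ACTION_REWRITE;

    switch (rel->am->relation_supports_compaction(rel))
    {
        case SUPPORTS_YES:
            return ACTION_REWRITE;
        case SUPPORTS_NO:
            return ACTION_SKIP;
        case SUPPORTS_ERROR:
            return ACTION_FAIL;
    }
    return ACTION_FAIL;     /* unreachable for valid enum values */
}
```

The point of the shape is that skip-versus-error lives entirely in the AM's callback; the dispatcher contains no per-command policy of its own.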

> However, given that should the table structure change it is imperative that the Table AM be capable of producing a new physical relation with the correct data, which will have been compacted as a side-effect, it seems like, explicit or implicit, expecting any Table AM to do that when faced with Vacuum Full is reasonable. Which leaves deciding how to allow a table with a given TAM to prevent itself from being added to the CLUSTER roster. And decide whether an opt-out feature for implicit VACUUM FULL is something we should offer as well.
>
> I'm doubtful that a TAM that is pluggable into the MVCC and WAL architecture of PostgreSQL could avoid this basic contract between the system and its users.

How about a TAM that implements write-once, read-many logic? You get one multi-insert, and forever after you can't modify the table (other than to drop it, or perhaps to truncate it). That's a completely made-up-on-the-spot example, but it's not entirely without merit. You could avoid a lot of locking overhead when using such a table, since you'd know a priori that nobody else is modifying it. It could also be implemented with a smaller tuple header, since a lot of the header bytes in heap tuples are dedicated to tracking updates. You wouldn't need a per-row inserting transaction ID either, since you could just store one per table, knowing that all the rows were inserted in the same transaction.

In what sense does this made-up TAM conflict with MVCC and WAL? It doesn't have all the features of heap, but that's not the same thing as violating MVCC or breaking WAL.
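
To make the header-size argument for the write-once, read-many table concrete, here is a small sketch under stated assumptions. The two structs are hypothetical: the first is only loosely modeled on the kind of per-row bookkeeping an updatable tuple needs, and the second shows what might remain when the inserting transaction ID is stored once per table. The visibility check is deliberately simplified and ignores XID wraparound, in-progress transaction lists, and subtransactions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-row header for an updatable table: most fields exist
 * to track who inserted, deleted, or updated the row version. */
typedef struct UpdatableTupleHeader {
    uint32_t insert_xid;    /* inserting transaction */
    uint32_t delete_xid;    /* deleting or updating transaction */
    uint32_t command_id;    /* command counter within the transaction */
    uint8_t  next_version[6]; /* pointer to the newer row version */
    uint16_t attr_info;     /* attribute count and flags */
    uint16_t flags;         /* status hint bits */
    uint8_t  header_len;    /* offset to user data */
} UpdatableTupleHeader;

/* Hypothetical per-row header for a write-once table: no update or
 * delete tracking, and no per-row inserting transaction ID. */
typedef struct WormTupleHeader {
    uint16_t attr_info;     /* attribute count and flags */
    uint8_t  header_len;    /* offset to user data */
} WormTupleHeader;

/* Hypothetical per-table metadata: one inserting XID covers every row,
 * because all rows arrived in the same multi-insert. */
typedef struct WormTable {
    uint32_t insert_xid;
    bool     insert_committed;
} WormTable;

/* Simplified visibility: either every row is visible to the snapshot or
 * none is, since all rows share a single inserting transaction. */
static bool worm_rows_visible(const WormTable *table, uint32_t snapshot_xmax)
{
    assert(table != NULL);
    return table->insert_committed && table->insert_xid < snapshot_xmax;
}

/* Bytes of header saved per row by the write-once layout. */
static size_t worm_header_savings(void)
{
    return sizeof(UpdatableTupleHeader) - sizeof(WormTupleHeader);
}
```

Visibility is still decided by comparing a transaction ID against a snapshot; the comparison just happens once per table instead of once per row.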


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
