Re: Extensible Rmgr for Table AMs

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Extensible Rmgr for Table AMs
Date: 2022-04-05 00:25:55
Message-ID: 3ac6b783ea57e1310c824e436ab693b304877cf8.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2022-04-04 at 11:15 +0530, Bharath Rupireddy wrote:
> 1) Why can't rmid be chosen by the extensions in sequential order
> from
> (129 till 255), say, to start with a columnar extension can choose
> 129, another extension can register 130 and so on right?

I'm not sure what you mean by "chosen by the extensions in sequential
order". If you mean:

(a) that it would depend on the library load order, and that postgres
would assign the ID; then that won't work because the rmgr IDs need to
be stable across restarts and config changes

(b) that there should be a convention where people reserve IDs on the
wiki page in sequential order; then that's fine with me

> 2) RM_MAX_ID is now UINT8_MAX - do we need to make it configurable?
> or
> do we think that only a few (127) custom rmgrs can exist?

127 should be plenty for quite a long time. If we need more, we can
come up with a different solution (which is OK because the WAL format
changes in new major versions).

> 3) Do we need to talk about extensible rmgrs and
> https://wiki.postgresql.org/wiki/ExtensibleRmgr in documentation
> somewhere? This will help extension developers to refer when needed.

We don't have user-facing documentation for every extension hook, and I
don't think it would be a good idea to document this one (at least in
the user-facing docs). Otherwise, we'd have to document the whole
resource manager interface, and that seems like way too much of the
internals.

> 4) Do we need to change this description?
> <para>
> The default value of this setting is the empty string, which
> disables
> the feature. It can be set to <literal>all</literal> to
> check all
> records, or to a comma-separated list of resource managers to
> check
> only records originating from those resource
> managers. Currently,
> the supported resource managers are <literal>heap</literal>,
> <literal>heap2</literal>, <literal>btree</literal>,
> <literal>hash</literal>,
> <literal>gin</literal>, <literal>gist</literal>,
> <literal>sequence</literal>,
> <literal>spgist</literal>, <literal>brin</literal>, and
> <literal>generic</literal>. Onl
> superusers can change this setting.

Yes, you're right, I updated the docs for pg_waldump and the
wal_consistency_checking GUC.

> 5) Do we need to add a PGDLLIMPORT qualifier to RegisterCustomRmgr()
> (possibly for RmgrStartup, RmgrTable, RmgrCleanup as well?) so that
> the Windows versions of extensions can use this feature?

That seems to only be required for variables, not functions.

I don't think we want to mark RmgrTable this way, because it's not
intended to be accessed by extensions directly. It's accessed by
GetRmgr(), which is 'static inline', but that's also not intended to be
used by extensions.

> 6) Should the below strcmp pg_strcmp? The custom rmgrs can still use
> rm_name with different cases and below code fail to catch it?
> + if (!strcmp(existing_rmgr->rm_name, rmgr->rm_name))
> + ereport(PANIC,
> + (errmsg("custom rmgr \"%s\" has the same name as builtin rmgr",
> + existing_rmgr->rm_name)));

There are already a lot of places where users can choose identifiers
that differ only in uppercase/lowercase. I don't see a reason to make
an exception here.

> 7) What's the intention of the below code? It seems like below it
> checks if there's already a rmgr with the given name (RM_MAX_ID). Had
> it been RM_MAX_BUILTIN_ID instead of RM_MAX_ID, the error message
> does make sense. Is the intention here to not have duplicate rmgrs in
> the entire RM_MAX_ID(both builtin and custom)?

Thank you. I redid a lot of the error messages and checked them out to
make sure they are helpful.

> 8) Per https://wiki.postgresql.org/wiki/ExtensibleRmgr, 128 is
> reserved, can we have it as a macro? Or something like
> (RM_MAX_ID/2+1)

It's already defined as RM_EXPERIMENTAL_ID. Is that what you meant or
did I misunderstand?

I don't see the point in defining it as RM_MAX_ID/2+1.

> 9) Thinking if there's a way to test the code, maybe exposing
> RegisterCustomRmgr as a function? I think we need to have a dummy
> extension that will be used for testing this kind of patches/features
> but that's a separate discussion IMO.

It's already exposed as a C function in xlog_internal.h.

You mean as a SQL function? I don't think that makes sense. It can only
be called while processing shared_preload_libraries (on server startup)
anyway.

Other changes this version:

* implemented Andres's proposed performance change[1]
* fix wal_consistency_checking
* general cleanup

Regards,
Jeff Davis

[1]
https://postgr.es/m/20220404043337.ocjnni7hknjsibhg@alap3.anarazel.de

Attachment Content-Type Size
v7-0001-Extensible-rmgr.patch text/x-patch 22.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-04-05 00:39:58 Re: Run pg_amcheck in 002_pg_upgrade.pl and 027_stream_regress.pl?
Previous Message Noah Misch 2022-04-05 00:21:07 Re: Skipping logical replication transactions on subscriber side