Re: generate syscache info automatically

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: generate syscache info automatically
Date: 2023-06-15 07:45:22
Message-ID: CAFBsxsHMaZ9yR7p=JpzdC2ynKwkDb1PrEr=Q5M__6cmhbbt2iQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 31, 2023 at 4:58 AM Peter Eisentraut <peter(at)eisentraut(dot)org>
wrote:
>
> I want to report on my on-the-plane-to-PGCon project.
>
> The idea was mentioned in [0]. genbki.pl already knows everything about
> system catalog indexes. If we add a "please also make a syscache for
> this one" flag to the catalog metadata, we can have genbki.pl produce
> the tables in syscache.c and syscache.h automatically.
>
> Aside from avoiding the cumbersome editing of those tables, I think this
> layout is also conceptually cleaner, as you can more easily see which
> system catalog indexes have syscaches and maybe ask questions about why
> or why not.

When this has come up before, one objection was that index declarations
shouldn't know about cache names and bucket sizes [1]. The second paragraph
quoted above makes a reasonable case for putting them there, however. I
believe one alternative idea was for a script to read the enum, which would
look something like this:

#define DECLARE_SYSCACHE(cacheid,indexname,numbuckets) cacheid

enum SysCacheIdentifier
{
	DECLARE_SYSCACHE(AGGFNOID, pg_aggregate_fnoid_index, 16) = 0,
	...
};

...which would then look up the other info in the usual way from Catalog.pm.
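
To make that alternative a bit more concrete, here is a rough sketch (not
taken from any patch) of the two halves of the idea: the macro lets the
compiler see only the cache id, while a tool scanning the source text can
still recover the index name and bucket count from the same line. The
SysCacheDecl struct and parse_syscache_decl() are hypothetical names
invented for illustration, and a real script would presumably be written in
Perl alongside Catalog.pm rather than in C.

```c
#include <stdio.h>
#include <string.h>

/* Macro from the message: the compiler keeps only the cache id. */
#define DECLARE_SYSCACHE(cacheid, indexname, numbuckets) cacheid

enum SysCacheIdentifier
{
	DECLARE_SYSCACHE(AGGFNOID, pg_aggregate_fnoid_index, 16) = 0,
};

/* Hypothetical record a scanning tool might build per declaration. */
typedef struct SysCacheDecl
{
	char		cacheid[64];
	char		indexname[64];
	int			numbuckets;
} SysCacheDecl;

/*
 * Hypothetical helper: parse one source line of the form
 *     DECLARE_SYSCACHE(ID, index_name, N)
 * Returns 1 and fills *out on success; returns 0 if the line does not
 * contain a complete declaration (the #define line itself is rejected
 * because its third argument is not a number).
 */
int
parse_syscache_decl(const char *line, SysCacheDecl *out)
{
	const char *p = strstr(line, "DECLARE_SYSCACHE(");

	if (p == NULL)
		return 0;
	p += strlen("DECLARE_SYSCACHE(");

	/* identifier , identifier , integer */
	return sscanf(p, " %63[A-Za-z0-9_] , %63[A-Za-z0-9_] , %d",
				  out->cacheid, out->indexname,
				  &out->numbuckets) == 3;
}
```

With the index name in hand, the remaining details (catalog, key columns)
would come from the existing catalog metadata, as described above.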

> As a possible follow-up, I have also started work on generating the
> ObjectProperty structure in objectaddress.c. One of the things you need
> for that is making genbki.pl aware of the syscache information. There
> is some more work to be done there, but it's looking promising.

I haven't studied this, but it seems interesting.

One other possible improvement: syscache.c has a bunch of #include's, one
for each catalog with a cache, so there's still a bit of manual work in
adding a cache, and the current #include list is a bit cumbersome. Perhaps
it's worth it to have the script emit them as well?

I also wonder if at some point it will make sense to split off one or more
separate scripts for things that are unrelated to the bootstrap data.
genbki.pl is getting pretty large, and there are additional things that
could be done with syscaches, e.g. inlined eq/hash functions for cache
lookup [2].

[1] https://www.postgresql.org/message-id/12460.1570734874@sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/20210831205906.4wk3s4lvgzkdaqpi%40alap3.anarazel.de

--
John Naylor
EDB: http://www.enterprisedb.com
