Re: generate syscache info automatically

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: generate syscache info automatically
Date: 2023-08-31 11:23:00
Message-ID: 9aea7c02-ca6f-4cf0-0e5a-bcc0fd959b7c@eisentraut.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I have committed 0002 and 0003, and also a small bug fix in the
ObjectProperty entry for "transforms".

I have also gotten the automatic generation of the ObjectProperty lookup
table working (with some warts).

Attached is an updated patch set.

One win here is that the ObjectProperty lookup now uses a hash function
instead of a linear search. So hopefully the performance is better (to
be confirmed) and it could now be used for things like [0].

[0]:
https://www.postgresql.org/message-id/flat/12f1b1d2-f8cf-c4a2-72ec-441bd79546cb(at)enterprisedb(dot)com

There was some discussion about whether the catalog files are the right
place to keep syscache information. Personally, I would find that more
convenient. (Note that we also recently moved the definitions of
indexes and toast tables from files with the whole list into the various
catalog files.) But we can also keep it somewhere else. The important
thing is that Catalog.pm can find it somewhere in a structured form.

To finish up the ObjectProperty generation, we also need to store some
more data, namely the OBJECT_* mappings. We probably do not want to
store those in the catalog headers; that looks like a layering violation
to me.

So, there is some discussion to be had about where to put all this
across various use cases.

On 24.08.23 16:03, Peter Eisentraut wrote:
> On 03.07.23 07:45, Peter Eisentraut wrote:
>> Here is an updated patch set that adjusts for the recently introduced
>> named captures.  I will address the other issues later.  I think the
>> first few patches in the series can be considered nonetheless.
>
> I have committed the 0001 patch, which was really a (code comment) bug fix.
>
> I think the patches 0002 and 0003 should be uncontroversial, so I'd like
> to commit them if no one objects.
>
> The remaining patches are still WIP.
>
>
>> On 15.06.23 09:45, John Naylor wrote:
>>> On Wed, May 31, 2023 at 4:58 AM Peter Eisentraut
>>> <peter(at)eisentraut(dot)org <mailto:peter(at)eisentraut(dot)org>> wrote:
>>>  >
>>>  > I want to report on my on-the-plane-to-PGCon project.
>>>  >
>>>  > The idea was mentioned in [0]. genbki.pl <http://genbki.pl>
>>> already knows everything about
>>>  > system catalog indexes.  If we add a "please also make a syscache for
>>>  > this one" flag to the catalog metadata, we can have genbki.pl
>>> <http://genbki.pl> produce
>>>  > the tables in syscache.c and syscache.h automatically.
>>>  >
>>>  > Aside from avoiding the cumbersome editing of those tables, I
>>> think this
>>>  > layout is also conceptually cleaner, as you can more easily see which
>>>  > system catalog indexes have syscaches and maybe ask questions
>>> about why
>>>  > or why not.
>>>
>>> When this has come up before, one objection was that index
>>> declarations shouldn't know about cache names and bucket sizes [1].
>>> The second paragraph above makes a reasonable case for that, however.
>>> I believe one alternative idea was for a script to read the enum,
>>> which would look something like this:
>>>
>>> #define DECLARE_SYSCACHE(cacheid,indexname,numbuckets) cacheid
>>>
>>> enum SysCacheIdentifier
>>> {
>>> DECLARE_SYSCACHE(AGGFNOID, pg_aggregate_fnoid_index, 16) = 0,
>>> ...
>>> };
>>>
>>> ...which would then look up the other info in the usual way from
>>> Catalog.pm.
>>>
>>>  > As a possible follow-up, I have also started work on generating the
>>>  > ObjectProperty structure in objectaddress.c.  One of the things
>>> you need
>>>  > for that is making genbki.pl <http://genbki.pl> aware of the
>>> syscache information.  There
>>>  > is some more work to be done there, but it's looking promising.
>>>
>>> I haven't studied this, but it seems interesting.
>>>
>>> One other possible improvement: syscache.c has a bunch of #include's,
>>> one for each catalog with a cache, so there's still a bit of manual
>>> work in adding a cache, and the current #include list is a bit
>>> cumbersome. Perhaps it's worth it to have the script emit them as well?
>>>
>>> I also wonder if at some point it will make sense to split off a
>>> separate script(s) for some things that are unrelated to the
>>> bootstrap data. genbki.pl <http://genbki.pl> is getting pretty large,
>>> and there are additional things that could be done with syscaches,
>>> e.g. inlined eq/hash functions for cache lookup [2].
>>>
>>> [1]
>>> https://www.postgresql.org/message-id/12460.1570734874@sss.pgh.pa.us
>>> <https://www.postgresql.org/message-id/12460.1570734874@sss.pgh.pa.us>
>>> [2]
>>> https://www.postgresql.org/message-id/20210831205906.4wk3s4lvgzkdaqpi%40alap3.anarazel.de <https://www.postgresql.org/message-id/20210831205906.4wk3s4lvgzkdaqpi%40alap3.anarazel.de>
>>>
>>> --
>>> John Naylor
>>> EDB: http://www.enterprisedb.com <http://www.enterprisedb.com>
>
>
>

Attachment Content-Type Size
v3-0001-Fill-in-more-of-ObjectProperty.patch text/plain 2.5 KB
v3-0002-Generate-syscache-info-from-catalog-files.patch text/plain 73.9 KB
v3-0003-Generate-ObjectProperty-from-catalog-files.patch text/plain 14.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksander Alekseev 2023-08-31 11:34:26 Re: Commitfest 2023-09 starts soon
Previous Message Aleksander Alekseev 2023-08-31 11:18:44 Re: Commitfest 2023-09 starts soon