Re: generate syscache info automatically

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Subject: Re: generate syscache info automatically
Date: 2023-11-01 21:13:03
Message-ID: ec503bf6-e2c0-4dff-8261-b3f6e5922c46@eisentraut.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here is a rebased patch set, along with a summary of the questions I
have about these patches:

v4-0001-Generate-syscache-info-from-catalog-files.patch

What's a good syntax to declare a syscache? Currently, I have, for example

-DECLARE_UNIQUE_INDEX_PKEY(pg_type_oid_index, 2703, TypeOidIndexId,
pg_type, btree(oid oid_ops));
+DECLARE_UNIQUE_INDEX_PKEY(pg_type_oid_index, 2703, TypeOidIndexId,
pg_type, btree(oid oid_ops), TYPEOID, 64);

I suppose a sensible alternative could be to leave the
DECLARE_..._INDEX... alone and make a separate statement, like

MAKE_SYSCACHE(pg_type_oid_index, TYPEOID, 64);

That's at least visually easier, because some of those
DECLARE_... lines are getting pretty long.

I would like to keep those MAKE_SYSCACHE lines in the catalog files.
That just makes it easier to reference everything together.

v4-0002-Generate-ObjectProperty-from-catalog-files.patch

Several questions here:

* What's a good way to declare the mapping between catalog and object
type?

* How to select which catalogs have ObjectProperty entries generated?

* Where to declare class descriptions? Or just keep the current hack in
the patch.

* How to declare the purpose of a catalog column, like "this is the ACL
column for this catalog". This is currently done by name, but maybe
it should be more explicit.

* Question about how to pick the correct value for is_nsp_name_unique?

This second patch is clearly still WIP. I hope the first patch can be
finished soon, however.

I was also amused to find the original commit fa351d5a0db that
introduced ObjectProperty, which contains the following comment:

Performance isn't critical here, so there's no need to be smart
about how we do the search.

which I'm now trying to amend.

On 11.09.23 07:02, John Naylor wrote:
>
> On Thu, Aug 31, 2023 at 6:23 PM Peter Eisentraut <peter(at)eisentraut(dot)org
> <mailto:peter(at)eisentraut(dot)org>> wrote:
>
> > Attached is an updated patch set.
>
> > There was some discussion about whether the catalog files are the right
> > place to keep syscache information.  Personally, I would find that more
> > convenient.  (Note that we also recently moved the definitions of
> > indexes and toast tables from files with the whole list into the various
> > catalog files.)  But we can also keep it somewhere else.  The important
> > thing is that Catalog.pm can find it somewhere in a structured form.
>
> I don't have anything further to say on how to fit it together, but I'll
> go ahead share some other comments from a quick read of v3-0003:
>
> + # XXX hardcoded exceptions
> + # extension doesn't belong to extnamespace
> + $attnum_namespace = undef if $catname eq 'pg_extension';
> + # pg_database owner is spelled datdba
> + $attnum_owner = "Anum_pg_database_datdba" if $catname eq 'pg_database';
>
> These exceptions seem like true exceptions...
>
> + # XXX?
> + $name_syscache = "SUBSCRIPTIONNAME" if $catname eq 'pg_subscription';
> + # XXX?
> + $is_nsp_name_unique = 1 if $catname eq 'pg_collation';
> + $is_nsp_name_unique = 1 if $catname eq 'pg_opclass';
> + $is_nsp_name_unique = 1 if $catname eq 'pg_opfamily';
> + $is_nsp_name_unique = 1 if $catname eq 'pg_subscription';
>
> ... but as the XXX conveys, these indicate a failure to do the right
> thing. Trying to derive this info from the index declarations is
> currently fragile. E.g. making one small change to this regex:
>
> -               if ($index->{index_decl} =~ /\(\w+name name_ops(,
> \w+namespace oid_ops)?\)/)
> +               if ($index->{index_decl} =~ /\w+name name_ops(,
> \w+namespace oid_ops)?\)/)
>
> ...causes "is_nsp_name_unique" to flip from false to true, and/or sets
> "name_catcache_id" where it wasn't before, for several entries. It's not
> quite clear what the "rule" is intended to be, or whether it's robust
> going forward.
>
> That being the case, perhaps some effort should also be made to make it
> easy to verify the output matches HEAD (or rather, v3-0001). This is now
> difficult because of the C99 ".foo = bar" syntax, plus the alphabetical
> ordering.
>
> +foreach my $catname (@catnames)
> +{
> + print $objectproperty_info_fh qq{#include "catalog/${catname}_d.h"\n};
> +}
>
> Assuming we have a better way to know which catalogs need
> object properties, is there a todo to only #include those?
>
> > To finish up the ObjectProperty generation, we also need to store some
> > more data, namely the OBJECT_* mappings.  We probably do not want to
> > store those in the catalog headers; that looks like a layering violation
> > to me.
>
> Possibly, but it's also convenient and, one could argue, more
> straightforward than storing syscache id and num_buckets in the index info.
>
> One last thing: I did try to make the hash function use uint16 for the
> key (to reduce loop iterations on nul bytes), and that didn't help, so
> we are left with the ideas I mentioned earlier.
>
> (not changing CF status, because nothing specific is really required to
> change at the moment, some things up in the air)
>
> --
> John Naylor
> EDB: http://www.enterprisedb.com <http://www.enterprisedb.com>

Attachment Content-Type Size
v4-0001-Generate-syscache-info-from-catalog-files.patch text/plain 74.0 KB
v4-0002-Generate-ObjectProperty-from-catalog-files.patch text/plain 14.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Morris 2023-11-01 21:15:20 Re: Atomic ops for unlogged LSN
Previous Message Laurenz Albe 2023-11-01 20:46:52 Re: GUC names in messages