Re: System views for versions reporting

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: System views for versions reporting
Date: 2025-11-19 20:57:12
Message-ID: a7d6f24f338149a0abfe058d7f5be5bbf4a7cd67.camel@cybertec.at
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 2025-11-16 at 17:45 +0100, Dmitry Dolgov wrote:
> Better late than never, rebased and dropped 0004.

I think that having a system view like that would be quite useful.
It would be even more useful if it included other libraries that are
optionally linked with PostgreSQL, like OpenSSL, OpenLDAP, lz4,
zstd, GSSAPI etc.

Building the server worked fine on my Linux box (but see my comments
about Glibc below), and the view behaves like it should.

Building the documentation fails:
element indexterm: validity error : IDREFS attribute zone references an unknown ID "view-pg-system-version"
The problem is here:

> --- a/doc/src/sgml/system-views.sgml
> +++ b/doc/src/sgml/system-views.sgml
> + <sect1 id="view-pg-system-versions">
> + <title><structname>pg_system_versions</structname></title>
> +
> + <indexterm zone="view-pg-system-version">
> + <primary>pg_system_versions</primary>
> + </indexterm>

"view-pg-system-version" is missing the "s" at the end.

Comments on the code:
=====================

Patch 0001:
-----------

> --- a/doc/src/sgml/system-views.sgml
> +++ b/doc/src/sgml/system-views.sgml
> + <row>
> + <entry><link linkend="view-pg-system-versions"><structname>pg_system_versions</structname></link></entry>
> + <entry>system versions</entry>
> + </row>
> +
> </tbody>
> </tgroup>
> </table>

That list is alphabetically sorted by function name. You should add the function at the
correct place, not at the end.

The same applies to the next hunk. It gets rendered in its own page, so the order is
not user-visible, but keeping these sections in the same alphabetical order will ease
maintenance.

> + <sect1 id="view-pg-system-versions">
> + <title><structname>pg_system_versions</structname></title>
> +
> + <indexterm zone="view-pg-system-version">
> + <primary>pg_system_versions</primary>
> + </indexterm>
> +
> + <para>
> + The view <structname>pg_system_versions</structname> provides description
> + about versions of various system components, e.g. PostgreSQL itself,
> + compiler used to build it, dependencies, etc.
> + </para>
> +
> [...]
> + <row>
> + <entry role="catalog_table_entry"><para role="column_definition">
> + <structfield>type</structfield> <type>text</type>
> + </para>
> + <para>
> + Component type (compile time or Run time)
> + </para></entry>
> + </row>

I'm not happy with the English in "The view ... provides description about ...",
but all the other views use the same wording, so we should stick with it.
But in "description about versions of various system components", there should
be a "the" before "versions". It should also be "the compiler". Since you start
the list with "e.g.", I think the "etc." is superfluous.

In "compile time or Run time", either use upper case everywhere (except in "or")
or lower case everywhere. I suggest to use upper case like in the actual view
output.

> --- /dev/null
> +++ b/src/backend/utils/misc/system_version.c
> +void
> +add_system_version(const char *name, SystemVersionCB cb, VersionType type)
> +{
> + SystemVersion *hentry;
> + const char *key;
> + bool found;
> +
> + if (!versions)
> + {
> + HASHCTL ctl;
> +
> + ctl.keysize = NAMEDATALEN;
> + ctl.entrysize = sizeof(SystemVersion);
> + ctl.hcxt = CurrentMemoryContext;

That hashtable should stick around for the entire life time of the backend.
Wouldn't it be better to explicitly name the correct memory context, rather
that to rely on being called with "CurrentMemoryContext" set correctly?

> +
> + versions = hash_create("System versions table",
> + MAX_SYSTEM_VERSIONS,
> + &ctl,
> + HASH_ELEM | HASH_STRINGS);
> + }
> +
> + key = pstrdup(name);
> + hentry = (SystemVersion *) hash_search(versions, key,
> + HASH_ENTER, &found);

I think you should at least add an Assert() that makes sure that the key
is not longer than NAMEDATALEN.

> +
> + if (found)
> + elog(ERROR, "duplicated system version");

Is that a problem? Why throw an error?

> +Datum
> +pg_get_system_versions(PG_FUNCTION_ARGS)
> +{
> [...]
> + while ((hentry = (SystemVersion *) hash_seq_search(&status)) != NULL)
> + {
> + Datum values[PG_GET_SYS_VERSIONS_COLS] = {0};
> + bool nulls[PG_GET_SYS_VERSIONS_COLS] = {0};
> + bool available = false;
> + const char *version = hentry->callback(&available);
> +
> + if (!available)
> + continue;
> +
> + values[0] = CStringGetTextDatum(hentry->name);
> + values[1] = CStringGetTextDatum(version);
> + values[2] = hentry->type;
> +
> + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls);
> + }

I think the assignment for the third column should be

values[2] = Int32GetDatum((int32) hentry->type);

since hentry->type is an enum type and hence an "int".

Patch 0002:
-----------

> --- a/src/backend/utils/misc/system_version.c
> +++ b/src/backend/utils/misc/system_version.c
> +/*
> + * Register versions that describe core components and do not correspond to any
> + * individual component.
> + */
> +void
> +register_core_versions()
> +{
> + add_system_version("Core", core_get_version, CompileTime);
> + add_system_version("Arch", core_get_arch, CompileTime);
> + add_system_version("Compiler", core_get_compiler, CompileTime);
> + add_system_version("ICU", icu_get_version, RunTime);
> + add_system_version("Glibc", glibc_get_version, RunTime);
> +}
> [...]
> +const char *
> +glibc_get_version(bool *available)
> +{
> + *available = true;
> + return (const char *) gnu_get_libc_version();
> +}

Not all operating systems use Glibc, not even all Linux distributions.
If you look at the commitfest entry, you'll see the the CI build fails on
most platforms.

At the very least, this calls for conditional compilation.
Really, there should be an add_system_version() call for all types of
C libraries that exist out there, but that might border on the impossible.

> --- a/src/include/utils/system_version.h
> +++ b/src/include/utils/system_version.h
> @@ -11,6 +11,10 @@
> #ifndef SYSTEM_VERSION_H
> #define SYSTEM_VERSION_H
>
> +#ifdef __GLIBC__
> +#include <gnu/libc-version.h>
> +#endif
> +

That should be included where glibc_get_version() is defined, not here.

> #define MAX_SYSTEM_VERSIONS 100
>
> typedef enum VersionType
> @@ -34,5 +38,13 @@ typedef struct SystemVersion
> } SystemVersion;
>
> void add_system_version(const char *name, SystemVersionCB cb, VersionType type);
> +extern void register_core_versions(void);
> +
> +const char *core_get_version(bool *available);
> +const char *core_get_arch(bool *available);
> +const char *core_get_compiler(bool *available);
> +
> +const char *icu_get_version(bool *available);
> +const char *glibc_get_version(bool *available);
>
> #endif /* SYSTEM_VERSION_H */

I think that these functions had better be "static". Do you expect a need to
call them from other parts of the code?

> --- a/src/test/regress/sql/sysviews.sql
> +++ b/src/test/regress/sql/sysviews.sql
> @@ -101,3 +101,8 @@ select count(distinct utc_offset) >= 24 as ok from pg_timezone_abbrevs;
> -- One specific case we can check without much fear of breakage
> -- is the historical local-mean-time value used for America/Los_Angeles.
> select * from pg_timezone_abbrevs where abbrev = 'LMT';
> +
> +-- 5 core versions should be present: architecture, ICU, core, compiler and
> +-- glibc. If built with JIT support, one more record will be displayed
> +-- containing LLVM version.
> +select count(*) >= 5 as ok FROM pg_system_versions;

If you really think that a regression test for selecting from the view
is useful, use "SELECT EXISTS (TABLE pg_system_versions)".
Are you sure that the regression tests will always run with ICU enabled?

Patch 0003:
-----------

> --- a/src/backend/jit/jit.c
> +++ b/src/backend/jit/jit.c
> +
> +/*
> + * Return JIT provider's version string for troubleshooting purposes.
> + */
> +const char *
> +jit_get_version(bool *available)
> +{
> + if (provider_init())
> + return provider.get_version(available);
> +
> + *available = false;
> + return "";
> +}

I think it would be better to do do the "available" dance here rather than
delegating it to get_version(). get_version() can simply return NULL if there
is no version available.

A more meaningful function comment would be "Callback for add_system_version()".

> +void
> +jit_register_version(void)
> +{
> + add_system_version("LLVM", jit_get_version, RunTime);
> +}

I think that should be in utils/misc/system_version.c.
Then you don't need to #include "utils/system_version.h" in the JIT code.

> --- a/src/include/jit/jit.h
> +++ b/src/include/jit/jit.h
> +/*
> + * Get the provider's version string. The flag indicating availability is
> + * passed as an argument, and will be set accordingly if it's not possible to
> + * get the version.
> + */
> +extern const char *jit_get_version(bool *available);

Again, I think "Callback for add_system_version()" would be a more useful
comment. I have no trouble guessing that the function will return the JIT
version, and if you know add_system_version(), you know what "available" is.

Yours,
Laurenz Albe

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2025-11-19 21:01:03 Re: Speed up COPY FROM text/CSV parsing using SIMD
Previous Message Thomas Munro 2025-11-19 20:49:14 Re: PRI?64 vs Visual Studio (2022)