Re: warn if GUC set to an invalid shared library

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: warn if GUC set to an invalid shared library
Date: 2022-01-27 22:01:03
Message-ID: 20220127220103.GY23027@telsasoft.com
Lists: pgsql-hackers

On Sun, Jan 09, 2022 at 11:58:18AM -0800, Maciek Sakrejda wrote:
> On Sat, Jan 8, 2022 at 2:07 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > Unfortunately, the output for dlopen() is not portable, which (I think) means
> > most of what I wrote can't be made to work. Since it doesn't work to call
> > dlopen() when using SET, I tried using just stat(). But that also fails on
> > Windows, since one of the regression tests has an invalid filename involving
> > unbalanced quotes, which cause it to return EINVAL rather than ENOENT. So SET
> > cannot warn portably, unless it includes no details at all (or we specially
> > handle the Windows case), or we change the pre-existing regression test. But
> > there's a second instability, too, apparently having to do with timing. So I'm
> > planning to drop the 0001 patch.
>
> Hmm. I think 001 is a big part of the usability improvement here.

I agree - it helps people avoid causing a disruption, rather than just helping
them to fix it faster.

> Could we not at least warn generically, without relaying the
> underlying error? The notable thing in this situation is that the
> specified library could not be loaded (and that it will almost
> certainly cause problems on restart). The specific error would be nice
> to have, but it's less important. What is the timing instability?

I saw regression diffs like this, showing that the warning could be displayed
before or after the SELECT was echoed.

https://cirrus-ci.com/task/6301672321318912
-SELECT * FROM schema4.counted;
WARNING: could not load library: $libdir/plugins/worker_spi: cannot open shared object file: No such file or directory
+SELECT * FROM schema4.counted;

It's certainly possible to show a static message without additional text from
errno.
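
As a rough illustration of that idea (a standalone sketch, not code from the
attached patches): the check below uses only stat() and never looks at errno,
so it gives the same result whether the platform would report ENOENT or
EINVAL. The helper names and the message wording are hypothetical, and a real
version would go through ereport() rather than snprintf().

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not from the patch): report whether a path can be
 * stat()ed at all.  errno is deliberately ignored, because its value for a
 * bad filename differs between platforms (ENOENT vs. EINVAL).
 */
static bool
library_file_exists(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0;
}

/*
 * Hypothetical helper: build a static warning with no errno-derived text,
 * so the message is identical on every platform.  The wording here is
 * illustrative only.  Returns the value of snprintf().
 */
static int
format_library_warning(char *buf, size_t buflen, const char *path)
{
    return snprintf(buf, buflen,
                    "WARNING:  could not access file \"%s\"", path);
}
```

A caller would run library_file_exists() on the expanded path when the GUC is
set and, only if it returns false, emit the string built by
format_library_warning(). The cost is that the user no longer sees why the
file could not be accessed.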

On Tue, Dec 28, 2021 at 9:45 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > > For whatever reason, I get slightly different (and somewhat redundant)
> > > output on failing to start:
> > >
> > > 2022-01-08 12:59:36.784 PST [324482] WARNING: could not load library: $libdir/plugins/totally bogus: cannot open shared object file: No such file or directory
> > > 2022-01-08 12:59:36.787 PST [324482] FATAL: could not load library: totally bogus: cannot open shared object file: No such file or directory
> > > 2022-01-08 12:59:36.787 PST [324482] LOG: database system is shut down
> >
> > I think the first WARNING is from the GUC mechanism "setting" the library.
> > And then the FATAL is from trying to apply the GUC.
> > It looks like you didn't apply the 0002 patch for that test so got no CONTEXT ?
>
> I still had the terminal open where I tested this, and the scrollback
> did show me applying the patch (and building after). I tried a make
> clean and applying the patch again, and I do see the CONTEXT line now.
> I'm not sure what the problem was but seems like PEBKAC--sorry about
> that.

Maybe you missed "make install" or similar.

I took the liberty of adding you as a reviewer here:
https://commitfest.postgresql.org/36/3482/

--
Justin

Attachment Content-Type Size
v3-0001-errcontext-if-server-fails-to-start-due-to-librar.patch text/x-diff 4.2 KB
v3-0002-warn-when-setting-GUC-to-a-nonextant-library.patch text/x-diff 8.0 KB
