Re: Operation log for major operations

From: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>
To: Kirk Wolak <wolakk(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Operation log for major operations
Date: 2023-03-13 21:36:14
Message-ID: b7a1dee3-a4f2-b8d6-a020-a0584c40902f@postgrespro.ru
Lists: pgsql-hackers

Kirk, I'm sorry about the long pause in my reply.

>We need some kind of semaphore flag that tells us something awkward
>happened. When it happened, and a little bit of extra information.

I agree that we do not have this kind of information.
Additionally, legitimate events such as the start of pg_rewind, pg_reset, ...
are also of interest.

>You also make the point that if such things have happened, it would
>probably be a good idea to NOT allow pg_upgrade to run.
>It might even be a reason to constantly bother someone until
>the issue is repaired.

I think there is no reason to forbid the user from running pg_upgrade
(especially in automatic mode).
If we automatically do NOT allow pg_upgrade, what should the user do to
get pg_upgrade allowed again?
Unfortunately, PostgreSQL does not have utilities for correcting errors
in the database (when errors occur, users fall back on copies of the DB
or correct the errors manually).
An ordinary user cannot correct errors on their own ...
So we cannot REQUIRE the user to correct database errors; we can only
INFORM about them.

>To that point, this feels like a "postgresql_panic.log" file (within
>the postgresql files?)... Something that would prevent pg_upgrade,
>etc. That everyone recognizes is serious. Especially 3rd party vendors.
>I see the need for such a thing. I have to agree with others about
>questioning the proper place to write this.
>Are there other places that make sense, that you could use, especially
>if knowing it exists means there was a serious issue?

The location of the operation log (like a "postgresql_panic.log") is not
an easy question.
Our technical support is sure that a large number of failures are
caused by the "human factor" (actions of database administrators).
It is not difficult for a database administrator to delete the
"postgresql_panic.log" file or edit it (for example, by replacing it with
an old version; a CRC will not save you from such an action).
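
To illustrate why a CRC alone does not help against replacing the file
with an old version, here is a minimal sketch. It is not taken from the
patch: the helper names `seal` and `is_valid`, the 4-byte CRC prefix, and
the sample log entries are all made up for the example, and it uses
zlib's CRC-32 only because it is readily available.

```python
import zlib

def seal(entries: bytes) -> bytes:
    """Prefix the log body with its CRC-32 (hypothetical layout)."""
    return zlib.crc32(entries).to_bytes(4, "little") + entries

def is_valid(blob: bytes) -> bool:
    """Recompute the CRC over the body and compare with the stored one."""
    stored = int.from_bytes(blob[:4], "little")
    return zlib.crc32(blob[4:]) == stored

# Made-up log contents, purely for illustration.
old_log = seal(b"pg_upgrade\n")
new_log = seal(b"pg_upgrade\npg_resetwal\n")

# Editing the current log in place breaks the checksum...
tampered = new_log[:-2] + b"X\n"

# ...but an older copy is internally consistent, so it still validates.
print(is_valid(new_log), is_valid(tampered), is_valid(old_log))
```

The checksum only proves that the bytes are self-consistent, not that
they are the most recent ones, which is the weakness of a standalone log
file described above.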

Therefore, our technical support decided to place the operation log at
the end of the pg_control file, at an offset of 8192 bytes (and to protect
this log with a CRC).
As for writing to the pg_control file, which is what worries Tom Lane:
information in pg_control is written once at system startup (twice in the
case of "promote").
Some utilities also write information to the operation log:
pg_resetwal, pg_rewind, pg_upgrade (these utilities modify the
pg_control file even without the operation log).
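
For readers who want a concrete picture, the idea of a CRC-protected
region at a fixed offset can be sketched roughly as follows. This is only
an illustration under stated assumptions, not the format used by the
patch: the header layout (4-byte CRC, 4-byte length), the function names,
and the use of zlib's CRC-32 (PostgreSQL itself uses CRC-32C) are all
hypothetical. Only the 8192-byte offset comes from this message.

```python
import struct
import zlib

OPLOG_OFFSET = 8192  # offset mentioned in this message

# Hypothetical header: CRC-32 of the payload, then the payload length.
HEADER = struct.Struct("<II")

def write_oplog(path: str, payload: bytes) -> None:
    """Store the payload and its CRC at OPLOG_OFFSET in an existing file.

    Bytes before OPLOG_OFFSET are left untouched.
    """
    crc = zlib.crc32(payload)
    with open(path, "r+b") as f:
        f.seek(OPLOG_OFFSET)
        f.write(HEADER.pack(crc, len(payload)))
        f.write(payload)

def read_oplog(path: str) -> bytes:
    """Return the payload, or raise ValueError if it is missing or damaged."""
    with open(path, "rb") as f:
        f.seek(OPLOG_OFFSET)
        header = f.read(HEADER.size)
        if len(header) < HEADER.size:
            raise ValueError("no operation log present")
        crc, length = HEADER.unpack(header)
        payload = f.read(length)
    if len(payload) != length or zlib.crc32(payload) != crc:
        raise ValueError("operation log CRC mismatch")
    return payload
```

A caller would invoke `write_oplog(path, data)` when an operation such as
a rewind starts, and `read_oplog(path)` to check that the stored log is
still intact.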

If you are interested, I can attach the current patch (FYI, I think it
makes no sense to submit this patch to a commitfest).

--
With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com
