Re: Tracking cluster upgrade and configuration history

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tracking cluster upgrade and configuration history
Date: 2020-11-16 19:01:50
Message-ID: C575ECBF-B5C6-4863-BA43-4D6F2DEF4142@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Nov 15, 2020, at 10:47 PM, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Thu, Nov 12, 2020 at 4:01 AM Mark Dilger
> <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>>
>> While supporting customers, it would frequently be useful to have more information about the history of a cluster. For example, which prior versions were ever installed and running on the cluster? Has the cluster ever been run with fsync=off? When did the server last enter recovery, if ever? Was a backup_label file present at that time?
>>
>
> +1 for the idea. The information will be useful at times for debugging purposes.

Thanks for the feedback.

>
>>
>> Some of this type of information could strictly be fixed size, such as a fixed set of timestamps for the time at which a fixed set of things last occurred, or a fixed set of bits indicating whether a fixed set of things ever happened.
>>
>> Some other types would be variable size, but hopefully short in practice, like a list of all postgres versions that have ever been run on the cluster.
>>
>> Logging the information via the usual log mechanism seems insufficient, as log files may get rotated and this information lost.
>>
>
> True. Just a thought, can we use existing logging mechanism and APIs
> to write to a new file that never gets rotated by the syslogger(Of
> course, we need to think of the maximum file size that's allowed)? The
> idea is like this: we use elog/ereport and so on with a new debug
> level, when specified, instead of logging into the standard log files,
> we log it to the new file.

That's going in a very different direction from what I had in mind. I was imagining something like a single binary or text file of either fixed size or something very short but not fixed. The "very short but not fixed" part of that seems a bit too hand-waving on reflection. Any variable length list, such as the list of postgres versions started on the cluster, could be made fixed length by only tracking the most recent N of them, perhaps with a flag bit to indicate if the list has overflowed.

Using elog/ereport with a new log level that gets directed into a different log file is an interesting idea, but it is not clear how to use elog/ereport in a principled way to write files that need never get too large.

>> Would it be acceptable to store some fixed set of flag bits and timestamps in pg_control? Space there is at a premium.
>>
>
> Since we allocate ControlFileData in shared memory and also we may
> have some data with timestamps, variable texts and so on, having this
> included in pg_control data structure would not seem a good idea to
> me.

Variable length texts seem completely out of scope for this. I would expect the data to be a collection of integer types and flag bits. Fixed length text might also be possible, but I don't have any examples in mind of text that we'd want to track.

>> Would it make sense to alternately, or additionally, store some of this information in a flat text file in pg_data, say a new file named "cluster_history" or such?
>>
>
> IMHO, this is also a good idea. We need to think of the APIs to
> open/read/write/close that history file? How often and which processes
> and what type of data they write? Is it that the postmaster alone will
> write into that file? If multiple processes are allowed to write, how
> to deal with concurrent writers? Will users have to open manually and
> read that file? or Will we have some program similar to
> pg_controldata? Will we have some maximum limit to the size of this
> file?

This depends in part on feedback about which information others on this list would like to see included, but I was imagining something similar to how pg_control works, or using pg_control itself. The maximum size for pg_control is 512 bytes, and on my system sizeof(ControlFileData) = 296, which leaves 216 bytes free. I didn't check how much that might change on systems with different alignments. We could either use some of the ~200 bytes currently available in pg_control, or use another file, "pg_history" or such, following the design pattern already used for pg_control.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2020-11-16 19:02:08 Re: Tracking cluster upgrade and configuration history
Previous Message Simon Riggs 2020-11-16 18:36:51 Re: remove spurious CREATE INDEX CONCURRENTLY wait