Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Lukas Fittl <lukas(at)fittl(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)
Date: 2021-12-03 20:02:24
Message-ID: CAAKRu_aaq33UnG4TXq3S-OSXGWj1QGf0sU+ECH4tNwGFNERkZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks again! I really appreciate the thorough review.

I have combined responses to all three of your emails below.
Let me know if it is more confusing to do it this way.

On Wed, Dec 1, 2021 at 6:59 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> On Wed, Dec 01, 2021 at 05:00:14PM -0500, Melanie Plageman wrote:
> > > Also:
> > > src/include/miscadmin.h:#define BACKEND_NUM_TYPES (B_LOGGER + 1)
> > >
> > > I think it's wrong to say NUM_TYPES = B_LOGGER + 1 (which would suggest using
> > > lessthan-or-equal instead of lessthan as you are).
> > >
> > > Since the valid backend types start at 1 , the "count" of backend types is
> > > currently B_LOGGER (13) - not 14. I think you should remove the "+1" here.
> > > Then NROWS (if it continued to exist at all) wouldn't need to subtract one.
> >
> > I think what I currently have is technically correct because I start at
> > 1 when I am using it as a loop condition. I do waste a spot in the
> > arrays I allocate with BACKEND_NUM_TYPES size.
> >
> > I was hesitant to make the value of BACKEND_NUM_TYPES == B_LOGGER
> > because it seems kind of weird to have it have the same value as the
> > B_LOGGER enum.
>
> I don't mean to say that the code is misbehaving - I mean "num_x" means "the
> number of x's" - how many there are. Since the first, valid backend type is 1,
> and they're numbered consecutively and without duplicates, then "the number of
> backend types" is the same as the value of the last one (B_LOGGER). It's
> confusing if there's a macro called BACKEND_NUM_TYPES which is greater than the
> number of backend types.
>
> Most loops say for (int i=0; i<NUM; ++i)
> If it's 1-based, they say for (int i=1; i<=NUM; ++i)
> You have two different loops like:
>
> + for (int i = 0; i < BACKEND_NUM_TYPES - 1 ; i++)
> + for (int backend_type = 1; backend_type < BACKEND_NUM_TYPES; backend_type++)
>
> Both of these iterate over the correct number of backend types, but they both
> *look* wrong, which isn't desirable.

I've changed this and added comments wherever I could to make it clear.
Whenever the parameter was of type BackendType, I tried to use the
correct (not adjusted by subtracting 1) number and wherever the type was
int and being used as an index into the array, I used the adjusted value
and added the idx suffix to make it clear that the number does not
reflect the actual BackendType:

On Wed, Dec 1, 2021 at 10:31 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> On Wed, Dec 01, 2021 at 04:59:44PM -0500, Melanie Plageman wrote:
> > Thanks for the review!
> >
> > On Wed, Nov 24, 2021 at 8:16 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > > There's extraneous blank lines in these functions:
> > > +pgstat_recv_resetsharedcounter
> > I didn't see one here
>
> => The extra blank line is after the RESET_BUFFERS memset.

Fixed.

> > + * Reset the global, bgwriter and checkpointer statistics for the
> > + * cluster.
>
> The first comma in this comment was introduced in 1bc8e7b09, and seems to be
> extraneous, since bgwriter and checkpointer are both global. With the comma,
> it looks like it should be memsetting 3 things.

Fixed.

> > + /* Don't count dead backends. They should already be counted */
>
> Maybe this comment should say ".. they'll be added below"

Fixed.

> > + row[COLUMN_BACKEND_TYPE] = backend_type_desc;
> > + row[COLUMN_IO_PATH] = CStringGetTextDatum(GetIOPathDesc(io_path));
> > + row[COLUMN_ALLOCS] += io_ops->allocs - resets->allocs;
> > + row[COLUMN_EXTENDS] += io_ops->extends - resets->extends;
> > + row[COLUMN_FSYNCS] += io_ops->fsyncs - resets->fsyncs;
> > + row[COLUMN_WRITES] += io_ops->writes - resets->writes;
> > + row[COLUMN_RESET_TIME] = reset_time;
>
> It'd be clearer if RESET_TIME were set adjacent to BACKEND_TYPE and IO_PATH.

If you mean just in the order here (not in the column order in the
view), then I have changed it as you recommended.

> This message needs to be updated:
> errhint("Target must be \"archiver\", \"bgwriter\", or \"wal\".")))

Done.

> When I query the view, I see reset times as: 1999-12-31 18:00:00-06.
> I guess it should be initialized like this one:
> globalStats.bgwriter.stat_reset_timestamp = ts

Done.

> The cfbot shows failures now (I thought it was passing with the previous patch,
> but I suppose I'm wrong.)
>
> It looks like running recovery during single user mode hits this assertion.
> TRAP: FailedAssertion("beentry", File: "../../../../src/include/utils/backend_status.h", Line: 359, PID: 3499)
>

Yes, thank you for catching this.
I have moved up pgstat_beinit and pgstat_bestart so that single user
mode process will also have PgBackendStatus. I also have to guard
against sending these stats to the collector since there is no room for
B_INVALID backendtype in the array of IO Op values.

With this change `make check-world` passes on my machine.

On Wed, Dec 1, 2021 at 11:06 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> On Wed, Dec 01, 2021 at 04:59:44PM -0500, Melanie Plageman wrote:
> > Thanks for the review!
> >
> > On Wed, Nov 24, 2021 at 8:16 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > > You wrote beentry++ at the start of two loops, but I think that's wrong; it
> > > should be at the end, as in the rest of the file (or as a loop increment).
> > > BackendStatusArray[0] is actually used (even though its backend has
> > > backendId==1, not 0). "MyBEEntry = &BackendStatusArray[MyBackendId - 1];"
> >
> > I've fixed this in v16 which I will attach to the next email in the thread.
>
> I just noticed that since beentry++ is now at the end of the loop, it's being
> missed when you "continue":
>
> + if (beentry->st_procpid == 0)
> + continue;

Fixed.

> Also, I saw that pgindent messed up and added spaces after pointers in function
> declarations, due to new typedefs not in typedefs.list:
>
> -pgstat_send_buffers_reset(PgStat_MsgResetsharedcounter *msg)
> +pgstat_send_buffers_reset(PgStat_MsgResetsharedcounter * msg)
>
> -static inline void pg_atomic_inc_counter(pg_atomic_uint64 *counter)
> +static inline void
> +pg_atomic_inc_counter(pg_atomic_uint64 * counter)

Fixed.

-- Melanie

Attachment Content-Type Size
v17-0007-small-comment-correction.patch application/octet-stream 909 bytes
v17-0006-Remove-superfluous-bgwriter-stats.patch application/octet-stream 15.6 KB
v17-0005-Add-system-view-tracking-IO-ops-per-backend-type.patch application/octet-stream 17.6 KB
v17-0003-Send-IO-operations-to-stats-collector.patch application/octet-stream 9.2 KB
v17-0004-Add-buffers-to-pgstat_reset_shared_counters.patch application/octet-stream 9.5 KB
v17-0002-Add-IO-operation-counters-to-PgBackendStatus.patch application/octet-stream 17.2 KB
v17-0001-Read-only-atomic-backend-write-function.patch application/octet-stream 1.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dag Lem 2021-12-03 20:07:29 daitch_mokotoff module
Previous Message Tom Lane 2021-12-03 19:42:11 Re: The "char" type versus non-ASCII characters