Re: Resetting spilled txn statistics in pg_stat_replication

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Resetting spilled txn statistics in pg_stat_replication
Date: 2020-06-23 10:18:31
Message-ID: 20200623101831.it6lzwbm37xwquco@development
Lists: pgsql-hackers

On Tue, Jun 23, 2020 at 10:58:18AM +0530, Amit Kapila wrote:
>On Tue, Jun 23, 2020 at 9:32 AM Masahiko Sawada
><masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>>
>> On Sun, 21 Jun 2020 at 06:57, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> >
>> > >
>> > >What if the decoding has been performed by multiple backends using the
>> > >same slot? In that case, it will be difficult to make the judgment
>> > >for the value of logical_decoding_work_mem based on stats. It would
>> > >make sense if we provide a way to set logical_decoding_work_mem for a
>> > >slot but not sure if that is better than what we have now.
>> > >
>>
>> I thought that the stats are relevant to what
>> logical_decoding_work_mem value was but not with who performed logical
>> decoding. So even if multiple backends perform logical decoding using
>> the same slot, the user can directly use stats as long as
>> logical_decoding_work_mem value doesn’t change.
>>
>
>I think if you maintain these stats at the slot level, you probably
>need to use spinlock or atomic ops in order to update those as slots
>can be used from multiple backends whereas currently, we don't need
>that.

IMHO storing the stats in the slot itself is a bad idea. We have the
statistics collector for exactly this purpose, and it receives data
over UDP without any extra locking.
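
To illustrate why no locking is needed in that model, here is a minimal
sketch, not PostgreSQL's actual code or message format. The message layout,
the slot name and the Collector class are all assumptions made up for this
example. Backends only send increments; a single collector is the only
writer of the totals, so two backends using the same slot never touch
shared counters directly.

```python
import struct
from collections import defaultdict

# Hypothetical message layout: 64-byte NUL-padded slot name followed by
# three unsigned 64-bit increments (txns, spill count, spill bytes).
MSG = struct.Struct("!64sQQQ")


def encode(slot: str, txns: int, count: int, nbytes: int) -> bytes:
    """Build the datagram a decoding backend would send (fire and forget)."""
    return MSG.pack(slot.encode(), txns, count, nbytes)


class Collector:
    """Single consumer of all datagrams, and the only writer of the totals."""

    def __init__(self) -> None:
        self.totals = defaultdict(lambda: [0, 0, 0])

    def handle(self, datagram: bytes) -> None:
        raw_name, txns, count, nbytes = MSG.unpack(datagram)
        slot = raw_name.rstrip(b"\0").decode()
        totals = self.totals[slot]
        totals[0] += txns
        totals[1] += count
        totals[2] += nbytes


collector = Collector()
# Two different backends decoding from the same slot just send increments.
collector.handle(encode("my_slot", 1, 3, 1_000_000))
collector.handle(encode("my_slot", 2, 4, 2_500_000))
print(dict(collector.totals))
```

The datagrams are passed around as plain bytes here; a real transport is
left out on purpose to keep the sketch short.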
>
>> > >What problems do we see in displaying these for each process? I think
>> > >users might want to see the stats for the exited processes or after
>> > >server restart but I think both of those are not even possible today.
>> > >I think the stats are available till the corresponding WALSender
>> > >process is active.
>>
>> I might want to see the stats for the exited processes or after server
>> restart. But I'm inclined to agree with displaying the stats per
>> process if the stats are displayed on a separate view (e.g.
>> pg_stat_replication_slots).
>>
>
>Yeah, as told previously, this makes more sense to me.
>
>Do you think we should try to write a POC patch using a per-process
>entry approach and see what difficulties we are facing and does it
>give the stats in a way we are imagining but OTOH, we can wait for
>some more to see if there is clear winner approach here?
>

I may be missing something obvious, but I still see no point in tracking
per-process stats. We don't have that for other stats, and I'm not sure
how common the scenario is in which a given slot is decoded by many
backends. I'd say the vast majority of cases are simply running decoding
from a walsender, which may occasionally restart, but I doubt the users
are interested in per-pid data - they probably want aggregated data.

Can someone explain a plausible scenario for which tracking per-process
stats would be needed, and simply computing deltas would not work? How
will you know which old PID is which, what will you do when a PID is
reused, and so on?
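
By "computing deltas" I mean something like the following minimal sketch.
It assumes the counters are cumulative per slot and are sampled
periodically by the user or a monitoring tool. The SlotSnapshot type and
the field names (spill_txns, spill_count, spill_bytes) are assumptions for
the example, not an existing interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotSnapshot:
    """Cumulative spill counters for one slot at one point in time."""
    spill_txns: int
    spill_count: int
    spill_bytes: int


def delta(prev: SlotSnapshot, curr: SlotSnapshot) -> SlotSnapshot:
    """Return the activity between two snapshots of cumulative counters.

    If any counter went backwards, assume the counters were reset in
    between and treat the current values as the delta since that reset.
    """
    if (curr.spill_txns < prev.spill_txns
            or curr.spill_count < prev.spill_count
            or curr.spill_bytes < prev.spill_bytes):
        return curr
    return SlotSnapshot(
        curr.spill_txns - prev.spill_txns,
        curr.spill_count - prev.spill_count,
        curr.spill_bytes - prev.spill_bytes,
    )


# Two samples of the same slot, taken some time apart.
before = SlotSnapshot(spill_txns=10, spill_count=25, spill_bytes=4_000_000)
after = SlotSnapshot(spill_txns=14, spill_count=31, spill_bytes=6_500_000)
print(delta(before, after))
```

No PID appears anywhere in this, so walsender restarts and PID reuse do
not matter to whoever consumes the numbers.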

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
