Re: Tracking last scan time

From: Andres Freund <andres(at)anarazel(dot)de>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Dave Page <dpage(at)pgadmin(dot)org>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Tracking last scan time
Date: 2022-09-01 18:35:40
Message-ID: 20220901183540.cyfje2lsiouj4qi4@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-09-01 14:18:42 +0200, Matthias van de Meent wrote:
> On Wed, 31 Aug 2022 at 20:56, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > But given this is done when stats are flushed, which only happens after the
> > transaction ended, we can just use GetCurrentTransactionStopTimestamp() - if
> > we got to flushing the transaction stats we'll already have computed that.
>
> I'm not entirely happy with that, as that would still add function
> call overhead, and potentially still call GetCurrentTimestamp() in
> this somewhat hot loop.

We already used GetCurrentTransactionStopTimestamp() (as you reference below)
before we get to this point, so I doubt that we'll ever call
GetCurrentTimestamp(). And it's hard to imagine that the function call
overhead of GetCurrentTransactionStopTimestamp() matters compared to acquiring
locks etc.

> As an alternative, we could wire the `now` variable in
> pgstat_report_stat (generated from
> GetCurrentTransactionStopTimestamp() into pgstat_flush_pending_entries
> and then into flush_pending_cb (or, store this in a static variable)
> so that we can reuse that value, saving any potential function call
> overhead.

Passing it in doesn't clearly seem an improvement, but I also don't have a
strong opinion on it. I am strongly against the static variable approach.

> > > tabentry->numscans += lstats->t_counts.t_numscans;
> > > + if (pgstat_track_scans && lstats->t_counts.t_numscans)
> > > + tabentry->lastscan = GetCurrentTimestamp();
> >
> > Besides replacing GetCurrentTimestamp() with
> > GetCurrentTransactionStopTimestamp(), this should then also check if
> > tabentry->lastscan is already newer than the new timestamp.
>
> I wonder how important that is. This value only gets set in a stats
> flush, which may skew the stat update by several seconds (up to
> PGSTAT_MAX_INTERVAL). I don't expect concurrent flushes to take so
> long that it will set the values to It is possible, but I think it is
> extremely unlikely that this is going to be important when you
> consider that these stat flushes are not expected to run for more than
> 1 second.

I think it'll be confusing if you have values going back and forth, even if
just by a little. And it's cheap to defend against, so why not just do that?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2022-09-01 18:39:11 Re: Handle infinite recursion in logical replication setup
Previous Message Pavel Stehule 2022-09-01 18:17:31 Re: Schema variables - new implementation for Postgres 15