Re: Slow standby snapshot

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Michail Nikolaev <michail(dot)nikolaev(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, reshkekirill <reshkekirill(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Slow standby snapshot
Date: 2022-11-16 00:44:48
Message-ID: 20221116004448.om7vtwcklpmnclpw@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-11-15 19:15:15 -0500, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2022-11-15 23:14:42 +0000, Simon Riggs wrote:
> >> Hence more frequent compression is effective at reducing the overhead.
> >> But too frequent compression slows down the startup process, which
> >> can't then keep up.
> >> So we're just looking for an optimal frequency of compression for any
> >> given workload.
>
> > What about making the behaviour adaptive based on the amount of wasted effort
> > during those two operations, rather than just a hardcoded "emptiness" factor?
>
> Not quite sure how we could do that, given that those things aren't even
> happening in the same process.

I'm not certain what the best approach is, but I don't think the
not-the-same-process part is a blocker.

Approach 1:

We could have an atomic variable in ProcArrayStruct that counts the amount of
wasted effort and have processes update it whenever they've wasted a
meaningful amount of effort. Something like counting the skipped elements in
KnownAssignedXidsGetAndSetXmin in a function local static variable and
updating the shared counter whenever that reaches
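
A rough sketch of what approach 1 might look like, written as standalone C11
rather than against the PostgreSQL tree. Everything here is an assumption made
for illustration: the struct, both function names and both threshold values
are hypothetical (no threshold is given above), and C11 <stdatomic.h> stands in
for PostgreSQL's own pg_atomic_* primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values, chosen only for illustration. */
#define LOCAL_FLUSH_THRESHOLD     1024u   /* when a backend publishes its count */
#define SHARED_COMPRESS_THRESHOLD 65536u  /* when compression looks worthwhile */

/* Stand-in for a counter that would live in ProcArrayStruct. */
typedef struct WastedEffortCounter
{
    atomic_uint_fast64_t skipped_total;
} WastedEffortCounter;

/*
 * Called by a snapshot-taking process after a scan that skipped
 * 'nskipped' invalid entries. The count is accumulated in a
 * function-local static (one per process) and only pushed to shared
 * memory once it is large enough, to keep traffic on the shared
 * counter low.
 */
static void
report_skipped(WastedEffortCounter *shared, uint32_t nskipped)
{
    static uint32_t local_skipped = 0;

    assert(shared != NULL);

    local_skipped += nskipped;
    if (local_skipped >= LOCAL_FLUSH_THRESHOLD)
    {
        atomic_fetch_add_explicit(&shared->skipped_total, local_skipped,
                                  memory_order_relaxed);
        local_skipped = 0;
    }
}

/*
 * Called by the process that owns compression. Returns true and resets
 * the shared counter once enough wasted effort has been reported.
 * Increments racing with the reset can be lost; that is tolerable
 * because the counter is only a heuristic.
 */
static bool
should_compress(WastedEffortCounter *shared)
{
    assert(shared != NULL);

    if (atomic_load_explicit(&shared->skipped_total, memory_order_relaxed)
        < SHARED_COMPRESS_THRESHOLD)
        return false;

    atomic_exchange_explicit(&shared->skipped_total, 0, memory_order_relaxed);
    return true;
}
```

Relaxed ordering is used because the counter does not protect any other data;
it only steers how often compression runs.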

Approach 2:

Perform conditional cleanup in non-startup processes - I think that'd actually
be ok, as long as ProcArrayLock is held exclusively. We could count the number
of skipped elements in KnownAssignedXidsGetAndSetXmin() in a local variable,
and whenever that gets too high, conditionally acquire ProcArrayLock
exclusively at the end of GetSnapshotData() and compress KAX (the
KnownAssignedXids array). Reset the local variable regardless of whether the
lock was acquired, to avoid causing a lot of contention.

The nice part is that this would work even without the startup process making
progress. The not nice part is that it'd require a bit of code study to figure
out whether it's safe to modify KAX from outside the startup process.
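
A toy model of approach 2, again as standalone C11 and not PostgreSQL code.
The KaxArray type, the kax_* helpers, the capacity and the threshold are all
hypothetical, and an atomic_flag try-lock stands in for a conditional exclusive
acquire of ProcArrayLock (in the tree that would presumably be
LWLockConditionalAcquire() with LW_EXCLUSIVE). The shared lock that the scan
itself would normally hold is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define KAX_CAPACITY            256   /* hypothetical array size */
#define SKIP_COMPRESS_THRESHOLD 64u   /* hypothetical "too high" limit */

typedef struct KaxArray
{
    atomic_flag lock;                 /* stand-in for exclusive ProcArrayLock */
    unsigned    xids[KAX_CAPACITY];
    bool        valid[KAX_CAPACITY];
    size_t      head;                 /* one past the last used slot */
} KaxArray;

static void
kax_init(KaxArray *kax)
{
    atomic_flag_clear(&kax->lock);
    memset(kax->xids, 0, sizeof(kax->xids));
    memset(kax->valid, 0, sizeof(kax->valid));
    kax->head = 0;
}

/* Copy valid xids to 'out'; add the number of skipped slots to *nskipped. */
static size_t
kax_scan(const KaxArray *kax, unsigned *out, size_t *nskipped)
{
    size_t n = 0;

    for (size_t i = 0; i < kax->head; i++)
    {
        if (kax->valid[i])
            out[n++] = kax->xids[i];
        else
            (*nskipped)++;
    }
    return n;
}

/* Move valid entries to the front. Caller must hold the lock. */
static void
kax_compress(KaxArray *kax)
{
    size_t dst = 0;

    for (size_t i = 0; i < kax->head; i++)
    {
        if (kax->valid[i])
        {
            kax->xids[dst] = kax->xids[i];
            kax->valid[dst] = true;
            dst++;
        }
    }
    for (size_t i = dst; i < kax->head; i++)
        kax->valid[i] = false;
    kax->head = dst;
}

/* Take a "snapshot"; returns true if this call also compressed the array. */
static bool
snapshot_then_maybe_compress(KaxArray *kax, unsigned *out, size_t *nxids)
{
    static size_t skipped_since_reset = 0;  /* per-process counter */
    bool compressed = false;

    assert(kax != NULL && out != NULL && nxids != NULL);

    *nxids = kax_scan(kax, out, &skipped_since_reset);

    if (skipped_since_reset > SKIP_COMPRESS_THRESHOLD)
    {
        /* Conditional acquire: never wait for the lock. */
        if (!atomic_flag_test_and_set_explicit(&kax->lock,
                                               memory_order_acquire))
        {
            kax_compress(kax);
            atomic_flag_clear_explicit(&kax->lock, memory_order_release);
            compressed = true;
        }
        /* Reset whether or not we got the lock, so we don't retry every call. */
        skipped_since_reset = 0;
    }
    return compressed;
}
```

The unconditional reset is what limits contention: a backend that fails to get
the lock has to accumulate another threshold's worth of skipped entries before
it tries again.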

> But yeah, it does feel like the proposed
> approach is only going to be optimal over a small range of conditions.

In particular, it doesn't adapt at all to workloads that don't replay all that
much, but do compute a lot of snapshots.

Greetings,

Andres Freund
