Re: make async slave to wait for lsn to be replayed

From: Thom Brown <thom(at)linux(dot)com>
To: Ivan Kartyshov <i(dot)kartyshov(at)postgrespro(dot)ru>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: make async slave to wait for lsn to be replayed
Date: 2017-02-22 14:08:04
Message-ID: CAA-aLv7LF=Rgu5x8B-krCcMBQ4DneGPm4UY+1uzLPKkKU0TxKg@mail.gmail.com
Lists: pgsql-hackers

On 23 January 2017 at 11:56, Ivan Kartyshov <i(dot)kartyshov(at)postgrespro(dot)ru> wrote:
> Thank you for reviews and suggested improvements.
> I rewrote patch to make it more stable.
>
> Changes
> =======
> I've made a few changes:
> 1) WAITLSN no longer depends on a snapshot
> 2) Check the currently replayed LSN rather than checking in xact_redo_commit
> 3) Add syntax WAITLSN_INFINITE '0/693FF800' for infinite wait, and
> WAITLSN_NO_WAIT '0/693FF800' to check whether the LSN has been replayed, as
> you advised.
> 4) Reduce the number of loops with GUCs (WalRcvForceReply() does not exist
> in 9.5).
> 5) Optimize the loop that sets latches.
> 6) Add two GUCs that help configure the impact on StartupXLOG (see the
> sketch after this list):
> count_waitlsn (denominator, so that not every LSN is checked)
> interval_waitlsn (interval in milliseconds between additional LSN checks)
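A minimal sketch of how those two settings could be applied, assuming the
patch registers them as ordinary configuration parameters (the ALTER SYSTEM
approach and the reload step are assumptions; the values are the ones used in
the test section further down):

    -- Hypothetical configuration of the patch's two GUCs.
    ALTER SYSTEM SET interval_waitlsn = 500;   -- extra LSN check every 500 ms
    ALTER SYSTEM SET count_waitlsn = 30000;    -- check only every 30000th LSN
    SELECT pg_reload_conf();                   -- apply without a restart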
>
> Feedback
> ========
> On 09/15/2016 05:41 AM, Thomas Munro wrote:
>>
>> You hold a spinlock in one arbitrary slot, but that
>> doesn't seem sufficient: another backend may also read it, compute a
>> new value and then write it, while holding a different spin lock. Or
>> am I missing something?
>
>
> We acquire an individual spinlock on each member of the array, so you cannot
> compute a new value and write it concurrently.
>
> Tested
> ======
> We have tested it on different servers and OSes, with different cases and
> workloads. The new version is nearly as fast as vanilla on the primary and
> has only a tiny influence on standby performance.
>
> Hardware:
> 144 Intel Cores with HT
> 3TB RAM
> all data on ramdisk
> primary + hot standby on the same node.
>
> A dataset was created with the "pgbench -i -s 1000" command. For each round
> of the test we pause replay on the standby, make 1,000,000 transactions on
> the primary with pgbench, resume replay on the standby, and measure how long
> it takes for the replication gap to disappear under different standby
> workloads. The workload was "WAITLSN ('Very/FarLSN', 1000 ms timeout)"
> followed by "select abalance from pgbench_accounts where aid = random_aid;"
> (see the sketch after this paragraph).
> For vanilla, the 1000 ms timeout was enforced on the pgbench side with the
> -R option.
> The waitlsn GUC parameters were tuned for a 1000 ms timeout on the standby
> with a 35000 tps rate on the primary:
> interval_waitlsn = 500 (ms)
> count_waitlsn = 30000
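For clarity, the standby workload described above could be written as a
pgbench custom script along these lines (a sketch: the far-future LSN, the
:aid computation, and the script layout are assumptions; only the two
statements come from the description):

    \set aid random(1, 100000 * :scale)
    -- The LSN is far ahead of anything replayed, so WAITLSN always returns
    -- after the 1000 ms timeout, which paces each client at roughly one
    -- transaction per second.
    WAITLSN 'FF/FFFFFFFF', 1000;
    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;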
>
> With 200 clients, the standby catches up with the master at vanilla speed,
> without significant delay.
> With 500 clients, the standby catches up 3% slower than vanilla.
> With 1000 clients, 12% slower.
> With 5000 clients, 3 times slower, because that is far beyond our hardware's
> capability.
>
> How to use it
> ==========
> WAITLSN 'LSN' [, timeout in ms];
> WAITLSN_INFINITE 'LSN';
> WAITLSN_NO_WAIT 'LSN';
>
> #Wait until LSN 0/303EC60 is replayed, or 10 seconds have passed.
> WAITLSN '0/303EC60', 10000;
>
> #Or the same without a timeout.
> WAITLSN '0/303EC60';
> or
> WAITLSN_INFINITE '0/693FF800';
>
> #To check whether an LSN has been replayed, the following can be used.
> WAITLSN_NO_WAIT '0/693FF800';
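To put those statements in context, a typical read-your-writes pattern over
an asynchronous standby might look roughly like this (a sketch; the aid value
is arbitrary, and the function reporting the current WAL position is
pg_current_xlog_location() before v10 and pg_current_wal_lsn() from v10 on):

    -- On the primary, right after the writing transaction commits:
    SELECT pg_current_xlog_location();   -- suppose it returns '0/303EC60'

    -- On the standby, before running the dependent read:
    WAITLSN '0/303EC60', 10000;          -- wait up to 10 s for replay
    SELECT abalance FROM pgbench_accounts WHERE aid = 42;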
>
> Notice: WAITLSN will be released on PostmasterDeath or Interruption events
> if they come earlier than the target LSN or the timeout.
>
>
> Thank you for reading, will be glad to get your feedback.

Could you please rebase your patch, as it no longer applies cleanly?

Thanks

Thom
