Re:Re: [BUG] standby node can not provide service even it replays all log files

From: Thunder <thunder1(at)126(dot)com>
To: "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com>
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re:Re: [BUG] standby node can not provide service even it replays all log files
Date: 2019-10-24 09:37:52
Message-ID: 76525c4b.70bb.16dfd212b40.Coremail.thunder1@126.com
Lists: pgsql-hackers

Thanks for the reply. I am confused about the snapshot.

At 2019-10-23 11:51:19, "Kyotaro Horiguchi" <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>Hello.
>
>At Tue, 22 Oct 2019 20:42:21 +0800 (CST), Thunder <thunder1(at)126(dot)com> wrote in
>> Update the patch.
>>
>> 1. The STANDBY_SNAPSHOT_PENDING state is set when we replay the first XLOG_RUNNING_XACTS and the sub transaction ids are overflow.
>> 2. When we log XLOG_RUNNING_XACTS in master node, can we assume that all xact IDS < oldestRunningXid are considered finished?
>
>Unfortunately we can't. Standby needs to know that the *standby's*
>oldest active xid exceeds the pending xmin, not master's. And it is
>already processed in ProcArrayApplyRecoveryInfo. We cannot assume that
>the oldest xids are not same on the both side in a replication pair.

This issue occurs when the master has an uncommitted transaction with a large number of subtransactions at the time we restart the standby or create a new standby node.
Because of this, the standby node cannot provide service.
Can the standby have any active xids while it cannot provide service?
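
As a rough illustration of the behavior described above, here is a toy Python model of how the standby state changes as XLOG_RUNNING_XACTS records are replayed. It is a simplified sketch, not PostgreSQL source: the class and field names are hypothetical, xids are plain integers (wraparound is ignored), and the pending xmin is approximated as next_xid - 1.

```python
from dataclasses import dataclass
from enum import Enum, auto


class StandbyState(Enum):
    INITIALIZED = auto()
    SNAPSHOT_PENDING = auto()
    SNAPSHOT_READY = auto()


@dataclass
class RunningXacts:
    """Toy stand-in for one XLOG_RUNNING_XACTS record (hypothetical fields)."""
    oldest_running_xid: int
    next_xid: int
    subxid_overflow: bool


class Standby:
    """Simplified model of the state handling discussed in this thread."""

    def __init__(self):
        self.state = StandbyState.INITIALIZED
        self.pending_xmin = None

    def apply_running_xacts(self, rec):
        # Once the snapshot is ready, later records do not change the state.
        if self.state is StandbyState.SNAPSHOT_READY:
            return self.state

        # A non-overflowed record carries enough information by itself.
        if not rec.subxid_overflow:
            self.state = StandbyState.SNAPSHOT_READY
            self.pending_xmin = None
            return self.state

        if self.state is StandbyState.INITIALIZED:
            # First record is overflowed: remember the xmin we must get past.
            self.state = StandbyState.SNAPSHOT_PENDING
            self.pending_xmin = rec.next_xid - 1
        elif self.pending_xmin < rec.oldest_running_xid:
            # Everything that was running at the first record has finished.
            self.state = StandbyState.SNAPSHOT_READY
        return self.state


if __name__ == "__main__":
    standby = Standby()
    # xid 100 stays open on the master with overflowed subtransactions.
    for rec in (RunningXacts(100, 500, True), RunningXacts(100, 900, True)):
        print(standby.apply_running_xacts(rec).name)
    # After xid 100 commits, a non-overflowed record arrives.
    print(standby.apply_running_xacts(RunningXacts(901, 950, False)).name)
```

In this model the standby stays in SNAPSHOT_PENDING for as long as the master keeps reporting the same old transaction with overflowed subtransactions, which matches the symptom reported here.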

>
>> 3. If we can assume this, when we replay XLOG_RUNNING_XACTS and change standbyState to STANDBY_SNAPSHOT_PENDING, can we record oldestRunningXid to a shared variable, like procArray->oldest_running_xid?
>> 4. In standby node when call GetSnapshotData if procArray->oldest_running_xid is valid, can we set xmin to be procArray->oldest_running_xid?
>>
>> Appreciate any suggestion to this issue.
>
>At 2019-10-22 01:27:58, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:
>>On Mon, Oct 21, 2019 at 4:13 AM Thunder <thunder1(at)126(dot)com> wrote:
>..
>> >I think that the issue you've encountered is design behavior. In
>> >other words, it's intended to work that way.
>> >
>> >The comments for the code you propose to change say that we can allow
>> >connections once we've got a valid snapshot. So presumably the effect
>> >of your change would be to allow connections even though we don't have
>> >a valid snapshot.
>> >
>> >That seems bad.
>
>regards.
>
>--
>Kyotaro Horiguchi
>NTT Open Source Software Center
>
