Re: MultiXact\SLRU buffers configuration

From: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MultiXact\SLRU buffers configuration
Date: 2020-05-15 09:01:46
Message-ID: E29C4184-23B3-4AEF-8D82-34C0A7F7224B@yandex-team.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> 15 мая 2020 г., в 05:03, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> написал(а):
>
> At Thu, 14 May 2020 11:44:01 +0500, "Andrey M. Borodin" <x4mmm(at)yandex-team(dot)ru> wrote in
>>> GetMultiXactIdMembers believes that 4 is successfully done if 2
>>> returned valid offset, but actually that is not obvious.
>>>
>>> If we add a single giant lock just to isolate ,say,
>>> GetMultiXactIdMember and RecordNewMultiXact, it reduces concurrency
>>> unnecessarily. Perhaps we need finer-grained locking-key for standby
>>> that works similary to buffer lock on primary, that doesn't cause
>>> confilicts between irrelevant mxids.
>>>
>> We can just replay members before offsets. If offset is already there - members are there too.
>> But I'd be happy if we could mitigate those 1000us too - with a hint about last maixd state in a shared MX state, for example.
>
> Generally in such cases, condition variables would work. In the
> attached PoC, the reader side gets no penalty in the "likely" code
> path. The writer side always calls ConditionVariableBroadcast but the
> waiter list is empty in almost all cases. But I couldn't cause the
> situation where the sleep 1000u is reached.
Thanks! That really looks like a good solution without magic timeouts. Beautiful!
I think I can create temporary extension which calls MultiXact API and tests edge-cases like this 1000us wait.
This extension will also be also useful for me to assess impact of bigger buffers, reduced read locking (as in my 2nd patch) and other tweaks.

>> Actually, if we read empty mxid array instead of something that is replayed just yet - it's not a problem of inconsistency, because transaction in this mxid could not commit before we started. ISTM.
>> So instead of fix, we, probably, can just add a comment. If this reasoning is correct.
>
> The step 4 of the reader side reads the members of the target mxid. It
> is already written if the offset of the *next* mxid is filled-in.
Most often - yes, but members are not guaranteed to be filled in order. Those who win MXMemberControlLock will write first.
But nobody can read members of MXID before it is returned. And its members will be written before returning MXID.

Best regards, Andrey Borodin.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-05-15 09:08:17 pg_stat_wal_receiver and flushedUpto/writtenUpto
Previous Message Atsushi Torikoshi 2020-05-15 08:47:41 Re: Is it useful to record whether plans are generic or custom?