Re: Reducing ClogControlLock contention

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reducing ClogControlLock contention
Date: 2015-08-31 11:34:59
Message-ID: CAA4eK1L8a6CA_Pjb-Ke1+JzE9yvx_aaowMKt5j6Kn9YumwkpYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 26, 2015 at 4:18 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 26 August 2015 at 11:40, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> On Tue, Aug 25, 2015 at 2:21 PM, Simon Riggs <simon(at)2ndquadrant(dot)com>
wrote:
>>>
>>> On 22 August 2015 at 15:14, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>>
>>>>
>>>> TransactionIdSetPageStatus() calls TransactionIdSetStatusBit(), which
>>>> writes an 8 byte variable (the lsn). That's not safe.
>>>
>>>
>>> Agreed, thanks for spotting that.
>>>
>>> I see this as the biggest problem to overcome with this patch.
>>
>>
>> How about using 64bit atomics or spinlock to protect this variable?
>
>
> Spinlock is out IMHO because this happens on every clog lookup. So it
must be an atomic read.
>

Agreed, however using atomics is still an option, yet another way could
be before updating group_lsn, check if we already have CLogControlLock
in Exclusive mode then update it, else release the lock, re-acquire in
Exclusive mode and update the variable. The first thing that comes to mind
with this idea is that it will be less performant, yes thats true, but it
will be
only done for asynchronous commits (mode which is generally not recommended
for production-use) and that too not on every commit, so may be the impact
is
not that high. I have updated the patch (attached with mail) to show
you what
I have in mind.

Another point about the latest patch:

+ (write_ok ||
+ shared->page_status[slotno] != SLRU_PAGE_WRITE_IN_PROGRESS))

Do you think that with new locking protocol as proposed in this
patch, it is safe to return if page status is SLRU_PAGE_WRITE_IN_PROGRESS
even if write_ok is true?

I think the case where it can cause problem is during
SlruInternalWritePage()
where it performs below actions:
1. first marks the status of page as SLRU_PAGE_WRITE_IN_PROGRESS.
2. then take buffer lock in Exclusive mode.
3. release control lock.
4. perform the write
5. re-acquire the control lock in Exclusive mode.

Now consider another client which has to update the transaction status:
1. Control lock in Shared mode.
2. Get the slot
3. Acquire the buffer lock in Exclusive mode

Now consider client which has to update the transaction status performs
its step-1 after step-3 of writer, if that happens, then that could lead to
deadlock because writer will wait for client to release control lock and
client will wait for writer to release buffer lock.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
clog_optimize.v5.patch application/octet-stream 6.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2015-08-31 12:01:33 Re: Dependency between bgw_notify_pid and bgw_flags
Previous Message YUriy Zhuravlev 2015-08-31 11:09:00 Re: Scaling PostgreSQL at multicore Power8