Re: Speed up Clog Access by increasing CLOG buffers

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-02-11 14:32:52
Message-ID: CA+Tgmoa-OdD3CAAd5v8RwxYay-PvGVBwu8pcn4SBrEH_qsyYdw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 11, 2016 at 9:04 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Is there anything, I can do to move this forward?
>
> Well, looking at this again, I think I'm OK to go with your names.
> That doesn't seem like the thing to hold up the patch for. So I'll go
> ahead and push the renaming patch now.

On the substantive part of the patch, this doesn't look safe:

+ /*
+ * Add ourselves to the list of processes needing a group XID status
+ * update.
+ */
+ proc->clogGroupMember = true;
+ proc->clogGroupMemberXid = xid;
+ proc->clogGroupMemberXidStatus = status;
+ proc->clogGroupMemberPage = pageno;
+ proc->clogGroupMemberLsn = lsn;
+ while (true)
+ {
+ nextidx = pg_atomic_read_u32(&procglobal->clogGroupFirst);
+
+ /*
+ * Add the proc to list if the clog page where we need to update the
+ * current transaction status is same as group leader's clog page.
+ */
+ if (nextidx != INVALID_PGPROCNO &&
+ ProcGlobal->allProcs[nextidx].clogGroupMemberPage !=
proc->clogGroupMemberPage)
+ return false;

DANGER HERE!

+ pg_atomic_write_u32(&proc->clogGroupNext, nextidx);
+
+ if (pg_atomic_compare_exchange_u32(&procglobal->clogGroupFirst,
+ &nextidx,
+ (uint32) proc->pgprocno))
+ break;
+ }

There is a potential ABA problem here. Suppose that this code
executes in one process as far as the line that says DANGER HERE.
Then, the group leader wakes up, performs all of the CLOG
modifications, performs another write transaction, and again becomes
the group leader, but for a different member page. Then, the original
process that went to sleep at DANGER HERE wakes up. At this point,
the pg_atomic_compare_exchange_u32 will succeed and we'll have
processes with different pages in the list, contrary to the intention
of the code.

This kind of thing can be really subtle and difficult to fix. The
problem might not happen even with a very large amount of testing, and
then might happen in the real world on some other hardware or on
really rare occasions. In general, compare-and-swap loops need to be
really really simple with minimal dependencies on other data, ideally
none. It's very hard to make anything else work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-02-11 14:36:00 Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.
Previous Message Tom Lane 2016-02-11 14:29:56 Re: pgsql: Add some isolation tests for deadlock detection and resolution.