Re: Speed up Clog Access by increasing CLOG buffers

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2016-02-12 05:55:45
Message-ID: CAA4eK1JwMRcpDMYE72dAByVTM0Z95zaU-1-UiqW5X+-wwopd7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 11, 2016 at 8:02 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On the substantive part of the patch, this doesn't look safe:
>
> + /*
> + * Add ourselves to the list of processes needing a group XID status
> + * update.
> + */
> + proc->clogGroupMember = true;
> + proc->clogGroupMemberXid = xid;
> + proc->clogGroupMemberXidStatus = status;
> + proc->clogGroupMemberPage = pageno;
> + proc->clogGroupMemberLsn = lsn;
> + while (true)
> + {
> + nextidx = pg_atomic_read_u32(&procglobal->clogGroupFirst);
> +
> + /*
> + * Add the proc to list if the clog page where we need to update
the
> + * current transaction status is same as group leader's clog
page.
> + */
> + if (nextidx != INVALID_PGPROCNO &&
> + ProcGlobal->allProcs[nextidx].clogGroupMemberPage !=
> proc->clogGroupMemberPage)
> + return false;
>
> DANGER HERE!
>
> + pg_atomic_write_u32(&proc->clogGroupNext, nextidx);
> +
> + if (pg_atomic_compare_exchange_u32(&procglobal->clogGroupFirst,
> + &nextidx,
> + (uint32) proc->pgprocno))
> + break;
> + }
>
> There is a potential ABA problem here. Suppose that this code
> executes in one process as far as the line that says DANGER HERE.
> Then, the group leader wakes up, performs all of the CLOG
> modifications, performs another write transaction, and again becomes
> the group leader, but for a different member page. Then, the original
> process that went to sleep at DANGER HERE wakes up. At this point,
> the pg_atomic_compare_exchange_u32 will succeed and we'll have
> processes with different pages in the list, contrary to the intention
> of the code.
>

Very Good Catch. I think if we want to address this we can detect
the non-group leader transactions that tries to update the different
CLOG page (different from group-leader) after acquiring
CLogControlLock and then mark these transactions such that
after waking they need to perform CLOG update via normal path.
Now this can decrease the latency of such transactions, but I
think there will be only very few transactions if at-all there which
can face this condition, because most of the concurrent transactions
should be on same page, otherwise the idea of multiple-slots we
have tried upthread would have shown benefits.
Another idea could be that we update the comments indicating the
possibility of multiple Clog-page updates in same group on the basis
that such cases will be less and even if it happens, it won't effect the
transaction status update.

Do you have anything else in mind?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2016-02-12 06:25:15 Re: custom function for converting human readable sizes to bytes
Previous Message Tom Lane 2016-02-12 04:34:14 Re: Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.