|From:||Jeff Janes <jeff(dot)janes(at)gmail(dot)com>|
|To:||Andres Freund <andres(at)anarazel(dot)de>|
|Subject:||Re: subscription worker signalling wal writer too much|
|Views:||Raw Message | Whole Thread | Download mbox|
On Mon, Jul 3, 2017 at 8:26 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Sat, Jun 24, 2017 at 5:09 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> On 2017-06-15 15:06:43 -0700, Jeff Janes wrote:
>> > Wouldn't it
>> > better, and much simpler, just to have reverted the first change once
>> > second one was done?
>> I think both can actually happen independently, no? It's pretty common
>> for the page lsn to be *older* than the corresponding commit. In fact
>> you'll always hit that case unless there's also concurrent writes also
>> touching said page.
> That is true, but I don't see any residual regression when going from
> pre-db76b1efbbab2441428 to 7975c5e0a992ae9a4-with-my-patch applied. So I
> think the hint bit issue only shows up when hitting the same page
> aggressively, in which case the LSN does get advanced quickly enough that
> db76b1efbba solves it. Perhaps a different test case could show that
> issue. I also didn't see any improvement from the original 4de82f7d7c50a81e,
> so maybe 8 CPUs just isn't enough to detect the problem with setting
> hint-buts with permanent tables with synchronous_commit=false.
I've refined my benchmarking, using a smaller scaling factor (8, same as my
CPU count) and running the tests for longer. The smaller size means there
are more intensive updates on individual pgbench_accounts pages, and the
longer run times allows be unset hint bits to build up for longer (usually
I like short test runs with a large number of randomized repetitions, but
that doesn't work well here). Since the scale factor is the same as the
number of CPUs, I've bumped the thread count so that it is more likely
someone will choose a non-contended value of pgbench_branches.bid to update
at any given moment.
pgbench -c16 -j16 -T1200 -M prepared -b tpcb-func --startup='set
This testing did show the importance of waking up the wal writer so that
hint bits can be set:
median of 14 runs reported, p-val on difference of means is 2e-22.
Where nowake is a hackish disabling of the wake up introduced in
forward ported to 9.6 branch. (I still wake it if is is doing the big
sleep, I just took out the part that wake it up even from small sleeps)
So my revised benchmark is capable of detecting this effect, even with only
When I move to next commit where we set hint bits as long as the page was
re-dirtied after, get these numbers:
The difference between these is not statistically different, nor is the
difference between them and cfafd8beadcd6f.
So I think the logic you introduced in db76b1efbbab24 captures all the gain
there is to be captured, and 4de82f7d7c50a8 can be reverted. The only
reason to wake up the wal writer is if it is taking the big sleep.
Of course, I can't prove a negative. There could always be some test case
which demonstrates that 4de82f7d7c50a8 still matters in spite of
So to be ultra-conservative, attached is a patch which should preserve all
wake-ups other than the ones known to be useless.
So 7975c5 causes a significant regression (10% regression, p-val 7e-16 on
difference of the means). It repeatedly wakes the wal-writer up to flush a
page which the wal-writer simply refuses to flush. This can't serve any
purpose. My patch still wakes it up to write the page if it has not been
written yet, and also still wakes it to flush a range of pages of
WalWriterFlushAfter is met. But it just doesn't wake it up to flush a page
which has already been written and will not get flushed.
I'd prefer the code-simplifying change of just not waking it up anymore
except for when it is doing the big sleep, but I can't reliably detect a
performance difference between that and the more conservative change posted
|Next Message||Jeff Janes||2017-08-26 22:53:49||pgbench: faster version of tpcb-like transaction|
|Previous Message||Rosser Schwarz||2017-08-26 19:40:24||Re: Patch: add --if-exists to pg_recvlogical|