Re: subscription worker signalling wal writer too much

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: subscription worker signalling wal writer too much
Date: 2017-08-26 21:45:20
Message-ID: CAMkU=1zP931=v-ypFZMDxHAGGQwuVNUPONZE5_ggV+onkXEdvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 3, 2017 at 8:26 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Sat, Jun 24, 2017 at 5:09 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
>> On 2017-06-15 15:06:43 -0700, Jeff Janes wrote:
>>
>>
>>
>
>> > Wouldn't it
>> > better, and much simpler, just to have reverted the first change once
>> the
>> > second one was done?
>>
>> I think both can actually happen independently, no? It's pretty common
>> for the page lsn to be *older* than the corresponding commit. In fact
>> you'll always hit that case unless there's also concurrent writes also
>> touching said page.
>>
>
> That is true, but I don't see any residual regression when going from
> pre-db76b1efbbab2441428 to 7975c5e0a992ae9a4-with-my-patch applied. So I
> think the hint bit issue only shows up when hitting the same page
> aggressively, in which case the LSN does get advanced quickly enough that
> db76b1efbba solves it. Perhaps a different test case could show that
> issue. I also didn't see any improvement from the original 4de82f7d7c50a81e,
> so maybe 8 CPUs just isn't enough to detect the problem with setting
> hint-buts with permanent tables with synchronous_commit=false.
>

I've refined my benchmarking, using a smaller scaling factor (8, same as my
CPU count) and running the tests for longer. The smaller size means there
are more intensive updates on individual pgbench_accounts pages, and the
longer run times allows be unset hint bits to build up for longer (usually
I like short test runs with a large number of randomized repetitions, but
that doesn't work well here). Since the scale factor is the same as the
number of CPUs, I've bumped the thread count so that it is more likely
someone will choose a non-contended value of pgbench_branches.bid to update
at any given moment.

pgbench -c16 -j16 -T1200 -M prepared -b tpcb-func --startup='set
synchronous_commit=false'

This testing did show the importance of waking up the wal writer so that
hint bits can be set:

commit TPS
cfafd8beadcd6f 22200.48
cfafd8beadcd6f_nowake 30766.83
median of 14 runs reported, p-val on difference of means is 2e-22.

Where nowake is a hackish disabling of the wake up introduced in
4de82f7d7c50a8,
forward ported to 9.6 branch. (I still wake it if is is doing the big
sleep, I just took out the part that wake it up even from small sleeps)

So my revised benchmark is capable of detecting this effect, even with only
8 CPUs.

When I move to next commit where we set hint bits as long as the page was
re-dirtied after, get these numbers:

db76b1efbbab24 30742.69
db76b1efbbab24_nowake 31072.97

The difference between these is not statistically different, nor is the
difference between them and cfafd8beadcd6f.

So I think the logic you introduced in db76b1efbbab24 captures all the gain
there is to be captured, and 4de82f7d7c50a8 can be reverted. The only
reason to wake up the wal writer is if it is taking the big sleep.

Of course, I can't prove a negative. There could always be some test case
which demonstrates that 4de82f7d7c50a8 still matters in spite of
db76b1efbbab24.
So to be ultra-conservative, attached is a patch which should preserve all
wake-ups other than the ones known to be useless.

7975c5e0a 27810.66
7975c5e0a_patched 30821.41

So 7975c5 causes a significant regression (10% regression, p-val 7e-16 on
difference of the means). It repeatedly wakes the wal-writer up to flush a
page which the wal-writer simply refuses to flush. This can't serve any
purpose. My patch still wakes it up to write the page if it has not been
written yet, and also still wakes it to flush a range of pages of
WalWriterFlushAfter is met. But it just doesn't wake it up to flush a page
which has already been written and will not get flushed.

I'd prefer the code-simplifying change of just not waking it up anymore
except for when it is doing the big sleep, but I can't reliably detect a
performance difference between that and the more conservative change posted
here.

Cheers,

Jeff

Attachment Content-Type Size
change_walwriter_wakeup_v2.patch application/octet-stream 1013 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2017-08-26 22:53:49 pgbench: faster version of tpcb-like transaction
Previous Message Rosser Schwarz 2017-08-26 19:40:24 Re: Patch: add --if-exists to pg_recvlogical