Re: [HACKERS] WAL logging problem in 9.4.3?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: noah(at)leadboat(dot)com
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, 9erthalion6(at)gmail(dot)com, andrew(dot)dunstan(at)2ndquadrant(dot)com, hlinnaka(at)iki(dot)fi, michael(at)paquier(dot)xyz
Subject: Re: [HACKERS] WAL logging problem in 9.4.3?
Date: 2019-11-20 06:05:46
Message-ID: 20191120.150546.1050157217400213784.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I'm in the benchmarking week..

Thanks for reviewing!.

At Sun, 17 Nov 2019 20:54:34 -0800, Noah Misch <noah(at)leadboat(dot)com> wrote in
> On Tue, Nov 05, 2019 at 02:53:35PM -0800, Noah Misch wrote:
> > I started pre-commit editing on 2019-10-28, and comment+README updates have
> > been the largest part of that. I'll check my edits against the things you
> > list here, and I'll share on-list before committing. I've now marked the CF
> > entry Ready for Committer.

I'll look into that soon.

By the way, before finalize this, I'd like to share the result of a
brief benchmarking.

First, I measured the direct effect of WAL skipping.
I measured the time required to do the following sequence for the
COMMIT-FPW-WAL case and COMMIT-fsync case. WAL and heap files are on
non-server spec HDD.

BEGIN;
TRUNCATE t;
INSERT INTO t (SELECT a FROM generate_series(1, n) a);
COMMIT;

REPLICA means the time with wal_level = replica
SYNC means the time with wal_level = minimal and force file sync.
WAL means the time with wal_level = minimal and force commit-WAL.
pages is the number of pages of the table.
(REPLICA comes from run.sh 1, SYNC/WAL comes from run.sh 2)

pages REPLICA SYNC WAL
1: 144 ms 683 ms 217 ms
3: 303 ms 995 ms 385 ms
5: 271 ms 1007 ms 217 ms
10: 157 ms 1043 ms 224 ms
17: 189 ms 1007 ms 193 ms
31: 202 ms 1091 ms 230 ms
56: 265 ms 1175 ms 226 ms
100: 510 ms 1307 ms 270 ms
177: 790 ms 1523 ms 524 ms
316: 1827 ms 1643 ms 719 ms
562: 1904 ms 2109 ms 1148 ms
1000: 3060 ms 2979 ms 2113 ms
1778: 6077 ms 3945 ms 3618 ms
3162: 13038 ms 7078 ms 6734 ms

There was a crossing point around 3000 pages. (bench1() finds that by
bisecting, run.sh 3).

With multiple sessions, the crossing point but does not go so
small.

10 processes (run.pl 4 10) The numbers in parentheses are WAL[n]/WAL[n-1].
pages SYNC WAL
316: 8436 ms 4694 ms
562: 12067 ms 9627 ms (x2.1) # WAL wins
1000: 19154 ms 43262 ms (x4.5) # SYNC wins. WAL's slope becomes steep.
1778: 32495 ms 63863 ms (x1.4)

100 processes (run.pl 4 100)
pages SYNC WAL
10: 13275 ms 1868 ms
17: 15919 ms 4438 ms (x2.3)
31: 17063 ms 6431 ms (x1.5)
56: 23193 ms 14276 ms (x2.2) # WAL wins
100: 35220 ms 67843 ms (x4.8) # SYNC wins. WAL's slope becomes steep.

With 10 pgbench sessions.
pages SYNC WAL
1: 915 ms 301 ms
3: 1634 ms 508 ms
5: 1634 ms 293ms
10: 1671 ms 1043 ms
17: 1600 ms 333 ms
31: 1864 ms 314 ms
56: 1562 ms 448 ms
100: 1538 ms 394 ms
177: 1697 ms 1047 ms
316: 3074 ms 1788 ms
562: 3306 ms 1245 ms
1000: 3440 ms 2182 ms
1778: 5064 ms 6464 ms # WAL's slope becomes steep
3162: 8675 ms 8165 ms

I don't think the result of 100 processes is meaningful, so excluding
the result a candidate for wal_skip_threshold can be 1000.

Thoughts? The attached is the benchmark script.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message btkimurayuzk 2019-11-20 07:16:45 Re: pg_waldump and PREPARE
Previous Message Paul A Jungwirth 2019-11-20 05:49:06 Re: range_agg