Re: WIP: WAL prefetch (another approach)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: WAL prefetch (another approach)
Date: 2021-04-29 00:12:24
Message-ID: 20210429001224.cdkef4ybkw7gramv@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-04-28 19:24:53 -0400, Tom Lane wrote:
> But I happened to notice the accumulated CPU time for the background
> processes:
>
> USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
> tgl 19048 0.0 4.4 229952 92196 ?? Ss 3:19PM 19:59.19 postgres: startup recovering 000000010000001400000022
> tgl 19051 0.0 0.1 229656 1696 ?? Ss 3:19PM 27:09.14 postgres: walreceiver streaming 14/227D8F14
> tgl 19052 0.0 0.1 229904 2516 ?? Ss 3:19PM 17:38.17 postgres: walsender tgl [local] streaming 14/227D8F14
>
> IOW, we've spent over twice as many CPU cycles shipping data to the
> standby as we did in applying the WAL on the standby. Is this
> expected? I've got wal_consistency_checking = all, which is bloating
> the WAL volume quite a bit, but still it seems like the walsender and
> walreceiver have little excuse for spending more cycles per byte
> than the startup process.

I don't really know how the time calculation works on mac. Is there a
chance it includes time spent doing IO? On the primary the WAL IO is
done by a lot of backends, but on the standby it's all going to be the
walreceiver. And the walreceiver does fsyncs in a not particularly
efficient manner.

FWIW, on my linux workstation no such difference is visible:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
andres 2910540 9.4 0.0 2237252 126680 ? Ss 16:55 0:20 postgres: dev assert standby: startup recovering 00000001000000020000003F
andres 2910544 5.2 0.0 2236724 9260 ? Ss 16:55 0:11 postgres: dev assert standby: walreceiver streaming 2/3FDCF118
andres 2910545 2.1 0.0 2237036 10672 ? Ss 16:55 0:04 postgres: dev assert: walsender andres [local] streaming 2/3FDCF118

> (This is testing b3ee4c503, so if Thomas' WAL changes improved
> efficiency of the replay process at all, the discrepancy could be
> even worse in HEAD.)

The prefetching isn't enabled by default, so I'd not expect meaningful
differences... And even with the prefetching enabled, our normal
regression tests largely are resident in s_b, so there shouldn't be much
prefetching.

Oh! I was about to ask how much shared buffers your primary / standby
have. And I think I may actually have reproduce a variant of the issue!

I previously had played around with different settings that I thought
might increase the likelihood of reproducing the problem. But this time
I set shared_buffers lower than before, and got:

2021-04-28 17:03:22.174 PDT [2913840][] LOG: database system was shut down in recovery at 2021-04-28 17:03:11 PDT
2021-04-28 17:03:22.174 PDT [2913840][] LOG: entering standby mode
2021-04-28 17:03:22.178 PDT [2913840][1/0] LOG: redo starts at 2/416C6278
2021-04-28 17:03:37.628 PDT [2913840][1/0] LOG: consistent recovery state reached at 4/7F5C3200
2021-04-28 17:03:37.628 PDT [2913840][1/0] FATAL: invalid memory alloc request size 3053455757
2021-04-28 17:03:37.628 PDT [2913839][] LOG: database system is ready to accept read only connections
2021-04-28 17:03:37.636 PDT [2913839][] LOG: startup process (PID 2913840) exited with exit code 1

This reproduces across restarts. Yay, I guess.

Isn't it off that we get a "database system is ready to accept read only
connections"?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-04-29 00:24:43 Re: WIP: WAL prefetch (another approach)
Previous Message David Rowley 2021-04-28 23:51:07 Re: Use simplehash.h instead of dynahash in SMgr