Re: patch to allow disable of WAL recycling

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jerry Jelinek <jerry(dot)jelinek(at)joyent(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch to allow disable of WAL recycling
Date: 2019-03-29 00:09:46
Message-ID: CA+hUKGKxwXpEzExn=A+jx9iWxdvgbF_MzrVqOq--xM+_8A4VBg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 29, 2019 at 10:47 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Fri, Mar 29, 2019 at 8:59 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Tue, Mar 26, 2019 at 3:24 PM Jerry Jelinek <jerry(dot)jelinek(at)joyent(dot)com> wrote:
> > > The latest patch is rebased, builds clean, and passes some basic testing. Please let me know if there is anything else I could do on this.
> >
> > I agree with Thomas Munro's earlier critique of the documentation.
> > The documentation of the new parameters makes an assumption,
> > completely unsupported in my view, about when those parameters should
> > be set, yet at the same time gives almost no information about what
> > they actually do. I don't like that.
> >
> > The patch needs a visit from pgindent, too.
>
> I would like to fix these problems and commit the patch. First, I'm
> going to go and do some project-style tidying, write some proposed doc
> tweaks, and retest these switches on the machine where I saw
> beneficial effects from the patch before. I'll post a new version
> shortly to see if anyone has objections.

Here's a new version of the patch.

Last time I ran the test, I was using FreeBSD 11.2, but now I'm on
FreeBSD 12.0, and I suspect something changed in how it respects
the ARC size sysctls, causing it to behave very badly, so this time I
left them at their defaults. Also, the disks have changed
from 7200RPM drives to 5400RPM drives since last time. The machine
has 2 underpowered cores and 6GB of RAM. What can I say, it's a super
low end storage/backup box. What's interesting is that it does show
the reported problem. Actually, I often test stuff relating to OS
caching on this box precisely because the IO sticks out so much.

Some OS set-up steps run as root:

zfs create zroot/tmp/test
zfs set mountpoint=/tmp/test zroot/tmp/test
zfs set compression=off zroot/tmp/test
zfs set recordsize=8192 zroot/tmp/test
chown tmunro:tmunro /tmp/test

Now as my regular user:

initdb -D /tmp/test
cat <<EOF >> /tmp/test/postgresql.conf
fsync=off
max_wal_size = 600MB
min_wal_size = 600MB
EOF

I started postgres -D /tmp/test and I set up pgbench:

pgbench -i -s 100 postgres

Then I ran each test as follows:

tar cvf /dev/null /tmp/test # make sure all data files are pre-warmed into ARC
for i in 1 2 3 ; do
  pgbench -M prepared -c 4 -j 4 -T 120 postgres
done

I did that with all 4 GUC permutations and got the following TPS numbers:

wal_recycle=off, wal_init_zero=off: 2668, 1873, 2166
wal_recycle=on,  wal_init_zero=off: 1936, 1350, 1552
wal_recycle=off, wal_init_zero=on:  2213, 1360, 1539
wal_recycle=on,  wal_init_zero=on:  1539, 1007, 1252
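
For a quick side-by-side comparison, the three runs per configuration can
be averaged. This is only a minimal sketch using awk; the helper name
mean_tps is made up for illustration, and the figures are simply the TPS
numbers listed above:

```shell
# mean_tps is a hypothetical helper: it prints the mean of its
# arguments, rounded to the nearest whole number.
mean_tps() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.0f\n", sum / NR }'
}

# TPS figures copied from the 2-minute runs above.
echo "wal_recycle=off, wal_init_zero=off: $(mean_tps 2668 1873 2166)"
echo "wal_recycle=on,  wal_init_zero=off: $(mean_tps 1936 1350 1552)"
echo "wal_recycle=off, wal_init_zero=on:  $(mean_tps 2213 1360 1539)"
echo "wal_recycle=on,  wal_init_zero=on:  $(mean_tps 1539 1007 1252)"
```

With only three short runs per configuration the spread is wide, so these
averages are a rough guide rather than a precise measurement.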

Finally, conscious that 2 minutes isn't really enough, I did a 10
minute run with both settings on and both off, again with the tar
command first to try to give them the same initial conditions (really
someone should write a "drop-caches-now" patch for FreeBSD that
affects the page cache and the ZFS ARC, but I digress) and got:

wal_recycle=on,  wal_init_zero=on:  1468
wal_recycle=off, wal_init_zero=off: 2046
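
To put the 10-minute result in relative terms, the gain from turning both
settings off can be worked out from the two figures above. A small sketch
follows; percent_gain is a hypothetical helper name, not anything from the
patch:

```shell
# TPS figures from the two 10-minute runs above.
both_on=1468
both_off=2046

# percent_gain is a hypothetical helper: it prints by how many percent
# the second argument exceeds the first, to one decimal place.
percent_gain() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

echo "both off vs both on: +$(percent_gain "$both_on" "$both_off")%"
```

That works out to roughly 39% more transactions per second with both
settings off on this particular machine.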

I still don't know why exactly this happens, but it's clearly a real
phenomenon. As for why Tomas Vondra couldn't see it, I'm guessing
that stacks more RAM and ~500k IOPS help a lot (essentially the
opposite end of the memory, CPU, IO spectrum from this little
machine), and Joyent's systems may be somewhere in between?

--
Thomas Munro
https://enterprisedb.com

Attachment Content-Type Size
0001-Add-wal_recycle-and-wal_init_zero-GUCs.patch application/octet-stream 10.0 KB
