Re: patch to allow disable of WAL recycling

From: Jerry Jelinek <jerry(dot)jelinek(at)joyent(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: patch to allow disable of WAL recycling
Date: 2018-07-10 20:15:30
Message-ID: CACPQ5Fr7W19BHV+0Qn1RLHE1UZe1T5HzbAvxRRU8+J3BmCSGEg@mail.gmail.com
Lists: pgsql-hackers

Thanks to everyone who took the time to look at the patch and send me
feedback. I'm happy to work on improving the documentation of this new
tunable to clarify when it should be used and the implications. I'm trying
to understand more specifically what else needs to be done next. To
summarize, I think the following general concerns were brought up.

1) Disabling WAL recycling could have a negative performance impact on a
COW filesystem if all WAL files could be kept in the filesystem cache.
2) Disabling WAL recycling reduces reliability, even on COW filesystems.
3) Using something like posix_fadvise to reload recycled WAL files into the
filesystem cache is better even for a COW filesystem.
4) There are "several" other purposes for WAL recycling which this tunable
would impact.
5) A WAL recycling tunable is too specific and a more general solution is
needed.
6) Need more performance data.

For #1, #2 and #3, I don't understand these concerns. It would be helpful
if these could be more specific.

For #4, can anybody enumerate these other purposes for WAL recycling?

For #5, perhaps I am making an incorrect assumption about what the original
response was requesting, but I understand that WAL recycling is just one
aspect of WAL file creation/allocation. However, the creation of a new WAL
file is not a problem we've ever observed. In general, any modern
filesystem should do a good job of caching recently accessed files. We've
never observed a problem with the allocation of a new WAL file slightly
before it is needed. The problem we have observed is specifically around
WAL file recycling when we have to access old files that are long gone from
the filesystem cache. The semantics around recycling seem pretty crisp as
compared to some other tunable which would completely change how WAL files
are created. Given that a change like that is also much more intrusive, it
seems better to provide a tunable to disable WAL recycling vs. some other
kind of tunable for which we can't articulate any improvement except in the
recycling scenario.

For #6, there is no feasible way for us to recreate our workload on other
operating systems or filesystems. Can anyone expand on what performance
data is needed?

I'd like to restate the original problem we observed.

When PostgreSQL decides to reuse an old WAL file whose contents have been
evicted from the cache (because they haven't been used in hours), this
turns what should be a workload bottlenecked by synchronous write
performance (that can be well-optimized with an SSD log device) into a
random read workload (that's much more expensive for any system). What's
significantly worse is that we saw this on synchronous standbys. When that
happened, the WAL receiver was blocked on a random read from disk, and
since it's single-threaded, all write queries on the primary stop until the
random read finishes. This is particularly bad for us when the sync standby
is doing other I/O (e.g., for an autovacuum or a database backup) that
causes disk reads to take hundreds of milliseconds.
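
To make the two paths concrete, here is a rough sketch in Python. It is
not PostgreSQL's actual code: the function names and file paths are made up
for illustration, and it only mimics the file operations involved
(rename-and-overwrite for recycling, zero-filled creation otherwise).

```python
import os

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # PostgreSQL's default WAL segment size
BLOCK_SIZE = 8192                    # write granularity assumed for this sketch


def create_segment(path, size=WAL_SEGMENT_SIZE):
    """Recycling disabled: allocate a brand-new, zero-filled segment.

    Every block is freshly written, so nothing has to be read from disk.
    """
    zeros = bytes(BLOCK_SIZE)
    with open(path, "wb") as f:
        for _ in range(size // BLOCK_SIZE):
            f.write(zeros)
        f.flush()
        os.fsync(f.fileno())
    return path


def recycle_segment(old_path, new_path):
    """Recycling enabled: reuse an old segment by renaming it.

    The file keeps its old blocks, which may have left the cache long ago.
    """
    os.rename(old_path, new_path)
    return new_path


def write_wal(path, offset, data):
    """Overwrite part of an existing segment in place.

    On a recycled segment this is where the read-modify-write can occur:
    if the old block is not cached, it must be read before it is modified.
    """
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
```

With recycling, write_wal() lands on blocks that already exist on disk;
without it, the segment is written from scratch and no old data needs to
be read back in.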

To summarize, recycling old WAL files seems like an optimization designed
for certain filesystems that allocate disk blocks up front. Given that the
existing behavior is already filesystem specific, are there specific reasons
why we can't provide a tunable to disable this behavior for filesystems
which don't behave that way?

Thanks again,
Jerry

On Tue, Jun 26, 2018 at 7:35 AM, Jerry Jelinek <jerry(dot)jelinek(at)joyent(dot)com>
wrote:

> Hello All,
>
> Attached is a patch to provide an option to disable WAL recycling. We have
> found that this can help performance by eliminating read-modify-write
> behavior on old WAL files that are no longer resident in the filesystem
> cache. There is a lot more detail on the background of the motivation for
> this in the following thread.
>
> https://www.postgresql.org/message-id/flat/CACukRjO7DJvub8e2AijOayj8BfKK3XXBTwu3KKARiTr67M3E3w%40mail.gmail.com#CACukRjO7DJvub8e2AijOayj8BfKK3XXBTwu3KKARiTr67M3E3w(at)mail(dot)gmail(dot)com
>
> A similar change has been tested against our 9.6 branch that we're
> currently running, but the attached patch is against master.
>
> Thanks,
> Jerry
>
>
