Re: WIP: WAL prefetch (another approach)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: WAL prefetch (another approach)
Date: 2020-09-08 23:16:27
Message-ID: 20200908231235.lsgat5ni35h6odqt@development
Lists: pgsql-hackers

On Sat, Sep 05, 2020 at 12:05:52PM +1200, Thomas Munro wrote:
>On Wed, Sep 2, 2020 at 2:18 AM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> On Wed, Sep 02, 2020 at 02:05:10AM +1200, Thomas Munro wrote:
>> >On Wed, Sep 2, 2020 at 1:14 AM Tomas Vondra
>> ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> >> from the archive
>> >
>> >Ahh, so perhaps that's the key.
>>
>> Maybe. For the record, the commands look like this:
>>
>> archive_command = 'gzip -1 -c %p > /mnt/raid/wal-archive/%f.gz'
>>
>> restore_command = 'gunzip -c /mnt/raid/wal-archive/%f.gz > %p.tmp && mv %p.tmp %p'
>
>Yeah, sorry, I goofed here by not considering archive recovery
>properly. I have special handling for crash recovery from files in
>pg_wal (XLRO_END, means read until you run out of files) and streaming
>replication (XLRO_WALRCV_WRITTEN, means read only as far as the wal
>receiver has advertised as written in shared memory), as a way to
>control the ultimate limit on how far ahead to read when
>maintenance_io_concurrency and max_recovery_prefetch_distance don't
>limit you first. But if you recover from a base backup with a WAL
>archive, it uses the XLRO_END policy, which can run out of files
>simply because a new file hasn't been restored yet, so it gives up
>prefetching too soon, as you're seeing. That doesn't cause any
>damage, but it stops doing anything useful because the prefetcher
>thinks its job is finished.
>
>It'd be possible to fix this somehow in the two-XLogReader design, but
>since I'm testing a new version that has a unified
>XLogReader-with-read-ahead I'm not going to try to do that. I've
>added a basebackup-with-archive recovery to my arsenal of test
>workloads to make sure I don't forget about archive recovery mode
>again, but I think it's actually harder to get this wrong in the new
>design. In the meantime, if you are still interested in studying the
>potential speed-up from WAL prefetching using the most recently shared
>two-XLogReader patch, you'll need to unpack all your archived WAL
>files into pg_wal manually beforehand.

OK, thanks for looking into this. I guess I'll wait for an updated patch
before testing this further. The storage has limited capacity, so I'd
have to either reduce the amount of data/WAL or juggle the WAL segments
somehow. It doesn't seem worth it.
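For anyone who does want to try the manual workaround with the archive commands shown above, the pre-unpacking step might look like the sketch below. The paths are hypothetical (here a throwaway sandbox stands in for /mnt/raid/wal-archive and pg_wal); it mirrors the restore_command's write-to-.tmp-then-rename trick so recovery never sees a half-written segment:

```shell
# Sandbox stand-ins; in practice ARCHIVE would be the wal-archive dir
# and PGWAL would be $PGDATA/pg_wal.
ARCHIVE=$(mktemp -d)
PGWAL=$(mktemp -d)

# Fake one archived, gzip'd segment so the loop has something to do.
printf 'fake WAL payload' > "$ARCHIVE/000000010000000000000001"
gzip "$ARCHIVE/000000010000000000000001"

# Decompress every archived segment into pg_wal, renaming into place
# only once the whole file has been written.
for f in "$ARCHIVE"/*.gz; do
    seg=$(basename "$f" .gz)
    gunzip -c "$f" > "$PGWAL/$seg.tmp" && mv "$PGWAL/$seg.tmp" "$PGWAL/$seg"
done
```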

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
