Re: Use fadvise in wal replay

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kirill Reshke <reshke(at)double(dot)cloud>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use fadvise in wal replay
Date: 2022-08-07 16:41:16
Message-ID: 96C1BADB-362B-4C02-8799-0118B44C2025@yandex-team.ru
Lists: pgsql-hackers

> On 7 Aug 2022, at 06:39, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> Agree. Why can't we just prefetch the entire WAL file once whenever it
> is opened for the first time? Does the OS have any limitations on max
> size to prefetch at once? It may sound aggressive, but it avoids
> fadvise() system calls, this will be especially useful if there are
> many WAL files to recover (crash, PITR or standby recovery),
> eventually we would want the total WAL file to be prefetched.
>
> If prefetching the entire WAL file is okay, we could further do this:
> 1) prefetch in XLogFileOpen() and all of segment_open callbacks, 2)
> release in XLogFileClose (it's being done right now) and all of
> segment_close callbacks - do this perhaps optionally.
>
> Also, can't we use an existing function FilePrefetch()? That way,
> there is no need for a new wait event type.
>
> Thoughts?

Thomas suggested this idea upthread. Jakub's benchmarks showed that prefetching the whole WAL file gave no significant improvement over the existing master code.
The same benchmarks showed an almost 1.5x improvement with readahead in 8 kB or 128 kB chunks.
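
To make the difference concrete, here is a minimal sketch of chunked readahead, as opposed to one fadvise over the whole segment. It is not the code from the patch: the names next_advise_range() and wal_readahead() and the "keep one chunk ahead of the reader" policy are assumptions made for illustration only. The only real API used is posix_fadvise() with POSIX_FADV_WILLNEED.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the patch): decide which byte range to
 * advise next. The policy assumed here is that the chunk containing
 * read_offset and the chunk after it should always be covered.
 *
 * advised_upto is the end of the range already advised for this segment.
 * Returns 1 and fills *start / *len if a new range should be advised,
 * 0 if the advised window is already far enough ahead.
 */
int
next_advise_range(off_t advised_upto, off_t read_offset,
                  off_t chunk_size, off_t segment_size,
                  off_t *start, off_t *len)
{
    assert(chunk_size > 0);

    /* End of the chunk that follows the one being read. */
    off_t want = (read_offset / chunk_size + 2) * chunk_size;

    if (want > segment_size)
        want = segment_size;        /* never advise past the segment end */
    if (want <= advised_upto)
        return 0;                   /* nothing new to advise */

    /* Skip bytes that were already advised or already passed by the reader. */
    *start = (advised_upto > read_offset) ? advised_upto : read_offset;
    *len = want - *start;
    return 1;
}

/*
 * Hypothetical wrapper: call before each read at read_offset.
 * Returns 1 if a posix_fadvise() call was issued, 0 if none was needed,
 * -1 if posix_fadvise() reported an error.
 */
int
wal_readahead(int fd, off_t *advised_upto, off_t read_offset,
              off_t chunk_size, off_t segment_size)
{
    off_t start;
    off_t len;

    if (!next_advise_range(*advised_upto, read_offset,
                           chunk_size, segment_size, &start, &len))
        return 0;

    /* posix_fadvise() returns 0 on success or an error number on failure. */
    if (posix_fadvise(fd, start, len, POSIX_FADV_WILLNEED) != 0)
        return -1;

    *advised_upto = start + len;
    return 1;
}
```

With a 128 kB chunk and 8 kB reads, this issues one fadvise call per 128 kB of WAL read rather than one per page or one per segment. Whether that matches the policy that produced Jakub's numbers is not something this sketch claims.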

Best regards, Andrey Borodin.
