Re: max_wal_size

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: p(dot)luzanov(at)postgrespro(dot)ru
Cc: pluzanov(at)postgrespro(dot)ru, pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: max_wal_size
Date: 2020-05-27 18:11:33
Message-ID: CAKFQuwYOGSRbcWnSEo7vp7JJktdH9rKJM0T8Q3B6J0QJcq==hA@mail.gmail.com
Lists: pgsql-docs

On Wed, May 27, 2020 at 9:17 AM <p(dot)luzanov(at)postgrespro(dot)ru> wrote:

> David,
>
> > This setting is the indirect means to ensure that the WAL directory
> > doesn't get too large by forcing a checkpoint thus allowing the
> > corresponding WAL to be removed.
>
>
> This is a soft limit, ok.
> But the question is a little different.
>
> Suppose we have: version >= 11, no replication slots, archive_mode =
> off.
> Checkpoint_timeout is big enough, so checkpoints triggered only by
> max_wal_size (1GB).
> checkpoint_completion_target = 1.
>
> What size of WAL files will be generated between checkpoints?
> 1GB or 0.5GB?
>
> As I understand the description of max_wal_size(Maximum size to let the
> WAL grow to between automatic WAL checkpoints), the answer is 1GB.
> But it seems that the right answer is 0.5GB.
>
>
Given how long it took me to come up with the answer, I'm not going to claim
the documentation shouldn't be improved...or that the following is even
correct...especially since I haven't performed any tests.

I see better where you are coming from now. In your example the system
operates under two simultaneous constraints: the directory should not
take up more than X amount of space, and there should be zero wait time
between the end of the last checkpoint and the start of the next one,
where the next one starts at the X amount mark. The (unstated) goal is
to minimize I/O throughput allocated to WAL. Thus it should write out half
of the maximum data in exactly the same amount of time that it takes for a
new half of the maximum data to accumulate. If it writes any slower it
will have to wait at the end. With X = 1GB that is 0.5GB of WAL between
checkpoints, i.e. n / ( 1 + 1.0 ) = n * 1/2.

For checkpoint_completion_target = 0.5 you get 2/3rds consumption:
n / ( 1 + 0.5 ) = n * 2/3, though my head hurts too much at the moment to
fully explain the timing pattern. Unlike the 1.0 case, there is downtime
during which only non-checkpoint-induced writing is performed, and the
write rate is chosen with that in mind so that some time is left at the
end of each cycle.
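
As a rough sketch of the arithmetic above (untested against a real server;
the function name is made up for illustration, and n / ( 1 + target ) is
only the estimate used in this message, not a quote of the PostgreSQL
source), the two cases work out like this:

```python
def wal_between_checkpoints(max_wal_size_mb: float, completion_target: float) -> float:
    """Estimate the WAL volume (in MB) that accumulates between checkpoint starts.

    Hypothetical helper: it only encodes the n / (1 + target) estimate
    discussed in this thread, with n = max_wal_size.
    """
    if not 0.0 <= completion_target <= 1.0:
        raise ValueError("completion_target must be between 0 and 1")
    return max_wal_size_mb / (1.0 + completion_target)


if __name__ == "__main__":
    # max_wal_size = 1GB, as in the example being discussed.
    for target in (1.0, 0.5):
        estimate = wal_between_checkpoints(1024, target)
        print(f"checkpoint_completion_target={target}: about {estimate:.0f} MB")
```

With max_wal_size = 1GB this gives half of the limit for a target of 1.0
and two thirds of it for 0.5, matching the figures above.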

David J.
