Re: WAL still kept in pg_xlog even long after heavy workload is done

From: MichaelDBA <MichaelDBA(at)sqlexec(dot)com>
To: Stefan Petrea <Stefan(dot)Petrea(at)tangoe(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: WAL still kept in pg_xlog even long after heavy workload is done
Date: 2017-10-27 12:50:01
Message-ID: 59F32B79.2070307@sqlexec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

To get the values right, you have to consider the "unit" column in
pg_settings. On mine, it is 16M for both min and max wal size. So it
would be
1024 x 1024 x 16 x <value> (pg_settings.min_wal_size or
pg_settings.max_wal_size)

The result of this formula should be close to what you specified in
postgresql.conf.

So modify your SQL a bit:
psql -c "select name, setting, unit from pg_settings where name like
'%wal_size';"

Regards,
Michael Vitale

> Stefan Petrea <mailto:Stefan(dot)Petrea(at)tangoe(dot)com>
> Friday, October 27, 2017 7:28 AM
> Hello,
>
> We're encountering some problems with WAL growth in production with
> PostgreSQL 9.6.3 and 9.6.2. From what I know a WAL file can either be
> recycled(and would be reused) or deleted.
> We'd like to have better control over the amount of WAL that is kept
> around.
> There were a few occasions where we had to resize partitions because
> pg_xlog grew as much as it did.
>
> According to the docs [1] there are some parameters in GUC
> (postgresql.conf) about this.
> The parameters I've been able to identify are the following:
>
> * wal_keep_segments
> * max_wal_size
> * min_wal_size
>
> Our WAL grows a lot around the time of our product upgrades (that is,
> when we upgrade to a new version of our database, so not a Pg upgrade,
> just a newer version of our db schema, plpgsql code etc).
> As part of this upgrade, we add new columns or have some large UPDATEs
> on tables as big as 300M (but in one case we also have one with 1.5B
> rows).
>
> I am seeing the following int he docs [3]
>
> min_wal_size (integer)
> As long as WAL disk usage stays below this setting, old WAL files are
> always recycled for future use at a checkpoint, rather than removed.
> This can be used to ensure that enough WAL space is reserved to handle
> spikes in WAL usage, for example when running large batch jobs. The
> default
> is 80 MB. This parameter can only be set in the postgresql.conf file or
> on the server command line.
>
> This sounds very familiar because, that's essentially what we're
> doing. There
> are some large jobs that cause a lot of workload and changes and
> generate a lot of WAL.
>
> So far, the way I interpret this is min_wal_size is the amount of WAL
> recycled (that is kept around to be reused) and max_wal_size is the
> total amount of WAL allowed to be kept on disk.
>
> I would also like to interpret the default values of min_wal_size and
> max_wal_size.
> So if I run the following query:
>
> psql -c "select name, setting from pg_settings where name like
> '%wal_size';"
>
> I get the following:
>
> max_wal_size|2097152
> min_wal_size|1048576
>
> Do these two values look ok?
>
> Both these values were generated by pgtune [4], but it seems like pgtune
> thinks they're expressed by default in KB.
> Looking at the PostgreSQL code, it seems to me that these two are
> expressed in MB, at least that's what I understand when I see
> GUC_UNIT_MB in the source code [6].
>
> So maybe the pgtune fork we're using has a bug in the sense that it
> produces an incorrect value for those two parameters? (should be in MB
> but is expressed in KB, therefore much higher than what it should be).
>
> Another question is, how can I use any of the checkpoint settings
> to control the WAL that is kept around?
>
> * checkpoint_timeout
> * checkpoint_completion_target
> * checkpoint_flush_after
> * checkpoint_warning
>
> =========
>
> I actually tried something with these settings on a test environment.
> I've used the following settings:
>
> checkpoint_timeout = 40s
> min_wal_size = 600MB
> max_wal_size = 900MB
>
> Then I've created a db named x1 and ran this on it four or five times.
>
> pgbench -i -s 70 x1
>
> The pg_xlog directory grew to 2.2G and after a few minutes, it
> decreased to 2.0G
> After about 40 minutes it decreased to 1.4G and it's not going any lower.
> I was expecting pg_xlog's size to be 600MB after the first WAL removal
> had run.
> Should I expect that the size will eventually drop to 600MB or will it
> just sit there at 1.4G?
>
> =========
>
> Other thoughts:
>
> I have looked a bit at Pg internals too, I'm seeing four functions
> there that are responsible for removing WAL: XLogArchiveIsReady,
> RemoveXlogFile, RemoveOldXlogFiles, XLOGfileslop.
> All of these belong to /src/backend/access/transam/xlog.c
>
> The only place in the code that seems to take a decision about how much
> WAL to recycle and how much to remove is the function XLOGfileslop [2].
>
> It seems like XLOGfileslop is an estimate for the number of WAL to keep
> around(recycled WAL). Both max_wal_size and min_wal_size are used inside
> XLOGfileslop.
>
> As far as checkpoint_* GUC settings go, they seem to be involved as well.
> So far, the only thing I know about checkpoints is that between
> checkpoints, many WAL are created. The amount of WAL between checkpoints
> can vary. I don't have a good understanding about the interplay between
> checkpoints and WAL.
>
>
> I'd be grateful for any thoughts on how to improve this, and better
> control
> the amount of WAL kept in pg_xlog.
>
> Thank you,
> Stefan
>
> [1] https://www.postgresql.org/docs/9.6/static/wal-configuration.html
> [2]
> https://github.com/postgres/postgres/blob/0c5803b450e0cc29b3527df3f352e6f18a038cc6/src/backend/access/transam/xlog.c#L2258
> [3]
> https://www.postgresql.org/docs/9.6/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-CHECKPOINTS
> [4] https://github.com/kmatt/pgtune
> [5] https://github.com/kmatt/pgtune/blob/master/pgtune#L560
> [6]
> https://github.com/postgres/postgres/blob/f49842d1ee31b976c681322f76025d7732e860f3/src/backend/utils/misc/guc.c#L2268
>
>
> Stefan Petrea
> System Engineer
>
> stefan(dot)petrea(at)tangoe(dot)com
>
>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Benjamin Coutu 2017-10-29 11:24:20 Cheaper subquery scan not considered unless offset 0
Previous Message Stefan Petrea 2017-10-27 11:28:15 WAL still kept in pg_xlog even long after heavy workload is done