Re: WAL insert delay settings

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Stephen Frost <sfrost(at)snowman(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL insert delay settings
Date: 2019-02-19 20:56:18
Message-ID: CAKJS1f8KNB2SChDgP_CoSHA2D=SVd839Ov_NyKLo-gbvsdLrDQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, 20 Feb 2019 at 07:28, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Or maybe we should just blow up the current vacuum cost delay stuff
> and replace it with something that is easier to tune. For example, we
> could just have one parameter that sets the maximum read rate in kB/s
> and another that sets the maximum dirty-page rate in kB/s. Whichever
> limit is tighter binds. If we also have the thing that is the topic
> of this thread, that's a third possible upper limit.

I had similar thoughts when I saw that Peter's proposal didn't seem
all that compatible with how the vacuum cost delays work today. I
agree the cost limit would have to turn into something time-based
rather than points-based.
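
To illustrate the difference (just a sketch, not PostgreSQL code; all
the names below are invented): today's machinery accumulates points
per page and sleeps once a point budget is exhausted, whereas a
time-based limit would compute the sleep directly from bytes per
second, with whichever of the two rates is tighter producing the
longer sleep.

#include <math.h>

/*
 * Points-based (roughly how the vacuum cost delay machinery behaves):
 * each page access adds points and we sleep once the balance crosses
 * a limit.  The resulting I/O rate depends on the hit/miss/dirty mix,
 * which is what makes it hard to tune.
 */
static double
points_based_delay_secs(int cost_balance, int cost_limit,
                        double cost_delay_ms)
{
    return (cost_balance >= cost_limit) ? cost_delay_ms / 1000.0 : 0.0;
}

/*
 * Time-based: given the bytes read and dirtied since the last sleep
 * and how long that took, sleep whatever extra time is needed so that
 * neither cap is exceeded.  Taking the larger required duration is
 * what makes the tighter limit bind.
 */
static double
rate_based_delay_secs(double bytes_read, double bytes_dirtied,
                      double elapsed_secs,
                      double max_read_kb_per_sec,
                      double max_dirty_kb_per_sec)
{
    double  need_read = bytes_read / (max_read_kb_per_sec * 1024.0);
    double  need_dirty = bytes_dirtied / (max_dirty_kb_per_sec * 1024.0);
    double  need = fmax(need_read, need_dirty);

    return (need > elapsed_secs) ? need - elapsed_secs : 0.0;
}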

To me, it seems just too crude to have a per-backend limit. I think
global "soft" limits would be better. Let's say, for example, the DBA
would like to CREATE INDEX CONCURRENTLY on a 6TB table. They think
this is going to take about 36 hours, so they start the operation at
the beginning of the off-peak period, which is expected to last 12
hours. This means the create index is going to run over 2 off-peak
periods and 1 on-peak period. Must they really configure the create
index to run at a speed that is suitable for peak load? That's pretty
wasteful, as surely it could run much more quickly during the
off-peak periods.

I know there's debate as to whether we can rate limit WAL at all,
but if we can find a way to do that, then it seems to me a pair of
settings like the following could work:

max_soft_global_wal_rate (MB/sec)
min_hard_local_wal_rate (MB/sec)

The idea is that the rate-limited process would slow down to
min_hard_local_wal_rate whenever the combined WAL rate of all
processes exceeds max_soft_global_wal_rate. The
min_hard_local_wal_rate is just there to ensure the process never
stops completely; it can simply tick along at that rate until the
global WAL rate slows down again.
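
Very roughly, and with the caveat that these are made-up names and
that how the global rate would actually be sampled is hand-waved, the
behaviour I'm describing is just:

/*
 * Sketch only.  Both GUCs are taken in bytes/sec here for simplicity,
 * and global_wal_rate is assumed to come from periodically sampling
 * the cluster-wide WAL insert position.
 */
static double
target_local_wal_rate(double global_wal_rate,
                      double max_soft_global_wal_rate,
                      double min_hard_local_wal_rate,
                      double unthrottled_local_rate)
{
    /*
     * While everyone combined stays under the soft global limit, let
     * this backend run at full speed.
     */
    if (global_wal_rate <= max_soft_global_wal_rate)
        return unthrottled_local_rate;

    /*
     * Otherwise drop to the hard local floor, which guarantees the
     * operation keeps ticking along rather than stalling completely,
     * until the global rate falls back below the soft limit.
     */
    return min_hard_local_wal_rate;
}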

It's likely going to be easier to do something like that for WAL than
for buffer reads/writes/dirties, since we can already easily see how
much WAL has been written by looking at the current LSN.
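
For example (a sketch only; GetXLogInsertRecPtr() exists today and
returns the approximate current WAL insert location, but the rest of
the names are invented), the global WAL rate could be sampled with
something like:

#include "postgres.h"
#include "access/xlog.h"
#include "portability/instr_time.h"

/* Invented struct for this sketch: remembers the previous sample. */
typedef struct WalRateSample
{
    XLogRecPtr  last_lsn;
    instr_time  last_time;
} WalRateSample;

/*
 * Return the cluster-wide WAL insert rate in bytes/sec since the
 * previous call, then remember the current position and time for the
 * next call.
 */
static double
sample_global_wal_rate(WalRateSample *s)
{
    XLogRecPtr  now_lsn = GetXLogInsertRecPtr();
    instr_time  now_time;
    double      secs;
    double      rate = 0.0;

    INSTR_TIME_SET_CURRENT(now_time);
    secs = INSTR_TIME_GET_DOUBLE(now_time) -
           INSTR_TIME_GET_DOUBLE(s->last_time);

    if (secs > 0)
        rate = (double) (now_lsn - s->last_lsn) / secs;

    s->last_lsn = now_lsn;
    s->last_time = now_time;

    return rate;
}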

(Of course, these GUC names are not very good. I just picked them out
of the air quickly to try to get their meaning across.)

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
