Re: WAL insert delay settings

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL insert delay settings
Date: 2019-02-20 23:46:09
Message-ID: 20190220234609.GC6197@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
> On 2/20/19 10:43 PM, Stephen Frost wrote:
> > Just to share a few additional thoughts after pondering this for a
> > while, but the comment Andres made up-thread really struck a chord- we
> > don't necessarily want to throttle anything, what we'd really rather do
> > is *prioritize* things, whereby foreground work (regular queries and
> > such) has a higher priority than background/bulk work (VACUUM, REINDEX,
> > etc) but otherwise we use the system to its full capacity. We don't
> > actually want to throttle a VACUUM run any more than a CREATE INDEX, we
> > just don't want those to hurt the performance of regular queries that
> > are happening.
>
> I think you're forgetting the motivation of this very patch was to
> prevent replication lag caused by a command generating large amounts of
> WAL (like CREATE INDEX / ALTER TABLE etc.). That has almost nothing to
> do with prioritization or foreground/background split.
>
> I'm not arguing against the ability to prioritize stuff, but I disagree
> that it somehow replaces throttling.

Why is replication lag an issue though? I would contend it's an issue
because with sync replication, it makes foreground processes wait, and
with async replication, it makes the actions of foreground processes
show up late on the replicas.

If the actual WAL records for the foreground processes got priority and
were pushed out earlier than the background ones, that would eliminate
both of those issues with replication lag. Perhaps there are other
issues that replication lag causes which aren't solved by prioritizing
the actual WAL records that you care about getting to the replicas
faster, but if so, I'd like to hear what those are.

> > The other thought I had was that doing things on a per-table basis, in
> > particular, isn't really addressing the resource question appropriately.
> > WAL is relatively straightforward and is a resource independent of
> > the IO for the heap/indexes, so getting an idea from the admin of how
> > much capacity they have for WAL makes sense. When it comes to the
> > capacity for the heap/indexes, in terms of IO, that really goes to the
> > underlying storage system/channel, which would actually be a tablespace
> > in properly set up environments (imv anyway).
> >
> > Wrapping this up- it seems unlikely that we're going to get a
> > priority-based system in place any time particularly soon but I do think
> > it's worthy of some serious consideration and discussion about how we
> > might be able to get there. On the other hand, if we can provide a way
> > for the admin to say "these are my IO channels (per-tablespace values,
> > plus a value for WAL), here's what their capacity is, and here's how
> > much buffer for foreground work I want to have (again, per IO channel),
> > so, PG, please arrange to not use more than 'capacity-buffer' amount of
> > resources for background/bulk tasks (per IO channel)" then we can at
> > least help them address the issue that foreground tasks are being
> > stalled or delayed due to background/bulk work. This approach means
> > that they won't be utilizing the system to its full capacity, but
> > they'll know that and they'll know that it's because, for them, it's
> > more important that they have that low latency for foreground tasks.
>
> I think it's a mostly orthogonal feature to throttling.

I'm... not sure that what I was getting at above really got across.

What I was saying above, in a nutshell, is that if we're going to
provide throttling then we should give users a way to configure the
throttling on a per-IO-channel basis, which means at the tablespace
level, plus an independent configuration option for WAL since we allow
that to be placed elsewhere too.

Ideally, the configuration parameter would be in the same units as the
actual resource, which would probably be IOPS plus bandwidth.
Configuring it in terms of bandwidth alone ends up being a mismatch
with reality: users would have to tune it down further than they might
otherwise need to, and therefore give up that much more in terms of
system capability.
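
A rough sketch of the 'capacity-buffer' idea, in Python purely for
illustration. Nothing here is PostgreSQL code or a proposed GUC: the
names IOChannel, background_budget and min_batch_seconds are made up,
and so are the numbers. It only shows why a budget expressed in both
IOPS and bandwidth behaves differently from a bandwidth-only one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IOChannel:
    """Hypothetical description of one IO channel (a tablespace, or WAL)."""
    name: str
    capacity_iops: float   # what the admin says the channel can sustain
    capacity_mbps: float
    buffer_iops: float     # headroom reserved for foreground work
    buffer_mbps: float

    def background_budget(self):
        """Return (iops, mbps) left for background/bulk tasks: capacity - buffer."""
        return (max(self.capacity_iops - self.buffer_iops, 0.0),
                max(self.capacity_mbps - self.buffer_mbps, 0.0))


def min_batch_seconds(channel, ops, mbytes):
    """Shortest time a background batch may take without exceeding either limit."""
    iops, mbps = channel.background_budget()
    if iops <= 0 or mbps <= 0:
        return float("inf")  # no room at all for background work
    # Whichever limit is tighter for this batch decides the pacing.
    return max(ops / iops, mbytes / mbps)


# Illustrative values only.
wal = IOChannel("wal", capacity_iops=10000, capacity_mbps=200,
                buffer_iops=4000, buffer_mbps=50)

# Large sequential batch: bandwidth is the binding limit.
sequential = min_batch_seconds(wal, ops=3000, mbytes=300)

# Many small 8kB writes: IOPS is the binding limit, which a
# bandwidth-only setting would not see.
small_writes = min_batch_seconds(wal, ops=6000, mbytes=6000 * 8 / 1024)
```

With these made-up numbers, the small-write batch is paced by the IOPS
budget even though it uses under a third of the bandwidth budget, which
is the mismatch a bandwidth-only parameter would have to paper over by
being set lower than necessary.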

Thanks!

Stephen
