Re: WAL insert delay settings

From: Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL insert delay settings
Date: 2019-02-21 09:06:44
Message-ID: CA+CSw_t=_yZvJ4m7+O8Mcp2z8FC5q_O3On=TMPeAUO7Mf2XEbg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 21, 2019 at 2:20 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > On 2019-02-20 18:46:09 -0500, Stephen Frost wrote:
> > > * Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
> > > > On 2/20/19 10:43 PM, Stephen Frost wrote:
> > > > > Just to share a few additional thoughts after pondering this for a
> > > > > while, but the comment Andres made up-thread really struck a chord- we
> > > > > don't necessarily want to throttle anything, what we'd really rather do
> > > > > is *prioritize* things, whereby foreground work (regular queries and
> > > > > such) have a higher priority than background/bulk work (VACUUM, REINDEX,
> > > > > etc) but otherwise we use the system to its full capacity. We don't
> > > > > actually want to throttle a VACUUM run any more than a CREATE INDEX, we
> > > > > just don't want those to hurt the performance of regular queries that
> > > > > are happening.
> > > >
> > > > I think you're forgetting the motivation of this very patch was to
> > > > prevent replication lag caused by a command generating large amounts of
> > > > WAL (like CREATE INDEX / ALTER TABLE etc.). That has almost nothing to
> > > > do with prioritization or foreground/background split.
> > > >
> > > > I'm not arguing against ability to prioritize stuff, but I disagree it
> > > > somehow replaces throttling.
> > >
> > > Why is replication lag an issue though? I would contend it's an issue
> > > because with sync replication, it makes foreground processes wait, and
> > > with async replication, it makes the actions of foreground processes
> > > show up late on the replicas.
> >
> > I think reaching the bandwidth limit of either the replication stream,
> > or of the startup process is actually more common than these. And for
> > that prioritization doesn't help, unless it somehow reduces the total
> > amount of WAL.
>
> The issue with hitting those bandwidth limits is that you end up with
> queues outside of your control and therefore are unable to prioritize
> the data going through them. I agree, that's an issue and it might be
> necessary to ask the admin to provide what the bandwidth limit is, so
> that we could then avoid running into issues with downstream queues that
> are outside of our control causing unexpected/unacceptable lag.
>

If there is a global rate limit on WAL throughput, it could be adjusted by a
control loop that measures replication queue length and/or apply delay. I
don't see any sane way to tune a per-command rate limit or, even worse, a
cost-delay parameter. It would have the same problems as work_mem
settings.
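
To make the control loop idea concrete, here is a minimal sketch in Python.
It is only an illustration: the class name, the parameters, and the choice of
an additive-increase/multiplicative-decrease policy are my assumptions, not
anything that exists in PostgreSQL or in the patch. The loop is fed one
measurement of apply delay per tick and returns the new global limit.

```python
from dataclasses import dataclass


@dataclass
class WalRateController:
    """Hypothetical AIMD controller for a global WAL rate limit (bytes/s)."""

    rate: float                    # current global limit, bytes per second
    min_rate: float                # never throttle below this
    max_rate: float                # never allow more than this
    target_lag: float              # acceptable apply delay, in seconds
    increase: float = 1_000_000.0  # additive step while lag is acceptable
    backoff: float = 0.5           # multiplicative cut when lag is too high

    def update(self, measured_lag: float) -> float:
        """Adjust the limit from one lag sample and return the new limit."""
        if measured_lag > self.target_lag:
            # Replicas are falling behind: cut the allowed rate sharply.
            self.rate *= self.backoff
        else:
            # Lag is within target: probe for more throughput slowly.
            self.rate += self.increase
        # Keep the limit inside the configured bounds.
        self.rate = max(self.min_rate, min(self.max_rate, self.rate))
        return self.rate
```

The point of the sketch is that the admin only sets a lag target and bounds;
the rate itself is derived from measurements instead of being tuned by hand.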

A rate limit in front of WAL insertion would allow allocating the
throughput between foreground and background tasks, and would even allow
priority inheritance to alleviate priority inversion due to locks.
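
A rough sketch of what such an allocation could look like, again in Python
and again with entirely hypothetical names (WalBudget, try_insert, the
"foreground"/"background" classes and the weights are all made up for
illustration). Each class gets a share of a per-interval byte budget, and a
backend that blocks a higher-priority waiter can be charged against the
waiter's share instead of its own.

```python
from typing import Dict, Optional


class WalBudget:
    """Hypothetical per-interval WAL byte budget split across classes."""

    def __init__(self, total_bytes: int, shares: Dict[str, float]):
        # shares are relative weights, e.g. {"foreground": 3, "background": 1}
        weight_sum = sum(shares.values())
        self.remaining = {
            cls: int(total_bytes * weight / weight_sum)
            for cls, weight in shares.items()
        }

    def try_insert(self, cls: str, nbytes: int,
                   inherited: Optional[str] = None) -> bool:
        """Charge nbytes to a class; return False if the caller must wait.

        `inherited` models priority inheritance: a background backend holding
        a lock that a foreground backend waits on is charged against the
        foreground share, so it can finish and release the lock.
        """
        charged = inherited or cls
        if self.remaining[charged] < nbytes:
            return False
        self.remaining[charged] -= nbytes
        return True
```

For example, with a budget of 1000 bytes and weights 3:1, a background task
that has used up its 250 bytes is refused, but is let through once it
inherits the foreground class from a waiter.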

There is also an implicit assumption here that a maintenance command is a
background task and a normal DML query is a foreground task. This is not
true in all cases: users may want to throttle transactions doing lots of
DML to keep synchronous commit latencies for smaller transactions within
reasonable limits.

As a wild idea for how to handle the throttling: what if, when all our WAL
insertion credits are used up, XLogInsert() sets InterruptPending and the
actual sleep is done inside ProcessInterrupts()?
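
A toy model of that flow, in Python rather than C so it can be run standalone.
Only the names XLogInsert, InterruptPending and ProcessInterrupts come from
the idea above; the Backend class, the throttle_pending flag, the credit
accounting and the refill policy are assumptions made for the sketch. The
property it tries to show is that the insert path only sets flags, and the
sleep happens later at a point where interrupts are processed.

```python
import time


class Backend:
    """Toy model of one backend; not PostgreSQL code."""

    def __init__(self, credits, refill, delay, sleep=time.sleep):
        self.credits = credits        # WAL bytes we may still insert
        self.refill = refill          # credits granted after a sleep (assumed)
        self.delay = delay            # seconds to sleep when out of credits
        self.sleep = sleep            # injectable so the model can be tested
        self.interrupt_pending = False
        self.throttle_pending = False  # hypothetical "why" flag

    def xlog_insert(self, nbytes):
        """Stand-in for XLogInsert(): account for the bytes, never sleep."""
        self.credits -= nbytes
        if self.credits <= 0:
            # Out of credits: request throttling, but do not block here.
            self.throttle_pending = True
            self.interrupt_pending = True

    def process_interrupts(self):
        """Stand-in for ProcessInterrupts(): do the deferred sleep."""
        if not self.interrupt_pending:
            return
        self.interrupt_pending = False
        if self.throttle_pending:
            self.throttle_pending = False
            self.sleep(self.delay)
            self.credits = self.refill
```

In this model a backend that inserts 60 bytes twice with 100 credits sets the
flag on the second insert and sleeps only on the next call to
process_interrupts().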

Regards,
Ants Aasma
