From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL insert delay settings
Date: 2019-02-15 20:02:45
Message-ID: b0f56c30-107a-9a8b-0198-6ca6c1da873e@2ndquadrant.com
Lists: pgsql-hackers


On 2/15/19 7:41 PM, Andres Freund wrote:
> Hi,
>
> On 2019-02-15 08:50:03 -0500, Stephen Frost wrote:
>> * Andres Freund (andres(at)anarazel(dot)de) wrote:
>>> On 2019-02-14 11:02:24 -0500, Stephen Frost wrote:
>>>> On Thu, Feb 14, 2019 at 10:15 Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>>>>> On 14/02/2019 11:03, Tomas Vondra wrote:
>>>>>> But if you add extra sleep() calls somewhere (say because there's also
>>>>>> limit on WAL throughput), it will affect how fast VACUUM works in
>>>>>> general. Yet it'll continue with the cost-based throttling, but it will
>>>>>> never reach the limits. Say you do another 20ms sleep somewhere.
>>>>>> Suddenly it means it only does 25 rounds/second, and the actual write
>>>>>> limit drops to 4 MB/s.
>>>>>
>>>>> I think at a first approximation, you probably don't want to add WAL
>>>>> delays to vacuum jobs, since they are already slowed down, so the rate
>>>>> of WAL they produce might not be your first problem. The problem is
>>>>> more things like CREATE INDEX CONCURRENTLY that run at full speed.
>>>>>
>>>>> That leads to an alternative idea of expanding the existing cost-based
>>>>> vacuum delay system to other commands.
>>>>>
>>>>> We could even enhance the cost system by taking WAL into account as an
>>>>> additional factor.
>>>>
>>>> This is really what I was thinking: let’s not have multiple independent
>>>> ways of slowing down maintenance and similar jobs to reduce their impact on
>>>> I/O to the heap and to WAL.
>>>
>>> I think that's a bad idea. Both because the current vacuum code is
>>> *terrible* if you desire higher rates, since neither CPU nor IO time
>>> is taken into account. And it's extremely hard to control. And it
>>> seems entirely valuable to be able to limit the amount of WAL generated
>>> for replication, but still try to get the rest of the work done as
>>> quickly as reasonably possible wrt local IO.
>>
>> I'm all for making improvements to the vacuum code and making it easier
>> to control.
>>
>> I don't buy off on the argument that there is some way to segregate the
>> local I/O question from the WAL when we're talking about these kinds of
>> operations (VACUUM, CREATE INDEX, CLUSTER, etc) on logged relations, nor
>> do I think we do our users a service by giving them independent knobs
>> for both that will undoubtably end up making it more difficult to
>> understand and control what's going on overall.
>>
>> Even here, it seems, you're arguing that the existing approach for
>> VACUUM is hard to control; wouldn't adding another set of knobs for
>> controlling the amount of WAL generated by VACUUM make that worse? I
>> have a hard time seeing how it wouldn't.
>
> I think it's because I see them as, often, having two largely
> independent use cases. If your goal is to avoid swamping replication
> with WAL, you don't necessarily care about also throttling VACUUM
> (or REINDEX, or CLUSTER, or ...)'s local IO. By forcing the two to be
> combined, you just make the whole feature less usable.
>

I agree with that.

> I think it'd not be insane to add two things:
> - WAL write rate limiting, independent of the vacuum stuff. It'd also be
> used by lots of other bulk commands (CREATE INDEX, ALTER TABLE
> rewrites, ...)
> - Account for WAL writes in the current vacuum costing logic, by
> accounting for it using a new cost parameter
>
> Then VACUUM would be throttled by the *minimum* of the two, which seems
> to make plenty sense to me, given the usecases.
>

Is it really the minimum? If you add another cost parameter to the vacuum
model, there's almost no chance of actually reaching the WAL limit,
because the budget (cost_limit) is shared with other stuff (local I/O).
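
To illustrate (a toy model I just sketched, not code from any patch):
with a single shared budget, WAL costs eat into the same cost_limit
that local I/O consumes, so neither resource ever reaches its own
limit. Sleeping when *either* of two independent budgets is exhausted
is what actually gives you the minimum of the two rates:

    #include <stdbool.h>

    typedef struct Throttle { double used; double limit; } Throttle;

    /* one shared budget: a single cost_limit covers both resources,
     * so WAL-heavy work also slows local I/O, and vice versa */
    static bool
    shared_needs_sleep(double io_cost, double wal_cost, Throttle *budget)
    {
        budget->used += io_cost + wal_cost;
        return budget->used >= budget->limit;
    }

    /* two independent budgets: sleep as soon as either resource hits
     * its cap, i.e. the effective rate is min(io rate, wal rate) */
    static bool
    min_needs_sleep(double io_cost, double wal_cost,
                    Throttle *io, Throttle *wal)
    {
        io->used += io_cost;
        wal->used += wal_cost;
        return io->used >= io->limit || wal->used >= wal->limit;
    }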

FWIW I do think the ability to throttle WAL is a useful feature; I just
don't want to shoot myself in the foot by making other things worse.

As you note, the existing VACUUM throttling is already hard to control;
this seems to make it even harder.
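
Just to put numbers on it (a back-of-the-envelope model matching the
4 MB/s example I gave upthread; the constants are illustrative, not
the actual GUC defaults):

    #include <stdio.h>

    /* toy model of the cost-based delay loop: each round performs
     * bytes_per_round worth of writes, then sleeps */
    int
    main(void)
    {
        double cost_delay_ms = 20.0;            /* per-round sleep */
        double bytes_per_round = 160.0 * 1024;  /* work per round */
        double extra_ms;

        for (extra_ms = 0.0; extra_ms <= 20.0; extra_ms += 20.0)
        {
            double rounds = 1000.0 / (cost_delay_ms + extra_ms);

            printf("extra sleep %2.0f ms: %2.0f rounds/s, ~%.1f MB/s\n",
                   extra_ms, rounds,
                   rounds * bytes_per_round / (1024.0 * 1024.0));
        }
        return 0;
    }

That prints ~7.8 MB/s with no extra sleep and ~3.9 MB/s once another
20 ms sleep appears in the loop: the configured cost limits don't
change, but the achievable rate silently halves.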

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
