Re: cost based vacuum (parallel)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: cost based vacuum (parallel)
Date: 2019-11-04 18:11:57
Message-ID: 20191104181157.fjn2o6h53gxpjuvy@alap3.anarazel.de

Hi,

On 2019-11-04 12:24:35 +0530, Amit Kapila wrote:
> For parallel vacuum [1], we were discussing what is the best way to
> divide the cost among parallel workers but we didn't get many inputs
> apart from people who are very actively involved in patch development.
> I feel that we need some more inputs before we finalize anything, so
> starting a new thread.
>
> The initial version of the patch has a very rudimentary way of doing
> it which means each parallel vacuum worker operates independently
> w.r.t. vacuum delay and cost.

Yea, that seems not ok for cases where vacuum delay is active.

There's also the question of when/why it is beneficial to use
parallelism when you're going to encounter IO limits in all likelihood.

> This will lead to more I/O in the system
> than the user intended. Assume that the overall I/O allowed for a
> vacuum operation is X, after which it will sleep for some time,
> reset the balance, and continue. In the patch, each worker will be
> allowed to perform X before it needs to sleep, and there is no
> coordination for this with the master backend, which would have done
> some I/O for the heap. So, in the worst-case scenario, there can be n
> times more I/O where n is the number of workers doing the parallel
> operation. This is somewhat similar to a memory usage problem with a
> parallel query where each worker is allowed to use up to work_mem of
> memory. We can say that users of parallel operations can expect more
> system resources to be used because they want to get the operation
> done faster, so we are fine with this. However, I am not sure if that
> is the right thing, so we should try to come up with some solution for
> it and if the solution is too complex, then probably we can think of
> documenting such behavior.

I mean, for parallel query the problem wasn't really introduced by
parallel query; it existed before - and still does - for non-parallel
queries. And there's a complex underlying planning issue. I don't think
this is a good excuse for VACUUM, where none of the complex "number of
paths considered" issues etc. apply.
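
For reference, the mechanism in question is the existing cost-based
delay in a single backend: each page access charges a cost against
VacuumCostBalance, and once that reaches VacuumCostLimit the backend
naps for roughly vacuum_cost_delay before resetting the balance. A
rough standalone sketch, modeled loosely on vacuum_delay_point() (a
simplified illustration, not the actual server code):

/*
 * Simplified model of the single-backend cost-based delay; the names
 * mirror the real GUCs/globals, but this is a standalone illustration.
 */
#include <stdio.h>
#include <unistd.h>

static int    VacuumCostBalance = 0;     /* cost accumulated so far */
static int    VacuumCostLimit   = 200;   /* vacuum_cost_limit */
static double VacuumCostDelay   = 20.0;  /* vacuum_cost_delay, in ms */

static void
vacuum_delay_point_sketch(void)
{
	if (VacuumCostBalance >= VacuumCostLimit)
	{
		/* nap for a duration proportional to how far the limit was overshot */
		double		msec = VacuumCostDelay * VacuumCostBalance / VacuumCostLimit;

		if (msec > VacuumCostDelay * 4)
			msec = VacuumCostDelay * 4;

		usleep((useconds_t) (msec * 1000));
		VacuumCostBalance = 0;
	}
}

int
main(void)
{
	/* pretend to vacuum 1000 pages, charging 10 cost units per page */
	for (int page = 0; page < 1000; page++)
	{
		VacuumCostBalance += 10;
		vacuum_delay_point_sketch();
	}
	printf("done\n");
	return 0;
}

With several uncoordinated workers each running that loop with their own
private balance, each of them gets the full X before sleeping, which is
the n-fold overshoot described above.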

> The two approaches to solve this problem being discussed in that
> thread [1] are as follows:
> (a) Allow the parallel workers and the master backend to have a
> shared view of vacuum-cost-related parameters (mainly
> VacuumCostBalance) and allow each worker to update it and then, based
> on that, decide whether it needs to sleep. Sawada-San has done a POC
> for this approach. See v32-0004-PoC-shared-vacuum-cost-balance in
> email [2]. One drawback of this approach could be that we allow a
> worker to sleep even though the I/O has been performed by some other
> worker.

I don't understand this drawback.
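
For concreteness, a minimal sketch of how (a) could work, assuming a
counter in shared memory updated with an atomic fetch-add; the names
are hypothetical and the actual v32-0004 PoC may well differ:

/*
 * Sketch of approach (a): leader and workers charge cost against one
 * shared balance. shared_cost_balance is a plain static here only to
 * keep the sketch self-contained; in reality it would live in DSM.
 */
#include <stdatomic.h>
#include <unistd.h>

static _Atomic int shared_cost_balance = 0;
static int    vacuum_cost_limit = 200;
static double vacuum_cost_delay_ms = 20.0;

static void
charge_cost_and_maybe_sleep(int page_cost)
{
	int		balance;

	/* every participant (leader or worker) adds its cost to the shared total */
	balance = atomic_fetch_add(&shared_cost_balance, page_cost) + page_cost;

	if (balance >= vacuum_cost_limit)
	{
		/* the participant that crosses the limit sleeps and resets the balance */
		usleep((useconds_t) (vacuum_cost_delay_ms * 1000));
		atomic_store(&shared_cost_balance, 0);
	}
}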

> (b) The other idea could be that we split the I/O among workers,
> similar to what we do for autovacuum workers (see
> autovac_balance_cost). The basic idea would be that before launching
> workers, we need to compute the remaining I/O budget (the heap
> operation would already have used some of it) after which we need to
> sleep, and split it equally across workers. Here, we are primarily
> thinking of dividing the VacuumCostBalance and VacuumCostLimit
> parameters. Once the workers are finished, they need to let the
> master backend know how much I/O they have consumed, and then the
> master backend can add it to its current I/O consumed. I think we
> also need to rebalance the cost of the remaining workers once some
> of the workers exit. Dilip has prepared a POC patch for this; see
> 0002-POC-divide-vacuum-cost-limit in email [3].

(b) doesn't strike me as advantageous. It seems quite possible that you
end up with one worker that has a lot more IO than the others, leading
to unnecessary sleeps, even though the actually available IO budget has
not been used up. It's quite easy to see how that'd lead to parallel
VACUUM having lower throughput than a single-threaded one.
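
For illustration, a rough sketch of the dividing/rebalancing step
described in (b), using hypothetical structure and field names (the
actual 0002-POC patch may look quite different):

/*
 * Sketch of the dividing/rebalancing step in approach (b): split the
 * cost limit and the current balance across the active participants;
 * call this again whenever a worker exits.
 */
#include <stdbool.h>

typedef struct WorkerCostShare
{
	bool	active;
	int		cost_limit;		/* this participant's share of vacuum_cost_limit */
	int		cost_balance;	/* this participant's share of the current balance */
} WorkerCostShare;

static void
divide_vacuum_cost(WorkerCostShare *participants, int nparticipants,
				   int total_limit, int total_balance)
{
	int		nactive = 0;

	for (int i = 0; i < nparticipants; i++)
	{
		if (participants[i].active)
			nactive++;
	}

	if (nactive == 0)
		return;

	for (int i = 0; i < nparticipants; i++)
	{
		if (!participants[i].active)
			continue;
		participants[i].cost_limit = total_limit / nactive;
		participants[i].cost_balance = total_balance / nactive;
	}
}

Since each participant then checks only its own, smaller cost_limit, a
worker that happens to draw the I/O-heavy portion of the work reaches
its share early and sleeps even while the aggregate budget is still
unspent, which is the imbalance described above.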

Greetings,

Andres Freund
