Re: cost based vacuum (parallel)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: cost based vacuum (parallel)
Date: 2019-11-15 02:53:49
Message-ID: CAA4eK1+uDgLwfnAhQWGpAe66D85PdkeBygZGVyX96+ovN1PbOg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 13, 2019 at 10:02 AM Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> I've done some tests while changing the shared buffer size, delays and
> number of workers. The overall results have a similar tendency to the
> results shared by Dilip and look reasonable to me.
>

Thanks, Sawada-san, for repeating the tests. I can see from your,
Dilip's, and Mahendra's testing that the delay is distributed according
to the I/O done by a particular worker, and that the total I/O is as
expected in various kinds of scenarios. So, I think this is the better
approach. Do you agree, or do you think we should still investigate
the other approach further?

I would like to summarize this approach. The basic idea for parallel
vacuum is to give the parallel workers and the master backend a
shared view of the vacuum cost related parameters (mainly
VacuumCostBalance), let each worker update it, and then based on
that decide whether it needs to sleep. With this basic idea, we
found that in some cases the throttling is not accurate, as explained
with an example in my email above [1] and then in the tests performed by
Dilip and others in the following emails (in short, the workers doing
more I/O can be throttled less). Then, as discussed in a later email
[2], we tried a way to avoid putting to sleep the workers that have
done little or no I/O compared to the other workers. This ensured that
workers who are doing more I/O got throttled more. The idea is to
allow a worker to sleep only if it has performed I/O above a
certain threshold and the overall balance is more than the cost_limit
set by the system. We then let the worker sleep in proportion
to the work done by it and reduce
VacuumSharedCostBalance by the amount consumed by that
worker. This scheme leads to the desired throttling: each worker is
throttled based on the work done by that individual worker.

We have tested this idea with various kinds of workloads, varying the
shared buffer size, the delays, and the number of workers. We have
also tried different numbers of indexes and workers. In all
the tests, we found that each worker is throttled in proportion to the
I/O done by that particular worker.

[1] - https://www.postgresql.org/message-id/CAA4eK1JvxBTWTPqHGx1X7in7j42ZYwuKOZUySzH3YMwTNRE-2Q%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1K9kCqLKbVA9KUuuarjj%2BsNYqrmf6UAFok5VTgZ8evWoA%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
