Re: cost based vacuum (parallel)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: cost based vacuum (parallel)
Date: 2019-11-06 06:44:42
Message-ID: CAA4eK1K9kCqLKbVA9KUuuarjj+sNYqrmf6UAFok5VTgZ8evWoA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 5, 2019 at 11:28 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Nov 4, 2019 at 11:42 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> >
> > > The two approaches to solve this problem being discussed in that
> > > thread [1] are as follows:
> > > (a) Allow the parallel workers and master backend to have a shared
> > > view of vacuum cost related parameters (mainly VacuumCostBalance) and
> > > allow each worker to update it and then based on that decide whether
> > > it needs to sleep. Sawada-San has done the POC for this approach.
> > > See v32-0004-PoC-shared-vacuum-cost-balance in email [2]. One
> > > drawback of this approach could be that we allow the worker to sleep
> > > even though the I/O has been performed by some other worker.
> >
> > I don't understand this drawback.
> >
>
> I think the problem could be that the system is not properly throttled
> when it is supposed to be. Let me try a simple example: say we
> have two workers, w-1 and w-2. The w-2 is primarily doing the I/O and
> w-1 is doing very little I/O, but unfortunately whenever w-1 checks it
> finds that cost_limit has been exceeded and it goes to sleep, while w-2
> still continues. Now in such a situation, even though we have made one
> of the workers sleep for the required time, ideally the worker which
> was doing the I/O should have slept. The aim is to make the system
> stop doing I/O whenever the limit is exceeded, so that might not work
> in the above situation.
>

One idea to fix this drawback is that if we somehow avoid letting the
workers that have done less or no I/O (compared to the other workers)
sleep, then we can to a good extent ensure that the workers doing more
I/O will be throttled more. What we can do is allow a worker to sleep
only if it has performed I/O above a certain threshold and the overall
balance is more than the cost_limit set by the system. Then we will
allow the worker to sleep in proportion to the work done by it and
reduce VacuumSharedCostBalance by the amount consumed by the current
worker. Something like:

if (VacuumSharedCostBalance >= VacuumCostLimit &&
    MyCostBalance > threshold * VacuumCostLimit / nworkers)
{
    VacuumSharedCostBalance -= MyCostBalance;
    Sleep(delay * MyCostBalance / VacuumSharedCostBalance);
}

Assume the threshold is 0.5; what that means is, if a worker has done
more than 50% of the work expected from it and the overall shared cost
balance has been exceeded, then we will consider that worker for
sleeping.

What do you guys think?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
