Re: [HACKERS] Block level parallel vacuum

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-10-25 02:41:45
Message-ID: CAA4eK1KkD-CFuYhy0pi+NSR6FUzWYO0gyt2ux7T39uL_Pc3-0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 25, 2019 at 7:37 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Oct 24, 2019 at 8:12 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Oct 24, 2019 at 3:21 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > I have come up with the POC for approach (a).
> > >
> > > The idea is
> > > 1) Before launching the worker divide the current VacuumCostBalance
> > > among workers so that workers start accumulating the balance from that
> > > point.
> > > 2) Also, divide the VacuumCostLimit among the workers.
> > > 3) Once the worker are done with the index vacuum, send back the
> > > remaining balance with the leader.
> > > 4) The leader will sum all the balances and add that to its current
> > > VacuumCostBalance. And start accumulating its balance from this
> > > point.
> > >
> > > I was trying to test how is the behaviour of the vacuum I/O limit, but
> > > I could not find an easy way to test that so I just put the tracepoint
> > > in the code and just checked that at what point we are giving the
> > > delay.
> > > I also printed the cost balance at various point to see that after how
> > > much I/O accumulation we are hitting the delay. Please feel free to
> > > suggest a better way to test this.
> > >
> > > I have printed these logs for parallel vacuum patch (v30) vs v(30) +
> > > patch for dividing i/o limit (attached with the mail)
> > >
> > > Note: Patch and the test results are attached.
> > >
> >
> > Thank you!
> >
> > For approach (a) the basic idea I've come up with is that we have a
> > shared balance value on DSM and each workers including the leader
> > process add its local balance value to it in vacuum_delay_point, and
> > then based on the shared value workers sleep. I'll submit that patch
> > with other updates.
> >
>
> I think it would be better if we can prepare the I/O balance patches
> on top of main patch and evaluate both approaches. We can test both
> the approaches and integrate the one which turned out to be good.
>

Just to add something to testing both approaches. I think we can
first come up with a way to compute the throttling vacuum does as
mentioned by me in one of the emails above [1] or in some other way.
I think Dilip is planning to give it a try and once we have that we
can evaluate both the patches. Some of the tests I have in mind are:
a. All indexes have an equal amount of deleted data.
b. indexes have an uneven amount of deleted data.
c. try with mix of indexes (btree, gin, gist, hash, etc..) on a table.

Feel free to add more tests.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BPeiFLdTuwrE6CvbNdx80E-O%3DZxCuWB2maREKFD-RaCA%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-10-25 02:46:36 Re: EXPLAIN BUFFERS and I/O timing accounting questions
Previous Message tsunakawa.takay@fujitsu.com 2019-10-25 02:11:55 RE: Fix of fake unlogged LSN initialization