Re: [HACKERS] Block level parallel vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Claudio Freire <klaussfreire(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2020-01-18 20:44:52
Message-ID: CAH2-WznCY7aQxw6_+1OmD-=b11YEAkqB+rwXcqhQQWVX7xwgPA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 17, 2020 at 1:18 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Thanks for doing this test again. In the attached patch, I have
> addressed all the comments and modified a few comments.

I am in favor of the general idea of parallel VACUUM that parallelizes
the processing of each index (I haven't looked at the patch, though).
I observed something during a recent benchmark of the deduplication
patch that seems like it might be relevant to parallel VACUUM. This
happened during a recreation of the original WARM benchmark, which is
described here:

https://www.postgresql.org/message-id/CABOikdMNy6yowA%2BwTGK9RVd8iw%2BCzqHeQSGpW7Yka_4RSZ_LOQ%40mail.gmail.com

(There is an extra pgbench_accounts index on abalance, plus 4 indexes
on large text columns with filler MD5 hashes, all of which are
random.)

On the master branch, I can clearly observe that the "filler" MD5
indexes are bloated to a degree that depends on the order in which
they were created (their pg_class OID order). These are all indexes that
become bloated purely due to "version churn" -- or what I like to call
"unnecessary" page splits. The keys used in each pgbench_accounts
logical row never change, except in the case of the extra abalance
index (the idea is to prevent all HOT updates without ever updating
most indexed columns). I noticed that pgb_a_filler1 is a bit less
bloated than pgb_a_filler2, which is a little less bloated than
pgb_a_filler3, which is a little less bloated than pgb_a_filler4. This
holds even after 4 hours, and even though the "shape" of each index is
identical.
This demonstrates an important general principle about vacuuming
indexes: timeliness can matter a lot.

In general, a big benefit of the deduplication patch is that it "buys
time" for VACUUM to run before "unnecessary" page splits can occur --
that is why the deduplication patch prevents *all* page splits in
these "filler" indexes, whereas on the master branch the filler
indexes are about 2x larger (the exact amount varies based on VACUUM
processing order, at least earlier on).

For tables with several indexes, giving each index its own VACUUM
worker process will prevent "unnecessary" page splits caused by
version churn, simply because VACUUM will start to clean each index
sooner than it would with serial processing (except for the "lucky"
first index, which is cleaned first either way). With parallel
processing there is no "lucky" first index that gets preferential
treatment -- presumably VACUUM will start processing each index at the
same time with this patch, making each index equally "lucky".
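
To make the timing argument concrete, here is a toy sketch in Python
(not taken from the patch). It assumes one worker per index with all
workers launched together, and the per-index cleanup times are
hypothetical numbers chosen for illustration, not measurements from
the benchmark. The function names are made up for this example.

```python
def serial_start_times(durations):
    """Time at which cleanup of each index begins when the indexes
    are vacuumed one after another, in list order."""
    starts, elapsed = [], 0.0
    for d in durations:
        starts.append(elapsed)
        elapsed += d
    return starts


def parallel_start_times(durations):
    """Assumption: one worker per index, all launched at time zero."""
    return [0.0] * len(durations)


# Hypothetical per-index cleanup times in minutes (not measured).
index_minutes = {
    "pgb_a_filler1": 10.0,
    "pgb_a_filler2": 10.0,
    "pgb_a_filler3": 10.0,
    "pgb_a_filler4": 10.0,
}

durations = list(index_minutes.values())
serial = serial_start_times(durations)
parallel = parallel_start_times(durations)

for name, s, p in zip(index_minutes, serial, parallel):
    print(f"{name}: serial start at {s:.0f} min, parallel start at {p:.0f} min")
```

Under these assumptions, the last index waits for the sum of all
earlier cleanup times in the serial case, while every index starts at
time zero in the parallel case. Only the first index sees no
difference.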

I think that there may even be a *complementary* effect with parallel
VACUUM, though I haven't tested that theory. Deduplication "buys time"
for VACUUM to run, while at the same time VACUUM takes less time to
show up and prevent "unnecessary" page splits. My guess is that these
two seemingly unrelated patches may actually address this "unnecessary
page split" problem from two completely different angles, with an
overall effect that is greater than the sum of its parts.

While the difference in size of each filler index on the master branch
wasn't that significant on its own, it's still interesting. The effect
is probably quite workload-dependent.

--
Peter Geoghegan
