Re: GUC for cleanup indexes threshold.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pg(at)bowt(dot)ie
Cc: klaussfreire(at)gmail(dot)com, andres(at)anarazel(dot)de, sawada(dot)mshk(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, david(at)pgmasters(dot)net, amit(dot)kapila16(at)gmail(dot)com, simon(at)2ndquadrant(dot)com, ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com, pgsql-hackers(at)postgresql(dot)org, kuntalghosh(dot)2007(at)gmail(dot)com
Subject: Re: GUC for cleanup indexes threshold.
Date: 2017-09-21 08:49:57
Message-ID: 20170921.174957.236914340.horiguchi.kyotaro@lab.ntt.co.jp
Lists: pgsql-hackers

Hi,

At Tue, 19 Sep 2017 16:55:38 -0700, Peter Geoghegan <pg(at)bowt(dot)ie> wrote in <CAH2-Wzn0-3zxGRp_qp1OaEXY7h1W0-W_VCFO0nDv0K_+kabyYQ(at)mail(dot)gmail(dot)com>
> On Tue, Sep 19, 2017 at 4:47 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> > Maybe this is looking at the problem from the wrong direction.
> >
> > Why can't the page be added to the FSM immediately and the check be
> > done at runtime when looking for a reusable page?
> >
> > Index FSMs currently store only 0 or 255, couldn't they store 128 for
> > half-recyclable pages and make the caller re-check reusability before
> > using it?
>
> No, because it's impossible for them to know whether or not the page
> that their index scan just landed on recycled just a second ago, or
> was like this since before their xact began/snapshot was acquired.
>
> For your reference, this RecentGlobalXmin interlock stuff is what
> Lanin & Shasha call "The Drain Technique" within "2.5 Freeing Empty
> Nodes". Seems pretty hard to do it any other way.
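For illustration, the interlock Peter describes boils down to a check like the following. This is a simplified, stand-alone sketch, not the real nbtree code: the type names are ad hoc, and real TransactionIdPrecedes() handles XID wraparound more carefully than this.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Simplified stand-in for PostgreSQL's TransactionIdPrecedes();
 * the real function special-cases permanent XIDs, but both use
 * modulo-2^32 arithmetic so comparisons survive wraparound. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Sketch of the recyclability test: a deleted page may be reused only
 * once the XID stamped on it at deletion precedes RecentGlobalXmin,
 * i.e. no transaction that could still land on the page via an
 * in-flight index scan remains running. */
static bool
page_is_recyclable(bool page_deleted, TransactionId page_xact,
                   TransactionId recent_global_xmin)
{
    return page_deleted && xid_precedes(page_xact, recent_global_xmin);
}
```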

Anyway(:p) the attached first patch is a PoC of the
cleanup-state-in-stats method; it works only for btree. Some
LOG-level debugging messages are included in the patch to show how
it works.

The following steps produce a not-recyclable page, but I'm not sure
they are general enough, and I couldn't generate half-dead pages.
The pg_sleep() calls in the following steps are inserted in order
to see the updated values in stats.

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a int);
CREATE INDEX ON t1 (a);
INSERT INTO t1 (SELECT a FROM generate_series(0, 800000) a);
DELETE FROM t1 WHERE a > 416700 AND a < 417250;
VACUUM t1;
DELETE FROM t1;
VACUUM t1; -- 1 (or wait for autovacuum)
select pg_sleep(1);
VACUUM t1; -- 2 (autovacuum doesn't work)
select pg_sleep(1);
VACUUM t1; -- 3 (ditto)

The following logs are emitted while the three VACUUMs are issued.

# VACUUM t1; -- 1 (or wait for autovacuum)
LOG: btvacuumscan(t1_a_idx) result: deleted = 2185, notrecyclable = 1, halfdead = 0, no_cleanup_needed = false
LOG: Vacuum cleanup of index t1_a_idx is NOT skipped
LOG: btvacuumcleanup on index t1_a_idx is skipped since bulkdelete has run just before.
# VACUUM t1; -- 2
LOG: Vacuum cleanup of index t1_a_idx is NOT skipped
LOG: btvacuumscan(t1_a_idx) result: deleted = 2192, notrecyclable = 0, halfdead = 0, no_cleanup_needed = true
# VACUUM t1; -- 3
LOG: Vacuum cleanup of index t1_a_idx is skipped

VACUUM #1 leaves an unrecyclable page and requests the next cleanup.
VACUUM #2 leaves no unrecyclable pages and inhibits the next cleanup.
VACUUM #3 (and every run after it) executes no vacuum cleanup.
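The decision reflected in the messages above can be sketched as a pure function; the names here are illustrative, not taken from the patch.

```c
#include <stdbool.h>

/* Illustrative sketch (hypothetical names, not from the patch): after
 * a btvacuumscan pass, another cleanup round is requested only while
 * some deleted page is not yet recyclable or half-dead pages remain. */
static bool
next_cleanup_needed(long notrecyclable, long halfdead)
{
    return notrecyclable > 0 || halfdead > 0;
}
```

With the counts from the logs above, VACUUM #1 (notrecyclable = 1) requests another round, while VACUUM #2 (all zero) does not.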

# I suppose it is a known issue that the cleanup cycles are not
# executed automatically unless new dead tuples are generated.

- Getting stats takes a very long time to fail during
  initdb. Since I couldn't find the right way to cope with this,
  I added a tentative function pgstat_live(), which checks whether
  the backend has a valid stats socket.

- The patch calls pg_stat_get_vac_cleanup_needed using
  DirectFunctionCall. It might be better to wrap it.

As a byproduct, this enables us to run extra autovacuum rounds for
index cleanup. With the second attached patch, autovacuum works as
follows.

DROP TABLE IF EXISTS t1;
CREATE TABLE t1 (a int);
CREATE INDEX ON t1 (a);
INSERT INTO t1 (SELECT a FROM generate_series(0, 800000) a);
DELETE FROM t1 WHERE a > 416700 AND a < 417250;
(autovacuum on t1 runs)
> LOG: btvacuumscan(t1_a_idx) result: deleted = 0, notrecyclable = 0, halfdead = 0, no_cleanup_needed = true
> LOG: Vacuum cleanup of index t1_a_idx is skipped
> LOG: automatic vacuum of table "postgres.public.t1": index scans: 1
DELETE FROM t1;
(autovacuum on t1 runs)
> LOG: btvacuumscan(t1_a_idx) result: deleted = 2185, notrecyclable = 1, halfdead = 0, no_cleanup_needed = false
> LOG: Vacuum cleanup of index t1_a_idx is NOT skipped
> LOG: btvacuumcleanup on index t1_a_idx is skipped since bulkdelete has run just before.
> LOG: automatic vacuum of table "postgres.public.t1": index scans: 1
(cleanup vacuum runs for t1 in the next autovac timing)
> LOG: Vacuum cleanup of index t1_a_idx is NOT skipped
> LOG: btvacuumscan(t1_a_idx) result: deleted = 2192, notrecyclable = 0, halfdead = 0, no_cleanup_needed = true
> LOG: automatic vacuum of table "postgres.public.t1": index scans: 0

Any suggestions are welcome.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
0001_vacindex_skip_cleanup_v0.patch text/x-patch 19.7 KB
0002_autovac_for_index_cleanup_v0.patch text/x-patch 2.7 KB
