From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Костя Кузнецов <chapaev28(at)ya(dot)ru>
Subject: Re: GiST VACUUM
Date: 2018-07-18 12:02:05
Lists: pgsql-hackers

On 17/07/18 21:41, Andrey Borodin wrote:
> I was checking WAL replay of new scheme to log page deletes and found
> a bug there (incorrect value of deleted downlink in WAL record).
> Here's fixed patch v10.
> Also I've added support to WAL identification for new record, done
> some improvements to comments and naming in data structures.


> + /*
> + * If this page was splitted after start of the VACUUM we have to
> + * revisit rightlink, if it points to block we already scanned.
> + * This is recursive revisit, should not be deep, but we check
> + * the possibility of stack overflow anyway.
> + */
> + if ((GistFollowRight(page) || startNSN < GistPageGetNSN(page)) &&
> + (opaque->rightlink != InvalidBlockNumber) && (opaque->rightlink < blkno))
> + {
> + gistbulkdeletephysicalcanpage(info, stats, callback, callback_state, opaque->rightlink, startNSN, graph);
> + }

In the corresponding B-tree code, we don't do actual recursion, but a 
hand-optimized "tail recursion", to avoid stack overflow if there are 
a lot of splits. I think we need to do something like that here, too. I 
don't think it's safe to assume that we have enough stack space for the 
recursion. You have a check_stack_depth() in the function, so you'll get 
an ERROR rather than a crash, but it sucks if VACUUM errors out because 
of that.

It's not cool to use up to 'maintenance_work_mem' of memory for holding 
the in-memory graph. VACUUM already uses up to that much memory to hold 
the list of dead TIDs, so we would use up to 2x maintenance_work_mem in 
total.

The only reason we still need the logical scan is to support page 
deletion, when there is not enough memory available. Can we get rid of 
that altogether? I think I'd prefer some other scheme to deal with that 
situation. For example, we could make a note, in memory, of the empty 
pages during the physical scan, and perform a second physical scan of 
the index to find their downlinks. Two full-index scans is not nice, but 
it's actually not that bad if it's done in physical order. And you could 
have some heuristics, e.g. only do it if more than 10% of the pages were 
empty.

Sorry to ask you to rewrite this again, but I think it would be better
to split this into two patches as follows:

1st patch: Scan the index in physical rather than logical order. No
attempt at deleting empty pages yet.

2nd patch: Add support for deleting empty pages.

I would be more comfortable reviewing and committing that first patch, 
which just switches to the physical-order scan.

- Heikki
