Re: GiST VACUUM

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Костя Кузнецов <chapaev28(at)ya(dot)ru>
Subject: Re: GiST VACUUM
Date: 2018-07-18 21:12:15
Message-ID: 66dffd3a-b9d5-65cf-15bf-615a83b44301@iki.fi
Lists: pgsql-hackers

On 18/07/18 21:27, Andrey Borodin wrote:
> Hi!
>
>> On 18 July 2018, at 16:02, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
>> wrote:
>>
>> , but I think it would be better to split this into two patches as
>> follows:
>>
>> 1st patch: Scan the index in physical rather than logical order. No
>> attempt at deleting empty pages yet.
>>
>> 2nd patch: Add support for deleting empty pages.
>>
>> I would be more comfortable reviewing and committing that first
>> patch, which just switches to doing physical-order scan, first.
>
> This seems a very disproportionate division of complexity. The first
> patch (PFA) is very simple. All the work is done in one pass, without
> remembering anything. Actually, you do not even need to rescan
> rightlinks: there can be no splits to the left when no pages are
> deleted.

Heh, good point.

I googled around and bumped into an older patch to do this:
https://www.postgresql.org/message-id/1135121410099068%40web30j.yandex.ru.
Unfortunately, Костя never got around to updating the patch, and it was
forgotten. But the idea seemed sound even back then.

As noted in that thread, there might be deleted pages in the index in
some rare circumstances, even though we don't recycle empty pages: if
the index was upgraded from a very old version, as VACUUM FULL used to
recycle empty pages, or if you crash just when extending the index and
end up with newly-initialized but unused pages that way. A concurrent
split can put its new right half on such a page, at a block number the
scan has already passed. So we do need to handle the concurrent split
scenario, even without empty page recycling.
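
To make the concurrent split scenario concrete, here is a toy model in
Python. It is only a sketch, not PostgreSQL code: the Page class, the
split() helper, the after_block hook and the integer "nsn" counter are
all made up for illustration. The assumption is that a page split after
the scan started carries an NSN greater than the one noted at scan
start, and that its rightlink may point to a reused block behind the
scan position. In that case the scan follows the rightlink (in a loop,
not by recursion) so that tuples moved there are not missed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Page:
    # Hypothetical toy page; real GiST pages look nothing like this.
    tuples: List[int] = field(default_factory=list)
    nsn: int = 0                     # stamp of the last split of this page
    rightlink: Optional[int] = None  # block number of the right sibling
    unused: bool = False             # deleted or never-initialized page


def split(pages: List[Page], blkno: int, new_blkno: int, nsn: int) -> None:
    """Toy split: move the upper half of blkno's tuples to new_blkno."""
    left, right = pages[blkno], pages[new_blkno]
    half = len(left.tuples) // 2
    right.tuples, left.tuples = left.tuples[half:], left.tuples[:half]
    right.unused = False
    # The right half inherits the old rightlink and NSN (toy assumption).
    right.rightlink, right.nsn = left.rightlink, left.nsn
    left.rightlink, left.nsn = new_blkno, nsn


def vacuum_physical_order(
    pages: List[Page],
    start_nsn: int,
    is_dead: Callable[[int], bool],
    after_block: Optional[Callable[[int], None]] = None,
) -> int:
    """Scan blocks in physical order and return the number of tuples removed."""
    removed = 0
    for blkno in range(len(pages)):
        cur: Optional[int] = blkno
        while cur is not None:
            page = pages[cur]
            nxt = None
            if not page.unused:
                before = len(page.tuples)
                page.tuples = [t for t in page.tuples if not is_dead(t)]
                removed += before - len(page.tuples)
                # Split after the scan began, and the right sibling sits at a
                # block we have already passed: go back and visit it.
                if (page.nsn > start_nsn
                        and page.rightlink is not None
                        and page.rightlink < blkno):
                    nxt = page.rightlink
            cur = nxt
        if after_block is not None:
            after_block(blkno)  # test hook to simulate concurrent activity
    return removed


def demo():
    # Block 0 is an unused page left over from a crash or an old version.
    pages = [Page(unused=True), Page(tuples=[1, 2, 3, 4]), Page(tuples=[5, 6])]

    def concurrent_split(blkno: int) -> None:
        # After the scan has passed block 0, block 1 splits into block 0.
        if blkno == 0:
            split(pages, 1, 0, nsn=11)

    removed = vacuum_physical_order(
        pages, start_nsn=10, is_dead=lambda t: t % 2 == 0,
        after_block=concurrent_split)
    return removed, [p.tuples for p in pages]
```

In demo(), even-numbered tuples stand in for dead ones. Without the
rightlink check, the dead tuple that moved to block 0 would survive the
scan, because block 0 had already been visited while it was still unused.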

> If you think it is the proper way to go - OK, I'll prepare
> a better version of the attached diff (by omitting tail recursion and
> adding more comments).

Yeah, please, I think this is the way to go.

- Heikki
