Re: GiST VACUUM

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Костя Кузнецов <chapaev28(at)ya(dot)ru>
Subject: Re: GiST VACUUM
Date: 2019-01-02 15:35:15
Message-ID: d7e89e1b-efd4-1464-f03d-4ba4ea9e1835@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 28/10/2018 19:32, Andrey Borodin wrote:
> Hi everyone!
>
>> 2 окт. 2018 г., в 6:14, Michael Paquier <michael(at)paquier(dot)xyz> написал(а):
>> Andrey, your latest patch does not apply. I am moving this to the next
>> CF, waiting for your input.
>
> I'm doing preps for CF.
> Here's rebased version.

Thanks, I had another look at these.

In patch #1, to do the vacuum scan in physical order:

* The starting NSN was not acquired correctly for unlogged and temp
relations. They don't use WAL, so their NSN values are based on the
'unloggedLSN' counter, rather than current WAL insert pointer. So must
use gistGetFakeLSN() rather than GetInsertRecPtr() for them. Fixed that.

* Adjusted comments a little bit, mostly by copy-pasting the
better-worded comments from the corresponding nbtree code, and ran pgindent.

I think this is ready to be committed, except that I didn't do any
testing. We discussed the need for testing earlier. Did you write some
test scripts for this, or how have you been testing?

Patch #2:

* Bitmapset stores 32 bit signed integers, but BlockNumber is unsigned.
So this will fail with an index larger than 2^31 blocks. That's perhaps
academical, I don't think anyone will try to create a 16 TB GiST index
any time soon. But it feels wrong to introduce an arbitrary limitation
like that.

* I was surprised that the bms_make_empty() function doesn't set the
'nwords' field to match the allocated size. Perhaps that was
intentional, so that you don't need to scan the empty region at the end,
when you scan through all matching bits? Still, seems surprising, and
needs a comment at least.

* We're now scanning all internal pages in the 2nd phase. Not only those
internal pages that contained downlinks to empty leaf pages. That's
probably OK, but the comments need updating on that.

* In general, comments need to be fixed and added in several places. For
example, there's no clear explanation on what the "delete XID", stored
in pd_prune_xid, means. (I know what it is, because I'm familiar with
the same mechanism in B-tree, but it's not clear from the code itself.)

These can be fixed, they're not show-stoppers, but patch #2 isn't quite
ready yet.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2019-01-02 15:36:33 Re: GiST VACUUM
Previous Message Mark 2019-01-02 15:28:02 Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2