Re: Yet another fast GiST build

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Yet another fast GiST build
Date: 2020-09-29 20:31:07
Message-ID: 282f7007-326a-f3d6-7394-0e5ea52df84f@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 29/09/2020 21:04, Andrey M. Borodin wrote:
> 28 сент. 2020 г., в 13:12, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> написал(а):
>> I did some testing with your patch. It seems that the rightlinks
>> are still not always set. I didn't try to debug why.
>
> Yes, there is a bug. Now it seems to me so obvious, yet it took some
> time to understand that links were shifted by one extra jump. PFA
> fixed rightlinks installation.
Ah, that was simple. I propose adding a comment on it:

--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -540,6 +540,19 @@ gist_indexsortbuild_pagestate_flush(GISTBuildState
*state,
/* Re-initialize the page buffer for next page on this level. */
pagestate->page = palloc(BLCKSZ);
gistinitpage(pagestate->page, isleaf ? F_LEAF : 0);
+
+ /*
+ * Set the right link to point to the previous page. This is just for
+ * debugging purposes: GiST only follows the right link if a page is split
+ * concurrently to a scan, and that cannot happen during index build.
+ *
+ * It's a bit counterintuitive that we set the right link on the new page
+ * to point to the previous page, and not the other way round. But GiST
+ * pages are not ordered like B-tree pages are, so as long as the
+ * right-links form a chain through all the pages in the same level, the
+ * order doesn't matter.
+ */
+ GistPageGetOpaque(pagestate->page)->rightlink = blkno;
}

> BTW some one more small thing: we initialise page buffers with
> palloc() and palloc0(), while first one is sufficient for
> gistinitpage().
Hmm. Only the first one, in gist_indexsortbuild(), but that needs to be
palloc0, because it's used to write an all-zeros placeholder for the
root page.

> Also I'm working on btree_gist opclasses and found out that new
> sortsupport function is not mentioned in gistadjustmembers(). I think
> this can be fixed with gist_btree patch.

Thanks!

> Your pageinspect patch seems very useful. How do you think, should we
> provide a way to find invalid tuples in GiST within
> gist_page_items()? At some point we will have to ask user to reindex
> GiSTs with invalid tuples.
You mean invalid tuples created by crash on PostgreSQL version 9.0 or
below, and pg_upgraded? I doubt there are any of those still around in
the wild. We have to keep the code to detect them, though.

It would be nice to improve gist_page_items() to display more
information about the items, although I wouldn't worry much about
invalid tuples. The 'gevel' extension that Oleg mentioned upthread does
more, it would be nice to incorporate that into pageinspect somehow.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2020-09-29 20:39:34 Re: Planner making bad choice in alternative subplan decision
Previous Message Tom Lane 2020-09-29 20:17:05 Re: Dumping/restoring fails on inherited generated column