Re: Yet another fast GiST build

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Yet another fast GiST build
Date: 2020-09-28 08:12:02
Message-ID: 4098005f-e4b5-5904-4b33-335dd106c132@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 21/09/2020 17:19, Andrey M. Borodin wrote:
>> 21 сент. 2020 г., в 18:29, Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> написал(а):
>>
>> It was a conscious decision with incorrect motivation. I was thinking that it will help to reduce number of "false positive" inspecting right pages. But now I see that:
>> 1. There should be no such "false positives" that we can avoid
>> 2. Valid rightlinks could help to do amcheck verification in future
>
> Well, point number 2 here is invalid. There exist one leaf page p, so that if we start traversing rightlink from p we will reach all leaf pages. But we practically have no means to find this page. This makes rightlinks not very helpful in amcheck for GiST.

Well, if you store all the right links in a hash table or something, you
can "connect the dots" after you have scanned all the pages to see that
the chain is unbroken. Probably would not be worth the trouble, since
the rightlinks are not actually needed after concurrent scans have
completed.

> But for consistency I think it worth to install them.

I agree. I did some testing with your patch. It seems that the
rightlinks are still not always set. I didn't try to debug why.

I wrote a couple of 'pageinspect' function to inspect GiST pages for
this. See attached. I then created a test table and index like this:

create table points (p point);
insert into points select point(x,y) from generate_series(-2000, 2000)
x, generate_series(-2000, 2000) y;
create index points_idx on points using gist (p);

And this is what the root page looks like:

postgres=# select * from gist_page_items(get_raw_page('points_idx', 0));
itemoffset | ctid | itemlen
------------+---------------+---------
1 | (27891,65535) | 40
2 | (55614,65535) | 40
3 | (83337,65535) | 40
4 | (97019,65535) | 40
(4 rows)

And the right links on the next level:

postgres=# select * from (VALUES (27891), (55614), (83337), (97019)) b
(blkno), lateral gist_page_opaque_info(get_raw_page('points_idx', blkno));
blkno | lsn | nsn | rightlink | flags
-------+-----+-----+------------+-------
27891 | 0/1 | 0/0 | 4294967295 | {}
55614 | 0/1 | 0/0 | 4294967295 | {}
83337 | 0/1 | 0/0 | 27891 | {}
97019 | 0/1 | 0/0 | 55614 | {}
(4 rows)

I expected there to be only one page with invalid right link, but there
are two.

- Heikki

Attachment Content-Type Size
0001-Add-functions-to-pageinspect-to-inspect-GiST-indexes.patch text/x-patch 7.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-09-28 08:36:56 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message tsunakawa.takay@fujitsu.com 2020-09-28 08:07:40 RE: [Patch] Optimize dropping of relation buffers using dlist