Re: Yet another fast GiST build

From: Emre Hasegeli <emre(at)hasegeli(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
Subject: Re: Yet another fast GiST build
Date: 2021-07-05 06:27:50
Message-ID: CAE2gYzywrGtuhMcx06L+NthXtNncz_8nfOS15_OUCKPtqjs5yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I tried reviewing the remaining patches. It seems to work correctly,
and passes the tests on my laptop.

> In this pattern I flipped PointerGetDatum(a) to PointerGetDatum(ra.lower), because it seems to me correct. I've followed rule of thumb: every sort function must extract and use "lower" somehow. Though I suspect numeric a bit. Is it regular varlena?

As far as I understand, we cannot use the sortsupport functions from
the btree operator classes because the btree_gist extension handles
things differently. This is unfortunate and a source of bugs [1], but
we cannot do anything about it.

Given that the lower and upper datums must be the same for the leaf
nodes, it makes sense to me to compare one of them.

Using numeric_cmp() for numeric in line with using bttextcmp() for text.

> + /*
> + * Numeric has abbreviation routines in numeric.c, but we don't try to use
> + * them here. Maybe later.
> + */

This is also true for text. Perhaps we should also add a comment there.

> PFA patchset with v6 intact + two fixes of discovered issues.

> + /* Use byteacmp(), like gbt_bitcmp() does */

We can improve this comment by incorporating Heikki's previous email:

> Ok, I think I understand that now. In btree_gist, the *_cmp() function
> operates on non-leaf values, and *_lt(), *_gt() et al operate on leaf
> values. For all other datatypes, the leaf and non-leaf representation is
> the same, but for bit/varbit, the non-leaf representation is different.
> The leaf representation is VarBit, and non-leaf is just the bits without
> the 'bit_len' field. That's why it is indeed correct for gbt_bitcmp() to
> just use byteacmp(), whereas gbt_bitlt() et al compares the 'bit_len'
> field separately. That's subtle, and 100% uncommented.

I think patch number 3 should be squashed to patch number 1.

I couldn't understand patch number 2 "Remove DEBUG1 verification". It
seems like something rather useful.

[1] https://www.postgresql.org/message-id/flat/201010112055.o9BKtZf7011251%40wwwmaster.postgresql.org

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiro Ikeda 2021-07-05 06:28:58 Re: Transactions involving multiple postgres foreign servers, take 2
Previous Message Bharath Rupireddy 2021-07-05 06:18:17 Re: "debug_invalidate_system_caches_always" is too long