From: | Arseniy Mukhin <arseniy(dot)mukhin(dot)dev(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru> |
Subject: | Re: amcheck support for BRIN indexes |
Date: | 2025-07-05 20:19:35 |
Message-ID: | CAE7r3MKciyBRjcWz0Wpi1ShU8pbHxEoUTj6FQYUdMpNdAbrCrw@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi!
On Wed, Jun 18, 2025 at 11:33 AM Arseniy Mukhin
<arseniy(dot)mukhin(dot)dev(at)gmail(dot)com> wrote:
> ...
> On Mon, Jun 16, 2025 at 8:11 PM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> ...
> > If it's not very difficult - it would be great to use read_stream infrastructure. See btvacuumscan() in nbtree.c calling read_stream_begin_relation() for example. We cannot use it in logical scans in B-tree\GiST\GIN, but maybe BRIN can take some advantage of this new shiny technology.
>
> Thanks, I will look into it.
You are right, it was not very difficult to replace index sequential
scans with read_streams. Hope I picked correct stream_flag values.
Thank you!
On Sun, Jun 22, 2025 at 12:55 AM Arseniy Mukhin
<arseniy(dot)mukhin(dot)dev(at)gmail(dot)com> wrote:
> ...
> 1) We can use the addValue() function. It returns FALSE if range data
> was not changed (in other words range data already covers heap tuple
> data that we pass to the function). It's very easy to do, we can use
> heap tuples directly. But the problem I see here is that addValue()
> can return TRUE even if heap tuple data have been already covered by
> the range, but range data itself changed for some reason (some new
> algorithm were applied for instance). So I think we can have some
> false positives that we can do nothing about.
>
And yes, it's not an option really. It turned out that minmax_multi
can return true from addValue() even if it already contains the value.
So we can drop this option.
> ...
> 3) The approach that seems to me the most clear and straightforward:
> to add new optional function to BRIN opclass API. The function that
> would say if passed value is covered with the current range data. it's
> exactly what we want to know, so we can use heap data directly here.
> Something like that:
>
> bool withinRange(BrinDesc *bdesc, BrinValues *column, Datum val, bool isnull)
>
> It could be an optional function that will be implemented for all core
> BRIN opclasses. So if somebody wants to use it for some custom opclass
> they will need to implement it too, but it's not required. I
> understand that adding something to the index opclass API requires
> very strong arguments. So the argument here is that it will let to do
> brin check very robust (without possible false positives as in the
> first approach) and easy to use (no additional parameters in the check
> function). Also, the withinRange() function could be written in such a
> way that it would be more efficient for our task than addValue() or
> consistent().
I decided to give it a try and implement such a support function. It
was not very difficult since all necessary logic already exists in
addValue() and consistent() functions for all core operator classes.
The main doubt about this approach: we add something to the core just
to use it in the contrib module. But the logic of this method is very
common with what we already have there and probably it is not possible
to implement it outside of the core, because you need all opclass
internals etc.
So there is a new version. I renamed 'heap all consistent' -> 'heap
all indexed', as btree amcheck calls it. I think there is not much
point in using another name here. There are two new files:
0004 - adds new BRIN support function (withinRange).
0005 - migrate 'heap all indexed' from using consistent function to
new withinRange function.
Patch 0003 still has the old 'heap all indexed' implementation that
uses a consistent function (2nd approach). So If you want to have
'heap all indexed' using a consistent function - don't apply 0004 and
0005 patches.
Best regards,
Arseniy Mukhin
Attachment | Content-Type | Size |
---|---|---|
v5-0002-amcheck-brin_index_check-index-structure-check.patch | text/x-patch | 46.9 KB |
v5-0005-using-withinRange-function-for-heap-all-indexed-c.patch | text/x-patch | 17.0 KB |
v5-0003-amcheck-brin_index_check-heap-all-indexed.patch | text/x-patch | 29.7 KB |
v5-0001-brin-refactoring.patch | text/x-patch | 3.0 KB |
v5-0004-Adds-new-BRIN-support-function-withinRange.patch | text/x-patch | 52.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tatsuo Ishii | 2025-07-05 22:07:53 | Re: NegotiateProtocolVersion description |
Previous Message | Tom Lane | 2025-07-05 18:00:07 | Re: A assert failure when initdb with track_commit_timestamp=on |