Re: amcheck support for BRIN indexes

From: Arseniy Mukhin <arseniy(dot)mukhin(dot)dev(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: amcheck support for BRIN indexes
Date: 2025-06-18 08:33:38
Message-ID: CAE7r3MJh4qD3SE5ZvBhz-Ae8F7C=K5ZtzyxZQcbEvV5RZwmRRg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 9, 2025 at 1:16 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> On 6/8/25 14:39, Arseniy Mukhin wrote:
> > Hi,
> >
> > Here is a new version.
> >
> > TAP tests were added. Tried to reproduce more or less every violation.
> > I don't include 2 tests where disk representation ItemIdData needs to
> > be changed: it works locally, but I don't think these tests are
> > portable. While writing tests some minor issues were found and fixed.
> > Also ci compiler warnings were fixed.
> >
>
> Thanks. I've added myself as a reviewer, so that I don't forget about
> this for the next CF.
>

Thank you, looking forward to hearing your thoughts.

On Mon, Jun 16, 2025 at 8:11 PM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> Hi!
>
> Nice feature!
>

Hi Andrey, thank you for your interest in the patch!

> > On 8 Jun 2025, at 17:39, Arseniy Mukhin <arseniy(dot)mukhin(dot)dev(at)gmail(dot)com> wrote:
> >
> > <v2-0001-brin-refactoring.patch>
>
>
> +#define BRIN_MAX_PAGES_PER_RANGE 131072
>
> I'd personally prefer some words on where does this limit come from. I'm not a BRIN pro, just curious.
> Or, at least, maybe we can use a form 128 * 1024, if it's just a sane limit.
>

Actually I don't know where this value came from, this limit already
exists in reloptions.c, so patch just creates macros so that it can be
reused in the check without duplication.

>
> > On 8 Jun 2025, at 17:39, Arseniy Mukhin <arseniy(dot)mukhin(dot)dev(at)gmail(dot)com> wrote:
> > <v2-0002-amcheck-brin-support.patch>
>
>
> A bit more detailed commit message would be very useful.
>

Agree, it was too concise. Added commit messages.

> The test coverage is very decent. The number of possible corruptions in tests is impressive. I don't really have an experience with BRIN to say how different they are, but I want to ask if you are sure that these types of corruption will be portable across big-endian machines and such stuff.
>

Yeah, it seems that there are a lot of things that can go wrong with
the test's portability. What I think can be done to make it more
robust:
- using regular expressions to find places we want to corrupt instead
of concrete offsets. This way we avoid problems with different
alignments for 32 bit and 64 bit systems.
- using perl pack() function, so it uses the endianness of the system
where you run tests. This helps to avoid problems with different
endianness.
- don't touch things like varlena len that have different values on
different machines.
And if some test turned out to be not portable we can drop it, but at
least we would know that the code works, so it also would not be a
useless effort.

> Copyright year in all new files should be 2025.

Fixed.

>
> Some documentation about brin_index_check() would be handy, especially about its parameters. Perhaps, somewhere near gin_index_check() in amcheck.sgml.

Thanks, I'm gonna do it soon.

>
> brin_check_ereport() reports ERRCODE_DATA_CORRUPTED. We should distinguish between ERRCODE_INDEX_CORRUPTED which is a stormbringer and ERRCODE_DATA_CORRUPTED which is an actual disaster.

Fixed. Interesting, I used btree check as reference when started
writing brin check, and in btree check there 53
ERRCODE_INDEX_CORRUPTED ereports and only 1 ERRCODE_DATA_CORRUPTED
ereport. So it was very hard to do, but I managed to pick the wrong
one. I wonder if this btree check ereport should also be changed to
ERRCODE_INDEX_CORRUPTED?

> If it's not very difficult - it would be great to use read_stream infrastructure. See btvacuumscan() in nbtree.c calling read_stream_begin_relation() for example. We cannot use it in logical scans in B-tree\GiST\GIN, but maybe BRIN can take some advantage of this new shiny technology.

Thanks, I will look into it.

>
> + state->revmapbuf = ReadBufferExtended(idxrel, MAIN_FORKNUM, state->revmapBlk, RBM_NORMAL,
> + state->checkstrategy);
> + LockBuffer(state->revmapbuf, BUFFER_LOCK_SHARE);
> // usage of state->revmapbuf
> + LockBuffer(state->revmapbuf, BUFFER_LOCK_UNLOCK);
> // more usage of state->revmapbuf
> + ReleaseBuffer(state->revmapbuf);
>
> I hope you know what you are doing. BRIN concurrency is not known to me at all.
>

It seems alright, here we lock the revmap page again inside
check_revmap_item() for every revmap item.

>
> That's all for first pass through patches. Thanks for working on it!

Thank you for the review!

Here is a new version of the patch.
It's rebased and It fixes points from Andrey's review. Two tests fail
on the 32 bit build, a new version should fix it. Also the
'all_heap_consistent' part was moved to a separate patch, it's about
400 lines of code, so I hope it is more reviewable now.
I tried the patch with postgis extension brin op classes and found a
small bug, the new version fixes it too.

Best regards,
Arseniy Mukhin

Attachment Content-Type Size
v3-0001-brin-refactoring.patch text/x-patch 3.0 KB
v3-0002-amcheck-brin_index_check-index-structure-check.patch text/x-patch 45.5 KB
v3-0003-amcheck-brin_index_check-all-heap-consistent.patch text/x-patch 28.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2025-06-18 08:53:21 Re: pg_dump misses comments on NOT NULL constraints
Previous Message Frédéric Yhuel 2025-06-18 08:30:57 Re: [BUG] temporary file usage report with extended protocol and unnamed portals