Re: Parallel CREATE INDEX for BRIN indexes

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Alexander LAW <1(at)1o(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel CREATE INDEX for BRIN indexes
Date: 2024-04-15 08:18:46
Message-ID: 4deb85e5-d48f-4935-ae89-0742e0aaa04b@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/15/24 08:00, Alexander LAW wrote:
> Hello Tomas,
>
> 14.04.2024 20:09, Tomas Vondra wrote:
>> I've pushed this, including backpatching the two fixes. I've reduced the
>> amount of data needed by the test, and made sure it works on 32-bits too
>> (I was a bit worried it might be sensitive to that, but that seems not
>> to be the case).
>
> I've discovered that that test addition brings some instability to the
> test.
> With the following pageinspect/Makefile modification:
> -REGRESS = page btree brin gin gist hash checksum oldextversions
> +REGRESS = page btree brin $(shell printf 'brin %.0s' `seq 99`) gin gist
> hash checksum oldextversions
>
> echo "autovacuum_naptime = 1" > /tmp/temp.config
> TEMP_CONFIG=/tmp/temp.config make -s check -C contrib/pageinspect
> fails for me as below:
> ...
> ok 17        - brin                                      127 ms
> not ok 18    - brin                                      140 ms
> ok 19        - brin                                      125 ms
> ...
> # 4 of 107 tests failed.
>
> The following change
> -CREATE TABLE brin_parallel_test (a int, b text, c bigint) WITH
> (fillfactor=40);
> +CREATE TEMP TABLE brin_parallel_test (a int, b text, c bigint) WITH
> (fillfactor=40);
> (similar to e2933a6e1) makes the test pass reliably for me.
>

Thanks! This reproduces the issue for me.

I believe this happens because the test does "DELETE + VACUUM" to
generate "gaps" in the table, to get empty ranges in the BRIN. I guess
what's happening is that something (autovacuum or likely something else)
blocks the explicit VACUUM from cleaning some of the pages with deleted
tuples, but then the cleanup happens shortly after between building the
the serial/parallel indexes. That would explain the differences reported
by the regression test.

When I thought about this while writing the test, my reasoning was that
even if the explicit vacuum occasionally fails to clean something, it
should affect all the indexes equally. Which is why I wrote the test to
compare the results using EXCEPT, not checking the exact output.

I'm not a huge fan of temporary tables in regression tests, because it
disappears at the end, making it impossible to inspect the data after a
failure. But the only other option I could think of is disabling
autovacuum on the table, but that does not seem to prevent the failures.

I'll try a bit more to make this work without the temp table.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nazir Bilal Yavuz 2024-04-15 08:25:05 Re: gcc 12.1.0 warning
Previous Message Alexander Lakhin 2024-04-15 08:00:00 Re: Parallel CREATE INDEX for BRIN indexes