Re: REINDEX backend filtering

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: REINDEX backend filtering
Date: 2021-03-15 18:32:41
Message-ID: E911A80F-3CFB-4B94-8CBD-FFA0E6DA4FC1@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Mar 15, 2021, at 11:10 AM, Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Mon, Mar 15, 2021 at 10:56:50AM -0700, Mark Dilger wrote:
>>
>>
>>> On Mar 15, 2021, at 10:50 AM, Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>>>
>>> On Mon, Mar 15, 2021 at 10:40:25AM -0700, Mark Dilger wrote:
>>>> I'm saying that your patch seems to call down to get_collation_actual_version() via get_collation_version_for_oid() from your new function do_check_index_has_outdated_collation(), but I'm not seeing how that gets exercised.
>>>
>>> It's a little bit late here so sorry if I'm missing something.
>>>
>>> do_check_index_has_outdated_collation() is called from
>>> index_has_outdated_collation() which is called from
>>> index_has_outdated_dependency() which is called from
>>> RelationGetIndexListFiltered(), and that function is called when passing the
>>> OUTDATED option to REINDEX (and reindexdb --outdated). So this is exercised
>>> with added tests for both matching and non matching collation version.
>>
>> Ok, fair enough. I was thinking about the case where the collation actually returns a different version number because it (the C library providing the collation) got updated, but I think you've answered already that you are not planning to test that case, only the case where pg_depend is modified to have a bogus version number.
>
> This infrastructure is supposed to detect that the collation library *used to*
> return a different version before it was updated. And that's exactly what
> we're testing by manually updating the refobjversion.
>
>> It seems a bit odd to me that a feature intended to handle cases where collations are updated is not tested via having a collation be updated during the test. It leaves open the possibility that something differs between the test and reindexed run after real world collation updates. But that's for the committer who picks up your patch to decide, and perhaps it is unfair to make your patch depend on addressing that issue.
>
> Why is that odd? We're testing that we're correctly storing the collation
> version during index creating and correctly detecting a mismatch. Having a
> fake collation provider to return a fake version number won't add any more
> coverage unless I'm missing something.
>
> It's similar to how we test the various corruption scenario. AFAIK we're not
> providing custom drivers to write corrupted data but we're simply simulating a
> corruption overwriting some blocks.

We do test corrupt relations. We intentionally corrupt the pages within corrupted heap tables to check that they get reported as corrupt. (See src/bin/pg_amcheck/t/004_verify_heapam.pl) Admittedly, the corruptions used in the tests are not necessarily representative of corruptions that might occur in the wild, but that is a hard problem to solve, since we don't know the statistical distribution of corruptions in the wild.

If you had a real, not fake, collation provider which actually provided a collation with an actual version number, stopped the server, changed the behavior of the collation as well as its version number, started the server, and ran REINDEX (OUTDATED), I think that would be a more real-world test. I'm not demanding that you write such a test. I'm just saying that it is strange that we don't have coverage for this anywhere, and was asking if you think there is such coverage, because, you know, maybe I just didn't see where that test was lurking.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2021-03-15 18:38:35 Re: pg_amcheck contrib application
Previous Message Avinash Kumar 2021-03-15 18:21:33 Re: Postgres crashes at memcopy() after upgrade to PG 13.