Re: Reducing connection overhead in pg_upgrade compat check phase

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Reducing connection overhead in pg_upgrade compat check phase
Date: 2023-02-22 19:20:06
Message-ID: 20230222192006.GA79448@nathanxps13
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 22, 2023 at 10:37:35AM +0100, Daniel Gustafsson wrote:
>> On 18 Feb 2023, at 06:46, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>> The last 4 are for supported versions and, from a very
>> quick glance, seem possible to consolidate. That would bring us to a total
>> of 11 separate loops that we could consolidate into one. However, the data
>> type checks seem to follow a nice pattern, so perhaps this is easier said
>> than done.
>
> There is that, refactoring the data type checks leads to removal of duplicated
> code and a slight performance improvement. Refactoring the other checks to
> reduce overhead would be an interesting thing to look at, but this point in the
> v16 cycle might not be ideal for that.

Makes sense.

>> I wonder if it'd be better to perform all of the data type
>> checks in all databases before failing so that all of the violations are
>> reported. Else, users would have to run pg_upgrade, fix a violation, run
>> pg_upgrade again, fix another one, etc.
>
> I think that's better, and have changed the patch to do it that way.

Thanks. This seems to work as intended. One thing I noticed is that the
"failed check" log is only printed once, even if multiple data type checks
failed. I believe this is because this message uses PG_STATUS. If I
change it to PG_REPORT, all of the "failed check" messages appear. TBH I'm
not sure we need this message at all since a more detailed explanation will
be printed afterwards. If we do keep it around, I think it should be
indented so that it looks more like this:

Checking for data type usage checking all databases
failed check: incompatible aclitem data type in user tables
failed check: reg* data types in user tables

> One change this brings is that check.c contains version specific checks in the
> struct. Previously these were mostly contained in version.c (some, like the
> 9.4 jsonb check was in check.c) which maintained some level of separation.
> Splitting the array init is of course one option but it also seems a tad messy.
> Not sure what's best, but for now I've documented it in the array comment at
> least.

Hm. We could move check_for_aclitem_data_type_usage() and
check_for_jsonb_9_4_usage() to version.c since those are only used for
determining whether the check applies now. Otherwise, IMO things are in
roughly the right place. I don't think it's necessary to split the array.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-02-22 19:33:09 Re: meson and sslfiles.mk in src/test/ssl/
Previous Message Mark Dilger 2023-02-22 19:12:05 Re: Non-superuser subscription owners