Re: Reducing connection overhead in pg_upgrade compat check phase

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Reducing connection overhead in pg_upgrade compat check phase
Date: 2023-07-10 14:43:23
Message-ID: A31FC972-FCBE-47CE-85F0-182B58C29422@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 8 Jul 2023, at 23:43, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:

Thanks for reviewing!

> Since "script" will be NULL and "db_used" will be false in the first
> iteration of the loop, couldn't we move this stuff to before the loop?

We could. It's done this way to match how all the other checks are performing
the inner loop for consistency. I think being consistent is better than micro
optimizing in non-hot codepaths even though it adds some redundancy.

> nitpick: І think the current code has two spaces at the beginning of this
> format string. Did you mean to remove one of them?

Nice catch, I did not. Fixed.

> Won't "script" always be initialized here? If I'm following this code
> correctly, I think everything except the fclose() can be removed.

You are right that this check is superfluous. This is again an artifact of
modelling the code around how the other checks work for consistency. At least
I think that's a good characteristic of the code.

> I think this is unnecessary since we assign "cur_check" at the beginning of
> every loop iteration. I see two of these.

Right, this is a pointless leftover from a previous version which used a
while() loop, and I had missed removing them. Fixed.

> +static int n_data_types_usage_checks = 7;
>
> Can we determine this programmatically so that folks don't need to remember
> to update it?

Fair point, I've added a counter loop to the beginning of the check function to
calculate it.

> IMHO it's a little strange that this is initialized to all "true", only
> because I think most other Postgres code does the opposite.

Agreed, but it made for a less contrived codepath in knowing when an error has
been seen already, to avoid duplicate error output, so I think it's worth it.

> Do you think we should rename these functions to something like
> "should_check_for_*"? They don't actually do the check, they just tell you
> whether you should based on the version.

I've been pondering that too, and did a rename now along with moving them all
to a single place as well as changing the comments to make it clearer.

> In fact, I wonder if we could just add the versions directly to
> data_types_usage_checks so that we don't need the separate hook functions.

We could, but it would be sort of contrived I think since some check <= and
some == while some check the catversion as well (and new ones may have other
variants. I think this is the least paint-ourselves-in-a-corner version, if we
feel it's needlessly complicated and no other variants are added we can revisit
this.

--
Daniel Gustafsson

Attachment Content-Type Size
v6-0001-pg_upgrade-run-all-data-type-checks-per-connectio.patch application/octet-stream 37.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2023-07-10 14:47:12 Re: remaining sql/json patches
Previous Message Melih Mutlu 2023-07-10 14:41:11 Re: Changing types of block and chunk sizes in memory contexts