Re: Collation versioning

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Douglas Doole <dougdoole(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation versioning
Date: 2020-10-25 02:35:23
Message-ID: CA+hUKGLnLVQyi=6koWHSc+qnoDwvV8SfTXgO6VdJwJ7vCHtLUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Oct 23, 2020 at 2:07 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> While reviewing the changes, I found a couple of minor issues
> (inherited from the previous versions). It's missing a
> free_objects_addresses in recordDependencyOnCollations, and there's a
> small typo. Inline diff:

Thanks, fixed.

I spent the past few days trying to simplify this patch further.
Here's what I came up with:

1. Drop the function dependencyExists() and related code, which in
earlier versions tried to avoid creating duplicate pg_depends rows by
merging with existing rows. This was rather complicated, and there
are not many duplicates anyway, and it's easier to suppress duplicate
warnings at warning time (see do_collation_version_check()). (I'm not
against working harder to make pg_depend rows unique, but it's not
required for this and I didn't like that way of doing it.)

2. Use index_update_collation_versions() as the workhorse for
REINDEX, binary_upgrade_set_index_coll_version() and ALTER INDEX ...
ALTER COLLATION ... REFRESH_VERSION, instead of having multiple
functions doing similar things. (I wondered about changing the
REINDEX case to simply blow away all dependencies and re-do the
analysis from scratch, which might be useful for future applications
of refobjversion where the set of depended-on objects might change,
but that's out of scope for now and could be added easily enough.)

3. Likewise, drop the function getDependencyVersion() and the
modifications to changeDependenciesOf(); that was yet more catalog
manipulation stuff, but I couldn't see why we don't just call
index_update_collation_versions(newIndexId) after the CREATE INDEX
CONCURRENTLY switcheroo step (see index_concurrently_swap()).

4. Drop track_version from find_expr_references_context, and also
drop the pre-existing code that skipped default collations. I think
it's easier to just let find_expr_references() collect everything, and
leave it to later code to worry about whether to suppress "system
pinned" stuff. It reduces the amount of code that has to know about
refobjversion.

5. General code tidying, pgindent, wordsmithing etc.

I'm not sure what I think about recordMultipleDependencies() being the
function that knows how to fetch the version but I'm not sure if it's
worth the refactoring effort to make ObjectAddresses (essentially a
stretchy vector type) able to carry versions so we can pass that stuff
in.

Attachment Content-Type Size
v32-0001-Remove-pg_collation.collversion.patch text/x-patch 27.7 KB
v32-0002-Add-pg_depend.refobjversion.patch text/x-patch 12.2 KB
v32-0003-Track-collation-versions-for-indexes.patch text/x-patch 115.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2020-10-25 05:18:36 Re: pgstat_report_activity() and parallel CREATE INDEX (was: Parallel index creation & pg_stat_activity)
Previous Message Hou, Zhijie 2020-10-25 01:22:55 Fix typo in src/backend/utils/cache/lsyscache.c