From: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com> |
Subject: | Re: Add parallelism and glibc dependent only options to reindexdb |
Date: | 2019-07-01 16:14:20 |
Message-ID: | CAOBaU_bg3VheGYkjjvPd6Buw2Uk7yqAAGwXy7BXm6DwytWAdwA@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jul 1, 2019 at 3:51 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>
> Please don't reuse a file name as generic as "parallel.c" -- it's
> annoying when navigating source. Maybe conn_parallel.c multiconn.c
> connscripts.c admconnection.c ...?
I could use scripts_parallel.[ch] as I've already used it in the #define part?
> If your server crashes or is stopped midway during the reindex, you
> would have to start again from scratch, and it's tedious (if it's
> possible at all) to determine which indexes were missed. I think it
> would be useful to have a two-phase mode: in the initial phase reindexdb
> computes the list of indexes to be reindexed and saves them into a work
> table somewhere. In the second phase, it reads indexes from that table
> and processes them, marking them as done in the work table. If the
> second phase crashes or is stopped, it can be restarted and consults the
> work table. I would keep the work table, as it provides a bit of an
> audit trail. It may be important to be able to run even if unable to
> create such a work table (because of the <ironic>numerous</> users that
> DROP DATABASE postgres).
Or we could create a table locally in each database? That would fix
this problem and probably make the code simpler.
It also raises some additional concerns about data expiration. I
guess that someone could launch the tool by mistake, kill reindexdb,
and then run it again 2 months later, after a lot of new objects have
been added, for instance.
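To make the two-phase idea concrete, here is a rough sketch of it. This is not reindexdb code: it uses Python with an in-memory SQLite database as a stand-in for the work table, and the table name reindex_work, its columns, and the plan()/run() helpers are all hypothetical names chosen for illustration.

```python
import sqlite3


def plan(conn, indexes):
    """Phase 1: record the indexes to process, unless a plan already exists."""
    # Hypothetical work table; the name and columns are assumptions.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reindex_work ("
        " index_name TEXT PRIMARY KEY,"
        " done INTEGER NOT NULL DEFAULT 0)"
    )
    already_planned = conn.execute(
        "SELECT count(*) FROM reindex_work"
    ).fetchone()[0]
    if already_planned == 0:
        conn.executemany(
            "INSERT INTO reindex_work (index_name) VALUES (?)",
            [(name,) for name in indexes],
        )
    conn.commit()


def run(conn, reindex):
    """Phase 2: process pending entries, marking each one done as it finishes."""
    pending = [
        row[0]
        for row in conn.execute(
            "SELECT index_name FROM reindex_work"
            " WHERE done = 0 ORDER BY index_name"
        )
    ]
    for name in pending:
        reindex(name)  # stand-in for issuing REINDEX; may fail midway
        conn.execute(
            "UPDATE reindex_work SET done = 1 WHERE index_name = ?", (name,)
        )
        conn.commit()
    return pending
```

If run() is interrupted, calling it again only picks up the rows still marked as not done, and the finished rows stay behind as the audit trail. Note that plan() in this sketch keeps an existing work table as-is, which is exactly where the data expiration concern comes from: a plan left over from months ago would be resumed without noticing the objects added since.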
> The "glibc filter" thing (which I take to mean "indexes that depend on
> collations") would apply to the first phase: it just skips adding other
> indexes to the work table. I suppose ICU collations are not affected,
> so the filter would be for glibc collations only?
Indeed, ICU shouldn't need such a filter. xxx_pattern_ops based
indexes are also excluded.
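As an illustration of the filter only (not the query the patch runs), the rule could be sketched like this in Python. The dictionary layout, the "provider" and "opclass" keys, and the depends_on_glibc() name are hypothetical; a real implementation would read this information from the system catalogs.

```python
def depends_on_glibc(index):
    """Return True if any key column uses a libc collation
    and is not indexed with an xxx_pattern_ops operator class."""
    return any(
        col["provider"] == "libc"
        and not col["opclass"].endswith("_pattern_ops")
        for col in index["columns"]
    )


# Hypothetical index descriptions, for illustration only.
candidates = [
    {"name": "idx_libc", "columns": [{"provider": "libc", "opclass": "text_ops"}]},
    {"name": "idx_icu", "columns": [{"provider": "icu", "opclass": "text_ops"}]},
    {"name": "idx_pattern", "columns": [{"provider": "libc", "opclass": "text_pattern_ops"}]},
]
selected = [idx["name"] for idx in candidates if depends_on_glibc(idx)]
```

Only the first index is kept here: the ICU one and the xxx_pattern_ops one are both filtered out.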
> The --glibc-dependent
> switch seems too ad-hoc. Maybe "--exclude-rule=glibc"? That way we can
> add other rules later. (Not "--exclude=foo" because we'll want to add
> the possibility to ignore specific indexes by name.)
That's a good point, I like the --exclude-rule switch.