Re: pgsql: Move handling of database properties from pg_dumpall into pg_dum

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Move handling of database properties from pg_dumpall into pg_dum
Date: 2018-01-22 21:03:21
Message-ID: 12453.1516655001@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

I wrote:
> Commit 4bd371f6f's hack to emit "SET default_transaction_read_only = off"
> is gone: we now dodge that problem by the expedient of not issuing ALTER
> DATABASE SET commands until after reconnecting to the target database.
> Therefore, such settings won't apply during the restore session.

The buildfarm just pointed out an issue in that: the delayed SET commands
might affect the interpretation of data during the restore session.
Specifically, I see failures like this on machines where the prevailing
locale isn't C or US:

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 4871; 2618 34337 RULE rtest_emp rtest_emp_del pgbf
pg_restore: [archiver (db)] could not execute query: ERROR: invalid input syntax for type money: "$0.00"
LINE 3: ... ("old"."ename", CURRENT_USER, 'fired'::"bpchar", '$0.00'::"...
^
Command was: CREATE RULE "rtest_emp_del" AS
ON DELETE TO "rtest_emp" DO INSERT INTO "rtest_emplog" ("ename", "who", "action", "newsal", "oldsal")
VALUES ("old"."ename", CURRENT_USER, 'fired'::"bpchar", '$0.00'::"money", "old"."salary");

The dump dumped the money literal with '$' because we set up the
regression database with

ALTER DATABASE regression SET lc_monetary TO 'C';

but at the time we process the CREATE RULE during the restore, we're
still running with whatever the environmental default is.

Not very sure about the best solution here. Clearly, we'd better absorb
these settings to some extent, but how?

I thought for a bit about having pg_dump emit immediate SET operations
along with the ALTER versions, for any GUCs we deem important for
restore purposes. This would probably be about a 99% solution, but
we could only do it for ALTER DATABASE SET not for ALTER ROLE IN DATABASE
SET, because we couldn't be very sure which of the latter should apply.
So it would have some edge-case failure conditions.

Or we could reconnect after applying the ALTERs, ensuring that we have
the expected environment. But then we're back into the problem that
commit 4bd371f6f hoped to solve, namely that "ALTER DATABASE SET
default_transaction_read_only = on" breaks pg_dumpall (and hence
pg_upgrade).

I then thought about splitting the ALTERs into two separate TOC entries,
one for "desirable" GUCs (which we'd apply by reconnecting after emitting
its contents), and one for "undesirables", which we would not reconnect
after. That seemed a bit messy because of the need to maintain a
blacklist going forward, and the possibility that we couldn't agree on
what to blacklist.

Then it occurred to me that none of this works anyway for parallel
pg_restore --- including parallel pg_upgrade --- because the workers
are going to see all the ALTERs in place. And that means that commit
4bd371f6f's hack has been broken since parallel upgrade went in, in 9.3.

So at this point I'm feeling that letting pg_dumpall work around
default_transaction_read_only = on is just too much of a hack, and we
should forget about that consideration entirely. If we do, fixing the
current buildfarm failure is about a one-line patch: just reconnect
after processing DATABASE PROPERTIES.

If someone were to hold a gun to my head and say "make this work", what
I'd probably do is set up the desirable vs undesirable split and then
arrange for the "undesirable" GUCs to be postponed to the end of the
restore, after the parallel portion of the run is complete. But man,
that's a lot of ugliness.

I think the only reason that 4bd371f6f got in at all was that it was just
a two-line kluge, and we were willing to accept that amount of ugliness
to handle a corner case more nicely. The cost of solving it after the
pg_dump/dumpall refactoring will be a lot higher, and I for one don't
think it's worth it.

Comments?

regards, tom lane

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Tom Lane 2018-01-22 21:56:07 Re: pgsql: Move handling of database properties from pg_dumpall into pg_dum
Previous Message Tom Lane 2018-01-22 19:47:36 Re: pgsql: Move handling of database properties from pg_dumpall into pg_dum

Browse pgsql-hackers by date

  From Date Subject
Next Message Chapman Flack 2018-01-22 21:04:30 Re: Security leak with trigger functions?
Previous Message Robert Haas 2018-01-22 20:50:16 Re: [HACKERS] Deadlock in XLogInsert at AIX