Re: pg_dump compatibility level / use create view instead of create table/rule

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alex Williams <valenceshell(at)protonmail(dot)com>
Cc: "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg_dump compatibility level / use create view instead of create table/rule
Date: 2019-10-10 15:20:14
Message-ID: 1095.1570720814@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

Alex Williams <valenceshell(at)protonmail(dot)com> writes:
> [ gripes about pg_dump printing REPLICA IDENTITY NOTHING for a view ]

I spent a little bit of time trying to reproduce this, and indeed I can,
in versions before v10.

regression=# create table mytab (f1 int primary key, f2 text);
CREATE TABLE
regression=# create view myview as select * from mytab group by f1;
CREATE VIEW

This situation is problematic for pg_dump because validity of the
view depends on the existence of mytab's primary key constraint,
and we don't create primary keys till late in the restore process.
So it has to break myview into two parts, one to emit during normal
table/view creation and one to emit after index creation.

With 9.5's pg_dump, what comes out is:

--
-- Name: myview; Type: TABLE; Schema: public; Owner: postgres
--

CREATE TABLE public.myview (
f1 integer,
f2 text
);

ALTER TABLE ONLY public.myview REPLICA IDENTITY NOTHING;

ALTER TABLE public.myview OWNER TO postgres;

and then later:

--
-- Name: myview _RETURN; Type: RULE; Schema: public; Owner: postgres
--

CREATE RULE "_RETURN" AS
ON SELECT TO public.myview DO INSTEAD SELECT mytab.f1,
mytab.f2
FROM public.mytab
GROUP BY mytab.f1;

The reason we get "REPLICA IDENTITY NOTHING" is that a view's relreplident
is set to 'n' not 'd', which might not have been a great choice. But why
does pg_dump print anything --- it knows perfectly well that it should not
emit REPLICA IDENTITY for relkinds that don't have storage? The answer
emerges from looking at the code that breaks the dependency loop:

/* pretend view is a plain table and dump it that way */
viewinfo->relkind = 'r'; /* RELKIND_RELATION */

After that, pg_dump *doesn't* know it's a view, which also explains
why the comment says TABLE not VIEW.

This is fixed in v10 and up thanks to d8c05aff5. I was hesitant to
back-patch that at the time, but now that it's survived in the field
for a couple years, I think a good case could be made for doing so.
After a bit of looking around, the main argument I can find against
it is that emitting 'CREATE OR REPLACE VIEW' in a dropStmt will
break pg_restore versions preceding this commit:

Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Branch: master Release: REL_10_BR [ac888986f] 2016-11-17 14:59:13 -0500
Branch: REL9_6_STABLE Release: REL9_6_2 [0eaa5118a] 2016-11-17 14:59:19 -0500
Branch: REL9_5_STABLE Release: REL9_5_6 [a7864037d] 2016-11-17 14:59:23 -0500
Branch: REL9_4_STABLE Release: REL9_4_11 [e69b532be] 2016-11-17 14:59:26 -0500

Improve pg_dump/pg_restore --create --if-exists logic.

Teach it not to complain if the dropStmt attached to an archive entry
is actually spelled CREATE OR REPLACE VIEW, since that will happen due to
an upcoming bug fix. Also, if it doesn't recognize a dropStmt, have it
print a WARNING and then emit the dropStmt unmodified. That seems like a
much saner behavior than Assert'ing or dumping core due to a null-pointer
dereference, which is what would happen before :-(.

Back-patch to 9.4 where this option was introduced.

AFAIR, we have not had complaints about back-rev pg_restore failing
on archives made by v10 and up; but perhaps it's more likely that
someone would try to use, say, 9.5.5 pg_restore with a dump made by
9.5.20 pg_dump.

An alternative that just responds to Alex's issue without fixing the
other problems d8c05aff5 fixed is to hack the dependency-loop code
like this:

/* pretend view is a plain table and dump it that way */
viewinfo->relkind = 'r'; /* RELKIND_RELATION */
viewinfo->relkind = 'r'; /* RELKIND_RELATION */
+ viewinfo->relreplident = 'd'; /* REPLICA_IDENTITY_DEFAULT */

That's mighty ugly but it doesn't seem to carry any particular
risk.

Thoughts?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-10-10 15:45:46 Re: let's make the list of reportable GUCs configurable (was Re: Add %r substitution for psql prompts to show recovery status)
Previous Message Stephen Frost 2019-10-10 14:40:37 Re: Transparent Data Encryption (TDE) and encrypted files

Browse pgsql-sql by date

  From Date Subject
Next Message Steve Midgley 2019-10-11 09:20:27 Modeling graph data in Postgres?
Previous Message Alex Williams 2019-10-10 00:24:12 Re: pg_dump compatibility level / use create view instead of create table/rule