pg_dump versus rules, once again

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: pg_dump versus rules, once again
Date: 2016-11-16 19:39:44
Message-ID: 19092.1479325184@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I looked into the problem reported at
https://www.postgresql.org/message-id/b3690957-fd8c-6def-d3ec-e589887dd0f1%40codata.eu

It's easy to reproduce. Given this simple database:

create table tt (f1 int primary key, f2 text);
create view vv as select * from tt group by f1;

pg_dump with the --clean option will generate

DROP RULE "_RETURN" ON public.vv;

which the backend rejects:

ERROR: cannot drop rule _RETURN on view vv because view vv requires it
HINT: You can drop view vv instead.

The reason for this is that because the view is dependent on tt's primary
key constraint (since it omits an otherwise-required "GROUP BY f2"),
pg_dump has a circular dependency to deal with: it wants to create the
view pre-data, but the view definition won't work until after the pkey has
been created post-data.

Our longtime solution to circularities involving views is to break the
view into CREATE TABLE and then CREATE RULE "_RETURN", exploiting a
horribly ancient backwards-compatibility hack in the backend that will
turn an empty table into a view if it gets a command to create an ON
SELECT rule for it. That's fine until you add --clean to the mix, which
causes pg_dump to blindly emit a DROP RULE and then DROP TABLE. Lose.

One way to fix this would be to add code to the backend so that
DROP RULE "_RETURN" converts the view back into a table, but ick.

We've talked before about replacing this _RETURN-rule business with
CREATE OR REPLACE VIEW, ie the idea would be for pg_dump to first emit
a dummy view with the right column names/types, say

CREATE VIEW vv AS SELECT null::int AS f1, null::text AS f2;

and then later when it's safe, emit CREATE OR REPLACE VIEW with the view's
real query. The main point of this according to past discussion would be
to eliminate dump files' dependency on the _RETURN-rule implementation of
views, so that someday in the far future we could change that if we
wished. However, if that were how pg_dump dealt with circular view
dependencies, then it would not take much more code to emit

CREATE OR REPLACE VIEW vv AS SELECT null::int AS f1, null::text AS f2;

as a substitute for the DROP RULE "_RETURN" step in a --clean script.
(Later, after we'd gotten rid of whatever was circularly depending on
that, we would emit DROP VIEW vv.)

So I'm thinking of going and doing this. Any objections?

Although this is a bug fix, I'm not sure whether to consider
back-patching. The case isn't that hard to work around -- either ignore
the error, or change your view to spell out its GROUP BY in full.
But it'd be annoying to hit this during pg_upgrade for instance.

CREATE OR REPLACE VIEW has existed since 7.3, so we're not creating much
of a portability problem at the server end if we make this change.
However, I notice that the kluge that was added to RestoreArchive() for
--if-exists will dump core (Assert failure or null pointer dereference)
if an archived dropStmt isn't what it expects. I think that's broken
anyway, but it'd become actively broken as soon as we start handling views
this way, so we'd need to back-patch at least some change there.
Probably it's sufficient to teach that code to do nothing to statements
it doesn't recognize.

BTW, the ability to create a view that has this hazard has existed since
9.1.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-11-16 19:45:38 Re: Unlogged tables cleanup
Previous Message Robert Haas 2016-11-16 19:34:56 Re: Remove the comment on the countereffectiveness of large shared_buffers on Windows