Re: Materialized view assertion failure in HEAD

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized view assertion failure in HEAD
Date: 2013-03-18 18:39:50
Message-ID: 1363631990.30248.YahooMailNeo@web162905.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:

>> BTW, is there a really solid reason why a matview couldn't be
>> allowed to have OIDs on demand, and thereby dodge this whole
>> problem? I'm thinking that the analogy to regular views not
>> having OIDs is not a very good argument, because certainly
>> matview rows are going to need all the other system columns.
>
> Some experimentation says that actually it works just fine to
> create matviews with OIDs; the only part of the backend that has
> a problem with that is REFRESH MATERIALIZED VIEW, which is easily
> fixed as per attached.

Sure that works.  Initial versions of the patch did it that way,
and the paint is not as dry in the "no OIDS" area because it is a
recent change to the patch which I haven't (yet) tested as
thoroughly.  What happened was this: in his review Noah asked why
WITH OIDS and WITHOUT OIDS were supported, meaning (as I later
found out) that he questioned support for the *old syntax* without
the parentheses.  I misunderstood and thought he was questioning
whether OIDs were actually useful in materialized views, and came
to the conclusion that they were not.

An oid column in a materialized view will not be significantly more
stable than its ctid.  The same "logical" row could easily have a
different OID on a REFRESH or even an incremental update (due to
the probable need to handle some incremental updates by deleting
and re-inserting portions of the MV).  Allowing an "Object ID" to
be generated when it is not as stable as the name or its usage in a
table might suggest seems likely to lead to all kinds of trouble
and false bug reports down the road.

That said, there is no doubt that it would be the easiest thing to
do; but once that genie is out of the bottle, do we take it away
later, or deal with the headaches forever?  Neither seems appealing
to me.

> If we go in this direction it would probably also be a good idea
> to revert the hack in psql/describe.c to prevent it from printing
> Has OIDs for matviews.

Sure, although I fail to see what that "if" any more of a hack than
100 others in that area of the code.

> Also, the fact that Joachim saw this problem in a dump/reload
> implies that pg_dump is failing to preserve the state of
> relhasoids for matviews, because tvmm hasn't got OIDs in the
> regression database, but his stack trace says it did after dump
> and reload.  I haven't looked into how come, but whichever way we
> jump as far as the backend behavior, pg_dump is clearly confused.

Yeah, that was the bug I was fixing here.  The MV was created
between two SETs of default_with_oids based on table names with
different settings of this.  If we change back to allowing OIDs for
MVs we need to reinstate the test for whether to omit SETs for
default_with_oids for them instead of having an assert that they
don't have OIDs.

> Anyway, I think it makes more sense to go in the direction of
> making the case work than to introduce messy kluges to prevent
> matviews from having OIDs.

If you can come up with a single use case where having OIDs for
them makes sense and is useful, rather than being a loaded foot-gun
for users, it would go a long way toward convincing me that is the
way to go.  Granted, my first go at switching away from that didn't
touch all the bases, but at this point it's probably at least as
much of a change to revert to supporting OIDs as to fix the
omissions.  I will also grant that I haven't been able to see a
really pretty way to do it, but that is mostly because the whole
area of handling this is full of grotty special cases which need to
be handled already.

> Also ... while I was nosing around I noticed this bit in
> InitPlan():
>  
>       /*
> +     * Unless we are creating a view or are creating a materialized view WITH
> +     * NO DATA, ensure that all referenced relations are scannable.
> +     */
> +     if ((eflags & EXEC_FLAG_WITH_NO_DATA) == 0)
> +         ExecCheckRelationsScannable(rangeTable);
>
> ISTM that suppressing this check during a matview refresh is
> rather broken, because then it won't complain if the matview
> reads some other matview that's in a nonscannable state.  Is that
> really the behavior we want?

It isn't, nor is it the behavior we get:

test=# create materialized view x as select 1;
SELECT 1
test=# create materialized view y as select * from x;
SELECT 1
test=# refresh materialized view x with no data;
REFRESH MATERIALIZED VIEW
test=# refresh materialized view y;
ERROR:  materialized view "x" has not been populated
HINT:  Use the REFRESH MATERIALIZED VIEW command.

If that is what you got out of the comment, the comment probably
needs some work.

> (Scanning the rangetable is probably wrong for this anyway
> --- it'd be better to put the responsibility into the
> table-scan-node initialization functions.)

Don't we want to lock the appropriate relations at the point where
I have this?  Would it be right to wait until the point you suggest
from the locking POV?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2013-03-18 18:42:23 Re: Enabling Checksums
Previous Message Josh Berkus 2013-03-18 18:28:24 Re: Enabling Checksums