Re: Materialized views WIP patch

From: Noah Misch <noah(at)leadboat(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Tiikkaja <pgmail(at)joh(dot)to>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Materialized views WIP patch
Date: 2013-01-24 18:07:54
Message-ID: 20130124180754.GB2448@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On Thu, Jan 17, 2013 at 07:54:55AM -0500, Robert Haas wrote:
> On Wed, Jan 16, 2013 at 1:38 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Where I really need someone to hit me upside the head with a
> >> clue-stick is the code I added to the bottom of RelationBuildDesc()
> >> in relcache.c. The idea is that on first access to an unlogged MV,
> >> to detect that the heap has been replaced by the init fork, set
> >> relisvalid to false, and make the heap look normal again.
> >
> > Hmm. I agree that relcache.c has absolutely no business doing that,
> > but not sure what else to do instead. Seems like it might be better
> > done at first touch of the MV in the parser, rewriter, or planner ---
> > but the fact that I can't immediately decide which of those is right
> > makes me feel that it's still too squishy.
>
> I think we shouldn't be doing that at all. The whole business of
> transferring the relation-is-invalid information from the relation to
> a pg_class flag seems like a Rube Goldberg device to me. I'm still
> not convinced that we should have a relation-is-invalid flag at all,
> but can we at least not have two?
>
> It seems perfectly adequate to detect that the MV is invalid when we
> actually try to execute a plan - that is, when we first access the
> heap or one of its indexes. So the bit can just live in the
> file-on-disk, and there's no need to have a second copy of it in
> pg_class.

Like Kevin, I want a way to distinguish unpopulated MVs from MVs that
genuinely yielded the empty set at last refresh. I agree that there's no
particular need to store that fact in pg_class, and I would much prefer only
storing it one way in any case. A user-visible disadvantage of the current
implementation is that relisvalid remains stale until something opens the rel.
That's fine for the system itself, but it can deceive user-initiated catalog
queries. Imagine a check_postgres action that looks for invalid MVs to
complain about. It couldn't just scan pg_class; it would need to first do
something that opens every MV.

I suggest the following:

1. Let an invalid MV have a zero-length heap. Distinguish a valid, empty MV
by giving it a page with no tuples. This entails VACUUM[1] not truncating
MVs below one page and the refresh operation, where necessary, extending
the relation from zero pages to one.
2. Remove pg_class.relisvalid.
3. Add a bool field to RelationData. The word "valid" is used in that context
to refer to the validity of the structure itself, so perhaps call the new
field rd_scannable. RelationIsFlaggedAsValid() can become a macro;
consider changing its name for consistency with the field name.
4. During relcache build, set the field to "RelationGetNumberBlocks(rel) != 0"
for MVs, fixed "true" for everyone else. Any operation that changes the
field must, and probably would anyway, instigate a relcache invalidation.
5. Expose a database function, say pg_relation_scannable(), for querying the
current state of a relation. This supports user-level monitoring.

Does that seem reasonable? One semantic difference to keep in mind is that
unlogged MVs will be considered invalid on the standby while valid on the
master. That's essentially an accurate report, so I won't mind it.

For the benefit of the archives, I note that we almost need not truncate an
unlogged materialized view during crash recovery. MVs are refreshed in a
VACUUM FULL-like manner: fill a new relfilenode, fsync it, and point the MV's
pg_class to that relfilenode. When a crash occurs with no refresh in flight,
the latest refresh had been safely synced. When a crash cuts short a refresh,
the pg_class update will not stick, and the durability of the old heap is not
in doubt. However, non-btree index builds don't have the same property; we
would need to force an immediate sync of the indexes to be safe here. It
would remain necessary to truncate unlogged MVs when recovering a base backup,
which may contain a partially-written refresh that did eventually commit.
Future MV variants that modify the MV in place would also need the usual
truncate on crash.

I'm going to follow this with a review covering a broader range of topics.

Thanks,
nm

[1] For the time being, it's unfortunate to VACUUM materialized views at all;
they only ever bear frozen tuples.

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Alvaro Herrera 2013-01-24 20:19:15 pgsql: Redefine HEAP_XMAX_IS_LOCKED_ONLY
Previous Message Alvaro Herrera 2013-01-24 16:57:08 Re: [COMMITTERS] pgsql: Improve concurrency of foreign key locking

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2013-01-24 18:09:28 Re: Materialized views WIP patch
Previous Message David Fetter 2013-01-24 17:51:46 LATERAL, UNNEST and spec compliance