Re: Remaining beta blockers

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Remaining beta blockers
Date: 2013-04-30 14:40:19
Message-ID: 1367332819.38842.YahooMailNeo@web162905.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:

>> Let's look at the "corner" this supposedly paints us into.  If a
>> later major release creates a better mechanism, current pg_dump and
>> load will already use it, based on the way matviews are created
>> empty and REFRESHed by pg_dump.  Worst case, we need to modify the
>> behavior of pg_dump running with the switch used by pg_upgrade to
>> use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
>> syntax is chosen) -- a command we would probably want at that point
>> anyway.  I'm not seeing cause for panic here.
>>
>> What is a real problem or risk with using this mechanism until we
>> engineer something better?  What problems with converting to a
>> later major release does anyone see?
>
> Well, it's a pg_upgrade hazard, if nothing else, isn't it?

I don't think so.  What do you see as a problem?

> Sure, I wouldn't allow that either.  My point is that I feel that
> could be engineered around in user space.  If you have a materialized
> view which could legitimately be empty (there are many for which that
> won't be an issue), then you can either arrange the view definition so
> that it isn't (e.g. by including a dummy record that clients can look
> for), or you can include a sentinel unlogged table that will contain a
> row if and only if materialized views have been refreshed since the
> last crash.  In the examples I can think of,
> indefinitely-stale-but-valid-at-some-point wouldn't be very good
> either, so I would anticipate needing to do some engineering around
> relative freshness anyway - e.g. keeping a side table that lists the
> last-refreshed-time for each matview.  Or, maybe I'd wouldn't care
> about tracking elapsed time per se, but instead want to track
> freshness relative to updates - e.g. set things up so that anyone who
> performs an action that would invalidate a row in the materialized
> view would also update a row someplace that would allow me to identify
> the row as stale.  In either case, handling the case where the view is
> altogether invalid doesn't seem to add a whole lot of additional
> complexity.
>
> Now, I can imagine cases where it does.  For example, suppose you have
> a cron job (which you trust to work) to refresh your materialized
> views every night.  Well, that means that you'll never be more than 24
> hours stale - but if any of those materialized views are unlogged,
> that also means that you could have no data at all for up to 24 hours
> following a crash.  Not great, because now you need some logic to
> handle just that one case that wouldn't be necessary if the DB did it
> for you.  But I just think it's a judgement call how serious one
> thinks that scenario is, vs. any other scenario where a boolean isn't
> adequate either.

"Staleness" is a completely different issue, in my view, from
quietly returning results that are not, and never were, accurate.
Sure we need to implement more refined "scannability" tests than
whether valid data from *some* point in time is present.  But that
should always be *part* of the scannability testing, and without it
I don't feel we have a feature of a quality suitable for delivery.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2013-04-30 14:49:56 Re: The missing pg_get_*def functions
Previous Message Kevin Grittner 2013-04-30 14:33:05 Re: Remaining beta blockers