Re: Commits 8de72b and 5457a1 (COPY FREEZE)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Commits 8de72b and 5457a1 (COPY FREEZE)
Date: 2012-12-11 01:04:55
Message-ID: CA+TgmobQ7g5rYGs3DNFLGxyo2hnCzg9FGkrMcKwZWF8brLRo0A@mail.gmail.com
Lists: pgsql-hackers

On Sun, Dec 9, 2012 at 3:06 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> I favor[1] unconditionally letting older snapshots see the new rows after the
> CREATE+COPY transaction commits. To recap, making affected scans see an empty
> table is as wrong as making them see those rows. Robert also listed[2] that
> as a credible option, and I don't recall anyone opining against it in previous
> discussions. I did perceive an undercurrent preference, all other things
> being equal, for an optimization free from semantic side-effects. I shared
> that preference, but investigations showed that we must compromise something.

You know, I hadn't been taking that option terribly seriously, but
maybe we ought to reconsider it. It would certainly be simpler, and
as you point out, it's not really any worse from an MVCC point of view
than anything else we do. Moreover, it would make this available to
clients like pg_dump without further hackery.

I think the current behavior, where we treat FREEZE as a hint, is just
awful. Regardless of whether the behavior is automatic or manually
requested, the idea that you might get the optimization or not
depending on the timing of relcache flushes seems very much
undesirable. I mean, if the optimization is actually important for
performance, then you want to get it when you ask for it. If it
isn't, then why bother having it at all? Let's say that COPY FREEZE
normally doubles performance on a data load, which therefore takes 8
hours. Somebody who suddenly loses that benefit because of a relcache
flush that they can't prevent or control, and ends up with a 16-hour
data load, is going to pop a gasket.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
