Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
Date: 2011-06-28 19:40:39
Message-ID: 20110628194039.GB10430@tornado.leadboat.com
Lists: pgsql-hackers

On Tue, Jun 28, 2011 at 02:11:11PM -0400, Robert Haas wrote:
> On Mon, Jun 27, 2011 at 10:43 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Mon, Jun 27, 2011 at 03:45:43PM -0400, Robert Haas wrote:
> >> On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> >> > [patch to avoid index rebuilds]
> >>
> >> With respect to the documentation hunks, it seems to me that the first
> >> hunk might be made clearer by leaving the paragraph of which it is a
> >> part as-is, and adding another paragraph afterwards beginning with the
> >> words "In addition".
> >
> > The added restriction elaborates on the transitivity requirement, so I wanted to
> > keep the new language adjacent to that.
>
> That makes sense, but it reads a bit awkwardly to me. Maybe it's just
> that the sentence itself is so complicated that I have difficulty
> understanding it. I guess it's the same problem as with the text you
> added about hash indexes: without thinking about it, it's hard to
> understand what territory is covered by the new sentence that is not
> covered by what's already there. In the case of the hash indexes, I
> think I have it figured out: we already know that we must get
> compatible hash values out of logically equal values of different
> datatypes. But it's possible that the inter-type cast changes the
> value in some arbitrary way and then compensates for it by defining
> the hash function in such a way as to compensate. Similarly, for
> btrees, we need the relative ordering of A and B to remain the same
> after casting within the opfamily, not to be rearranged somehow.
> Maybe the text you've got is OK to explain this, but I wonder if
> there's a way to do it more simply.

That's basically right, I think. Presently, there is no connection between
casts and operator family notions of equality. For example, a cast can change
the hash value. In general, that's not wrong. However, I wish to forbid it
when some hash operator family covers both the source and destination types of
the cast. Note that I don't especially care whether the stored bits changed or
not. I just want casts to preserve equality when an operator family defines
equality across the types involved in the cast. The specific way of
articulating that is probably vulnerable to improvement.
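
To make the requirement a bit more concrete, here is a toy sketch in plain C. It is not PostgreSQL code: eq_i32_i64, hash_i32, hash_i64, cast_widen, cast_shift and cast_preserves_equality are all hypothetical names invented for illustration, standing in for an operator family's cross-type equality operator, its hash support functions, and two candidate casts. It only spot-checks a handful of sample values, so it demonstrates the property rather than proving it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an opfamily's cross-type equality operator (hypothetical). */
static bool eq_i32_i64(int32_t a, int64_t b)
{
    return (int64_t) a == b;
}

/* Toy hash for the wider type: fold the two 32-bit halves together. */
static uint32_t hash_i64(int64_t v)
{
    uint64_t u = (uint64_t) v;

    return (uint32_t) (u ^ (u >> 32));
}

/* Toy hash for the narrower type; compatible with hash_i64 by construction. */
static uint32_t hash_i32(int32_t v)
{
    return hash_i64((int64_t) v);
}

typedef int64_t (*cast_fn) (int32_t);

/* A cast that keeps the logical value: this is the behavior we want. */
static int64_t cast_widen(int32_t v)
{
    return (int64_t) v;
}

/* A hypothetical cast that changes the logical value: to be forbidden. */
static int64_t cast_shift(int32_t v)
{
    return (int64_t) v + 1;
}

/*
 * Return true if, for every sample, the cast output is equal to its input
 * under the cross-type operator and the two hash values agree.
 */
static bool cast_preserves_equality(cast_fn cast, const int32_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);

    for (size_t i = 0; i < n; i++)
    {
        int64_t out = cast(samples[i]);

        if (!eq_i32_i64(samples[i], out))
            return false;
        if (hash_i32(samples[i]) != hash_i64(out))
            return false;
    }
    return true;
}
```

In this sketch cast_widen passes the check and cast_shift fails it. The check never compares stored bits; it only asks whether equality, as the (toy) operator family defines it, survives the cast.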

> > It would be valuable to avoid introducing a second chunk of code that knows
> > everything about the catalog entries behind an index. That's what led me to
> > put forward the most recent version as best. What do you find vile about that
> > approach? I wasn't comfortable with it at first, because I suspected the checks
> > in RelationPreserveStorage() might be important for correctness. Having studied
> > it some more, though, I think they just reflect the narrower needs of its
> > current sole user.
>
> Maybe vile is a strong word, but it seems like a modularity violation:
> we're basically letting DefineIndex() do some stuff we don't really
> want done, and then backing out the parts of it that we don't really
> want done after all. It seems like it would be better to provide
> DefineIndex with enough information not to do the wrong thing in the
> first place. Could we maybe pass stmt->oldNode to DefineIndex(), and
> let it work out the details?

True. I initially shied away from that, because we assume somewhat deep into
the stack that the new relation will have pg_class.oid = pg_class.relfilenode.
Here's the call stack in question:

RelationBuildLocalRelation
heap_create
index_create
DefineIndex
ATExecAddIndex

Looking at it again, it wouldn't bring the end of the world to add a relfilenode
argument to each. None of those have more than four callers. ATExecAddIndex()
would then call RelationPreserveStorage() before calling DefineIndex(), which
would in turn put things in a correct state from the start. Does that seem
appropriate? Offhand, I do like it better than what I had.
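
As a rough illustration of the shape of that change, here is a self-contained toy model in C. None of these are the real PostgreSQL functions or signatures: the toy_ names, the ToyRelation struct, and the convention that InvalidOid means "no old relfilenode to reuse" are assumptions made for the sketch. The point is only that once each level accepts a relfilenode argument, the lowest level no longer has to assume that the new relation's oid equals its relfilenode.

```c
#include <assert.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Toy relation descriptor; only the two fields under discussion. */
typedef struct ToyRelation
{
    Oid oid;
    Oid relfilenode;
} ToyRelation;

static Oid toy_next_oid = 1000;             /* toy OID generator */
static Oid toy_preserved_node = InvalidOid; /* last storage we asked to keep */

/* Stand-in for preserving the old storage; it just records the request. */
static void toy_preserve_storage(Oid relfilenode)
{
    toy_preserved_node = relfilenode;
}

/* Lowest level: use the caller's relfilenode if given, else default to oid. */
static ToyRelation toy_build_local_relation(Oid relid, Oid relfilenode)
{
    ToyRelation rel;

    assert(relid != InvalidOid);
    rel.oid = relid;
    rel.relfilenode = (relfilenode != InvalidOid) ? relfilenode : relid;
    return rel;
}

/* Each intermediate level simply passes the extra argument down. */
static ToyRelation toy_heap_create(Oid relid, Oid relfilenode)
{
    return toy_build_local_relation(relid, relfilenode);
}

static ToyRelation toy_index_create(Oid relfilenode)
{
    Oid relid = toy_next_oid++;

    return toy_heap_create(relid, relfilenode);
}

static ToyRelation toy_define_index(Oid old_node)
{
    return toy_index_create(old_node);
}

/* Top level: preserve the old storage first, then create the index on it. */
static ToyRelation toy_at_exec_add_index(Oid old_node)
{
    if (old_node != InvalidOid)
        toy_preserve_storage(old_node);
    return toy_define_index(old_node);
}
```

Calling toy_at_exec_add_index(InvalidOid) models the ordinary path, where the new index gets relfilenode equal to its oid. Passing an existing node models the no-rewrite path, where the index comes into existence already pointing at the old storage and nothing has to be backed out afterwards.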

Thanks,
nm
