Quick Links

Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Noah Misch <noah(at)leadboat(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE
Date:	2011-06-29 13:42:06
Message-ID:	BANLkTi=hH3M28poMzy_rVbm-Hmpz_q4ShQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Tue, Jun 28, 2011 at 3:40 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Tue, Jun 28, 2011 at 02:11:11PM -0400, Robert Haas wrote:
>> On Mon, Jun 27, 2011 at 10:43 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> > On Mon, Jun 27, 2011 at 03:45:43PM -0400, Robert Haas wrote:
>> >> On Wed, Jun 15, 2011 at 1:03 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> >> > [patch to avoid index rebuilds]
>> >>
>> >> With respect to the documentation hunks, it seems to me that the first
>> >> hunk might be made clearer by leaving the paragraph of which it is a
>> >> part as-is, and adding another paragraph afterwards beginning with the
>> >> words "In addition".
>> >
>> > The added restriction elaborates on the transitivity requirement, so I wanted to
>> > keep the new language adjacent to that.
>>
>> That makes sense, but it reads a bit awkwardly to me. Maybe it's just
>> that the sentence itself is so complicated that I have difficulty
>> understanding it. I guess it's the same problem as with the text you
>> added about hash indexes: without thinking about it, it's hard to
>> understand what territory is covered by the new sentence that is not
>> covered by what's already there. In the case of the hash indexes, I
>> think I have it figured out: we already know that we must get
>> compatible hash values out of logically equal values of different
>> datatypes. But it's possible that the inter-type cast changes the
>> value in some arbitrary way and then compensates for it by defining
>> the hash function in such a way as to compensate. Similarly, for
>> btrees, we need the relative ordering of A and B to remain the same
>> after casting within the opfamily, not to be rearranged somehow.
>> Maybe the text you've got is OK to explain this, but I wonder if
>> there's a way to do it more simply.
>
> That's basically right, I think. Presently, there is no connection between
> casts and operator family notions of equality. For example, a cast can change
> the hash value. In general, that's not wrong. However, I wish to forbid it
> when some hash operator family covers both the source and destination types of
> the cast. Note that, I don't especially care whether the stored bits changed or
> not. I just want casts to preserve equality when an operator family defines
> equality across the types involved in the cast. The specific way of
> articulating that is probably vulnerable to improvement.
>
>> > It would be valuable to avoid introducing a second chunk of code that knows
>> > everything about the catalog entries behind an index. ?That's what led me to the
>> > put forward the most recent version as best. ?What do you find vile about that
>> > approach? ?I wasn't comfortable with it at first, because I suspected the checks
>> > in RelationPreserveStorage() might be important for correctness. ?Having studied
>> > it some more, though, I think they just reflect the narrower needs of its
>> > current sole user.
>>
>> Maybe vile is a strong word, but it seems like a modularity violation:
>> we're basically letting DefineIndex() do some stuff we don't really
>> want done, and then backing it out parts of it that we don't really
>> want done after all. It seems like it would be better to provide
>> DefineIndex with enough information not to do the wrong thing in the
>> first place. Could we maybe pass stmt->oldNode to DefineIndex(), and
>> let it work out the details?
>
> True. I initially shied away from that, because we assume somewhat deep into
> the stack that the new relation will have pg_class.oid = pg_class.relfilenode.
> Here's the call stack in question:
>
> RelationBuildLocalRelation
> heap_create
> index_create
> DefineIndex
> ATExecAddIndex
>
> Looking at it again, it wouldn't bring the end of the world to add a relfilenode
> argument to each. None of those have more than four callers.

Yeah. Those functions take an awful lot of arguments, which suggests
that some refactoring might be in order, but I still think it's
cleaner to add another argument than to change the state around
after-the-fact.

> ATExecAddIndex()
> would then call RelationPreserveStorage() before calling DefineIndex(), which
> would in turn put things in a correct state from the start. Does that seem
> appropriate? Offhand, I do like it better than what I had.

I wish we could avoid the whole death-and-resurrection thing
altogether, but off-hand I'm not seeing a real clean way to do that.
At the very least we had better comment it to death.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE at 2011-06-28 19:40:39 from Noah Misch

Responses

Re: Avoid index rebuilds for no-rewrite ALTER TABLE ALTER TYPE at 2011-06-30 15:55:19 from Noah Misch

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Robert Haas	2011-06-29 13:43:57	Re: Process local hint bit cache
Previous Message	Merlin Moncure	2011-06-29 13:39:55	Re: Process local hint bit cache