Re: HOT patch - version 14

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: HOT patch - version 14
Date: 2007-08-30 15:16:39
Message-ID: 1314.1188486999@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> You are right - a new index might mean that an existing HOT chain
> is broken as far as the new index is concerned. The way we address
> that is by indexing the root tuple of the chain, but the index key is
> extracted from the last tuple in the chain. The index is marked "unusable"
> for all those existing transactions which can potentially see any
> intermediate tuples in the chain.

I don't think that works --- what if the last tuple in the chain isn't
committed good yet? If its inserter ultimately rolls back, you've
indexed the wrong value.

> Please see this document written by Greg Stark. He has nicely summarized
> how CREATE INDEX and CREATE INDEX CONCURRENTLY works with HOT.
> http://archives.postgresql.org/pgsql-patches/2007-07/msg00360.php

Isn't the extra machination for C.I.C. just useless complication?
What's the point of avoiding creation of new broken HOT chains when
you still have to deal with existing ones?

>> I also don't think I
>> believe the reasoning for not indexing DELETE_IN_PROGRESS hot-updated
>> tuples: what if the index completion commits, but the concurrent delete
>> rolls back? Then you've got a valid tuple that's not in the index.

> Since CREATE INDEX works with ShareLock on the relation, we can
> safely assume that we can't find any DELETE_IN_PROGRESS tuples except
> those deleted by our own transaction. The only exception is system relation,
> but we don't do HOT updates on system relation.

That chain of reasoning is way too shaky for me. Essentially what
you're saying is you'll promise not to corrupt non-system indexes.
Nor am I really thrilled about having to disable HOT for system
catalogs.

> The only reason to redefine MaxHeapTuplesPerPage to higher side is
> because HOT allows us to accommodate more tuples per page because
> of redirect-dead line pointers.

No, it doesn't allow more tuples per page. It might mean there can be
more line pointers than that on a page, but not more actual tuples.
The question is whether there is any real use in allowing more line
pointers than that --- the limit is already unrealistically high,
since it assumes no data content in any of the tuples. If there is a
rationale for it then you should invent a different constant
MaxLinePointersPerPage or some such, but I really think there's no
point.

> Doubling the value is chosen as a balance between heap page
> utilization, line pointer bloating and overhead for bitmap scans.

I'd say it allows a ridiculous amount of line pointer bloat.

>> Even if it's safe, ISTM what you're mostly accomplishing there is to
>> expend a lot of cycles while holding exclusive lock on the page, when
>> there is good reason to think that you're blocking other people who are
>> interested in using the page. Eliminating the separation between that
>> and cleanup would also allow eliminating the separate "PD_FRAGMENTED"
>> page state.

> The reason we did it that way because repairing fragmentation seems
> much more costly that pruning. Please note that we prune a single
> chain during index fetch. Its only for heap-scans (and VACUUM) that
> we try to prune all chains in the page. So unless we are doing heap-scan,
> I am not sure if we are spending too much time holding the exclusive
> lock. I agree we don't have any specific numbers to prove that though.

If you don't have numbers proving that this extra complication has a
benefit, I'd vote for simplifying it. The SnapshotAny case is going to
bite other people besides you in future.

> Another reasoning behind separating these two steps is: pruning
> requires exclusive lock whereas repairing fragmentation requires
> cleanup lock.

This is nonsense. Those are the same lock. If you have the former and
not the latter, it just means that you *know* there is contention for
the page. It seems to me that performing optional maintenance work in
such a situation is completely wrong.

regards, tom lane

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Gavin M. Roy 2007-08-30 15:39:30 Re: pg_dump --no-tablespaces patch
Previous Message Gregory Stark 2007-08-30 10:20:04 Re: HOT patch - version 14