Re: Frequent Update Project: Design Overview of HOTUpdates

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Frequent Update Project: Design Overview of HOTUpdates
Date: 2006-11-10 12:19:50
Message-ID: 1163161191.3634.668.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 2006-11-09 at 18:28 -0500, Tom Lane wrote:
> "Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
> > As more UPDATEs take place these tuple chains would grow, making
> > locating the latest tuple take progressively longer.
>
> This is the part that bothers me --- particularly the random-access
> nature of the search. I wonder whether you couldn't do something
> involving an initial table fill-factor of less than 50%, and having
> the first updated version living on the same heap page as its parent.
> Only when the active chain length was more than one (which you
> hypothesize is rare) would it actually be necessary to do a random
> access into the overflow table.

Thats appropriate sometimes, not others, but I'll investigate this
further so that its possible to take advantage of non-zero fillfactors
when they exist.

There's a number of distinct use-cases here:

If you have a very small, heavily updated table it makes a lot of sense
to use lower fillfactors as well.

If you have a larger table, using fillfactor 50% immediately doubles the
size of the table. If the updates are uneven, as they mostly are because
of the Benfold distribution/Pareto principle, then it has been found
that leaving space on block doesn't help the heavily updated portions of
a table, whereas it hinders the lightly updated portions of a table.

TPC-C and TPC-B both have uniformly distributed UPDATEs, so its easy to
use the fillfactor to great advantage there.

> More generally, do we need an overflow table at all, rather than having
> these overflow tuples living in the same file as the root tuples? As
> long as there's a bit available to mark a tuple as being this special
> not-separately-indexed type, you don't need a special location to know
> what it is. This might break down in the presence of seqscans though.

HOT currently attempts to place a subsequent UPDATE on the same page of
the overflow relation, but this doesn't happen (yet) for placing
multiple versions on same page. IMHO it could, but will think about it.

> > This allows the length of a typical tuple chain to be extremely short in
> > practice. For a single connection issuing a stream of UPDATEs the chain
> > length will no more than 1 at any time.
>
> Only if there are no other transactions being held open, which makes
> this claim a lot weaker.

True, but Nikhil has run tests that clearly show HOT outperforming
current situation in the case of long running transactions. The need to
optimise HeapTupleSatisfiesVacuum() and avoid long chains does still
remain a difficulty for both HOT and the current situation.

> > HOT can only work in cases where a tuple does not modify one of the
> > columns defined in an index on the table, and when we do not alter the
> > row length of the tuple.
>
> Seems like "altering the row length" isn't the issue, it's just "is
> there room on the page for the new version". Again, a generous
> fillfactor would give you more flexibility.

The copy-back operation can only work if the tuple fits in the same
space as the root tuple. If it doesn't you end up with a tuple
permanently in the overflow relation. That might not worry us, I guess.

Also, my understanding was that an overwrite operation could not vary
the length of a tuple (at least according to code comments).

> > [We'll be able to do that more efficiently when
> > we have plan invalidation]
>
> Uh, what's that got to do with it?

Currently the HOT code dynamically tests to see if the index columns
have been touched. If we had plan invalidation that would be able to be
assessed more easily at planning time, in cases where there is no BEFORE
trigger.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2006-11-10 12:32:26 Re: Frequent Update Project: Design Overview of HOTUpdates
Previous Message Peter Eisentraut 2006-11-10 11:40:35 Re: beta3 CFLAGS issue on openbsd