Re: Frequent Update Project: Design Overview of HOT

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequent Update Project: Design Overview of HOT
Date: 2006-11-10 07:06:35
Message-ID: 1163142396.3071.10.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Ühel kenal päeval, N, 2006-11-09 kell 18:28, kirjutas Tom Lane:
> "Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
> > As more UPDATEs take place these tuple chains would grow, making
> > locating the latest tuple take progressively longer.
>
> This is the part that bothers me --- particularly the random-access
> nature of the search. I wonder whether you couldn't do something
> involving an initial table fill-factor of less than 50%, and having
> the first updated version living on the same heap page as its parent.
> Only when the active chain length was more than one (which you
> hypothesize is rare) would it actually be necessary to do a random
> access into the overflow table.
>
> More generally, do we need an overflow table at all, rather than having
> these overflow tuples living in the same file as the root tuples? As
> long as there's a bit available to mark a tuple as being this special
> not-separately-indexed type, you don't need a special location to know
> what it is. This might break down in the presence of seqscans though.

And why do you need to mark it as not-separately-indexed at all ?

We already cope with missing index pointers in VACUUM and I can't see
any other reason to have it.

What are the advantages of HOT over SITC (other than cool name) ?

Maybe just make HOT an extended SITC which can span pages.

In case of HOT together with reusing index tuples with DELETED bit set
we don't actually need copyback, but the same index pointer will follow
the head of live data automatically, maybe lagging only a small number
of versions.

> Actually, you omitted to mention the locking aspects of moving tuples
> around --- exactly how are you going to make that work without breaking
> concurrent scans?
>
> > This allows the length of a typical tuple chain to be extremely short in
> > practice. For a single connection issuing a stream of UPDATEs the chain
> > length will no more than 1 at any time.
>
> Only if there are no other transactions being held open, which makes
> this claim a lot weaker.
>
> > HOT can only work in cases where a tuple does not modify one of the
> > columns defined in an index on the table, and when we do not alter the
> > row length of the tuple.
>
> Seems like "altering the row length" isn't the issue, it's just "is
> there room on the page for the new version". Again, a generous
> fillfactor would give you more flexibility.

Maybe they hoped to take very light locks when new chaoin head is copied
iver the old one in the same-length case.

> http://www.postgresql.org/about/donate
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

NOTICE: This communication contains privileged or other confidential
information. If you have received it in error, please advise the sender
by reply email and immediately delete the message and any attachments
without copying or disclosing the contents.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeremy Drake 2006-11-10 07:29:35 beta3 CFLAGS issue on openbsd
Previous Message Pavan Deolasee 2006-11-10 06:23:28 Re: Frequent Update Project: Design Overview of HOT Updates