Re: HOT patch - version 15

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: HOT patch - version 15
Date: 2007-09-08 02:08:37
Message-ID: 200709080208.l8828bI29742@momjian.us
Lists: pgsql-patches

Simon Riggs wrote:
> > We could begin pruning only when the chain is N long. Currently N=2, but
> > we could set N=3+ easily enough. There's no code yet to actually count
> > that, but we can do that easily as we do each lookup. We should also be
> > able to remember the visibility result for each tuple in the chain to
> > decide whether pruning will be effective or not.
> >
> > I would say that if the buffer is already dirty and the chain is
> > prunable, we should prune it at the first possible chance.
>
> If we defer pruning until the page is full, worst case we could end
> up with a chain ~240 tuples long, which might need to be scanned
> repeatedly. That won't happen often, but it would be better to prune
> whenever we hit one of these conditions
> - when the block is full
> - when we reach the 16th tuple in a chain

I don't see how following a HOT chain is any slower than following an
UPDATE chain like we do now. In fact, what we do now is to have an
index row for every row version, while with HOT we will have one index
entry and all the row versions on the same page. That has to be better
than what we have now, even without pruning.

Let me define two terms:

prune -- modify ctid pointers to reduce the length of one HOT chain

defragment -- reduce the length of all HOT chains on the page
and collect free space

I think we all agree defragmenting should only happen when we need free
space on the page. But it seems that by the time we _know_ we are going
to be adding to the page, we can no longer defragment. I am thinking we
should go in the direction of passing a boolean down into the routines
so they know to defragment before they get into a case where they can't.
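
To make the boolean idea concrete, here is a minimal sketch on a toy page model. Nothing in it comes from the patch: ToyPage, toy_defragment, toy_reserve_space and the defragment_if_needed flag are all hypothetical names, and real defragmentation would also need whatever locking the page requires. It only shows the caller deciding up front whether the routine may defragment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a heap page; not a PostgreSQL structure. */
typedef struct ToyPage
{
    size_t free_bytes;        /* space usable right now */
    size_t reclaimable_bytes; /* space still held by dead row versions */
} ToyPage;

/* Hypothetical defragment step: hand the dead space back as free space. */
static void
toy_defragment(ToyPage *page)
{
    page->free_bytes += page->reclaimable_bytes;
    page->reclaimable_bytes = 0;
}

/*
 * Reserve 'needed' bytes on the page. The caller passes
 * defragment_if_needed = true only while defragmenting is still safe,
 * so the decision is made before we reach a state where we can't.
 */
static bool
toy_reserve_space(ToyPage *page, size_t needed, bool defragment_if_needed)
{
    assert(page != NULL);

    if (page->free_bytes < needed && defragment_if_needed)
        toy_defragment(page);

    if (page->free_bytes < needed)
        return false;           /* caller must look for another page */

    page->free_bytes -= needed;
    return true;
}
```

With the flag off, a request that does not fit simply fails and the page is left untouched; with it on, the dead space is reclaimed first and the same request can succeed.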

As for pruning, I think Simon's idea of not pruning unless the chain is
longer than X is correct. If there are a lot of updates to the page, the
page will fill and the chains will be pruned during a defragment; if
not, the chains are guaranteed to be shorter than X.
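
As a rough sketch of that rule, assuming the lookup code counts the chain length and notes whether any version is dead as it walks the chain (the function name and parameters below are hypothetical, not taken from the patch):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether one HOT chain is worth pruning.
 *
 * chain_len       - number of row versions seen while following the chain
 * has_dead_tuples - true if at least one version is dead to everyone
 * threshold       - the "X" above; its value is a tuning assumption
 */
static bool
chain_needs_prune(int chain_len, bool has_dead_tuples, int threshold)
{
    assert(chain_len >= 0 && threshold >= 0);

    /* Nothing can be removed, so pruning would be wasted work. */
    if (!has_dead_tuples)
        return false;

    /* Prune only when the chain is longer than X. */
    return chain_len > threshold;
}
```

Shorter chains are left alone here on the expectation that a later defragment of a full page will shorten them anyway.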

Ideally you would want to say that if the chain has been at length Y for
a certain length of time (meaning it isn't growing and a defragment
hasn't happened in a while), you would start to prune more aggressively,
but I see no easy way to track that.

Also, why all the talk of index lookups doing pruning? Can't a
sequential scan do pruning?

Would someone tell us exactly when pruning and defragmenting happen in
the current version of the patch? If we don't nail this issue down soon,
PostgreSQL 8.3 is going to sail without this patch.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
