Re: Much Ado About COUNT(*)

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: "Jonah H(dot) Harris" <jharris(at)tvi(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Much Ado About COUNT(*)
Date: 2005-01-13 04:15:58
Message-ID: 200501130415.j0D4FwK23210@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-announce pgsql-hackers pgsql-patches

Jonah H. Harris wrote:
> 1. Is there any answer to Bruce?s last statement in the thread, ?Re:
> [PERFORM] COUNT(*) again (was Re: Index/Function organized?
> (http://archives.postgresql.org/pgsql-hackers/2003-10/msg00245.php)

Let me give you my ideas in the above URL and why they are probably
wrong.

My basic idea was to keep a status bit on each index entry telling it if
a previous backend looked at the heap and determined it was valid. Here
are the details:

Each heap tuple stores the creation and expire xid. To determine if a
tuple is visible, you have to check the clog to see if the recorded
transaction ids were committed, in progress, or aborted. When you do
the lookup the first time and the transaction isn't in progress, you can
update a bit to say that the tuple is visible or not. In fact we have
several tuple bits:

#define HEAP_XMIN_COMMITTED 0x0100 /* t_xmin committed */
#define HEAP_XMIN_INVALID 0x0200 /* t_xmin invalid/aborted */
#define HEAP_XMAX_COMMITTED 0x0400 /* t_xmax committed */
#define HEAP_XMAX_INVALID 0x0800 /* t_xmax invalid/aborted */

Once set they allow a later backend access to know the visiblity of the
row without having to re-check clog.

The big overhead in index lookups is having to check the heap row for
visibility. My idea is that once you check the visibility the first
time in the heap, why can't we set some bit in the index so that later
index lookups don't have to look up the heap anymore. Over time most
index entries would have bits set and you wouldn't need to recheck the
heap. (Oh, and you could only update the bit when all active
transactions are newer than the creation transaction so we know they
should all see it as visible.)

Now, this would work for telling us that the transaction that created
the index entry was committed or aborted. Over time most index entries
would have that status.

I think the problem with this idea is expiration. If a row is deleted
we never go and update the index pointing to that heap, so the creation
status isn't enough for us to know that the row is valid.

I can't think of a way to fix this. I think it is expensive to clear
the status bits on a row delete because finding an index row that goes
with a particular heap is quite expensive.

So, those are my ideas. If they could be made to work it would give us
most of the advantages of an index scan with _few_ heap lookups with
very little overhead.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-announce by date

  From Date Subject
Next Message Tom Lane 2005-01-13 04:54:52 Re: Much Ado About COUNT(*)
Previous Message Tom Lane 2005-01-13 03:50:57 Re: Much Ado About COUNT(*)

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-01-13 04:54:52 Re: Much Ado About COUNT(*)
Previous Message Tom Lane 2005-01-13 03:50:57 Re: Much Ado About COUNT(*)

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2005-01-13 04:54:52 Re: Much Ado About COUNT(*)
Previous Message Tom Lane 2005-01-13 03:50:57 Re: Much Ado About COUNT(*)