Reproducible vacuum complaint!

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Reproducible vacuum complaint!
Date: 1999-11-21 23:00:29
Message-ID: 10168.943225229@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I have devised a simple manual way of reproducing that peculiar VACUUM
notice that Oleg has been complaining about, but didn't have a reliable
way of triggering on-demand. It looks like it is caused by some sort of
bug in the transaction commit logic --- or maybe just VACUUM's piece of
it, but anyway there is something mucho bad going on here.

Setup:

create table hits (msg_id int, nhits int);
create index hits_pkey on hits(msg_id);
insert into hits values(42,0);
insert into hits values(43,0);

Given this setup, you can do

drop index hits_pkey;
update hits set nhits = nhits+1 where msg_id = 42;
create index hits_pkey on hits(msg_id);
vacuum analyze hits;

all day with no problem.

BUT: start up another psql, and in that other psql begin a transaction
block and touch anything at all --- doesn't have to be the table under
test:

begin;
select * from int4_tbl;

Now, *without committing* that other transaction, go back to the first
psql and try again:

drop index hits_pkey;
update hits set nhits = nhits+1 where msg_id = 42;
create index hits_pkey on hits(msg_id);
vacuum analyze hits;
NOTICE: Index hits_pkey: NUMBER OF INDEX' TUPLES (2) IS NOT THE SAME AS HEAP' (3).
Try recreating the index.

You can repeat the vacuum (with or without analyze) as often as you want
and you'll get the same notice each time. If you do more UPDATEs, the
reported number of heap tuples increases --- rather odd, considering
there are obviously only two committed tuples in the table (as can be
confirmed by a SELECT).

As soon as you commit or abort the other transaction, everything goes
back to normal.

There are variants of this sequence that also cause the problem. The
critical factor seems to be that both the index itself and at least one
tuple in the table have to be younger than the oldest uncommitted
transaction.

At this point I decided that I was in over my head, so I'm tossing the
whole mess in Vadim's direction. I can't tell whether VACUUM itself
is confused or the transaction logic in general is, but it sure looks
like something is looking at the wrong xact to decide whether tuples
have been committed or not. This could be a symptom of a fairly serious
logic error down inside tuple time qual checks...

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hiroshi Inoue 1999-11-22 00:29:17 RE: [HACKERS] Reproducible vacuum complaint!
Previous Message Tom Lane 1999-11-21 19:47:06 locking needed for parsing & planning