Re: Possible bug in vacuum redo

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
Cc: "Vadim Mikheev" <vmikheev(at)sectorbase(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Possible bug in vacuum redo
Date: 2001-12-22 16:13:34
Message-ID: 24087.1009037614@sss.pgh.pa.us
Lists: pgsql-hackers

"Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp> writes:
> AFAIR t_ctid isn't logged in WAL.

After looking at the heap_update code I think you are right. Doesn't
that render the field completely useless/unreliable?

In the simple heap_update case I think that heap_xlog_update could
easily set the old tuple's t_ctid field correctly. Not sure how
it works when VACUUM is moving tuple chains around, however.
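
To illustrate the simple heap_update case, here is a minimal sketch of a redo step that relinks the old tuple. It is a toy model and not the real heap_xlog_update: ToyTid, ToyTuple, ToyUpdateRecord and toy_redo_update_old are hypothetical names, and it assumes the update's WAL record carries the locations of both the old and the new tuple version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy tuple identifier (block, offset); stands in for a real TID. */
typedef struct { unsigned block; unsigned offset; } ToyTid;

/* Toy tuple header: own location plus forward link to the next version. */
typedef struct {
    ToyTid self;    /* where this version lives */
    ToyTid t_ctid;  /* forward link; equals self for the latest version */
} ToyTuple;

/* Hypothetical redo record for an update (assumed layout). */
typedef struct {
    ToyTid old_tid; /* location of the version that was updated */
    ToyTid new_tid; /* location of the version that replaced it */
} ToyUpdateRecord;

static bool tid_eq(ToyTid a, ToyTid b)
{
    return a.block == b.block && a.offset == b.offset;
}

/*
 * Replay the update onto the old version: point its t_ctid at the new
 * version. Returns false, changing nothing, if the record does not
 * describe this tuple.
 */
static bool toy_redo_update_old(ToyTuple *old_tup, const ToyUpdateRecord *rec)
{
    assert(old_tup != NULL && rec != NULL);
    if (!tid_eq(old_tup->self, rec->old_tid))
        return false;
    old_tup->t_ctid = rec->new_tid;
    return true;
}
```

Under that assumption the redo side needs no extra logged field, since the forward link is implied by the record's new-tuple location. The sketch says nothing about the VACUUM chain-move case.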

Another thing I am currently looking at is that I do not believe VACUUM
handles tuple chain moves correctly. It only enters the chain-moving
logic if it finds a tuple that is in the *middle* of an update chain,
i.e., both the prior and next tuples still exist. In the case of a
two-element update chain (only the latest and next-to-latest tuples of
a row survive VACUUM), AFAICT vacuum will happily move the latest tuple
without ever updating the previous tuple's t_ctid.
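
The two-element chain scenario described above can be sketched with a toy heap. This is not VACUUM's code: all Toy* types and toy_* functions are hypothetical, and the model assumes the latest version of a row is the one whose t_ctid points at itself. It only shows how moving the latest tuple without touching its predecessor leaves a dangling forward link.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy tuple identifier (block, offset); stands in for a real TID. */
typedef struct { unsigned block; unsigned offset; } ToyTid;

/* Toy heap tuple: own location plus forward link to the next version. */
typedef struct {
    ToyTid self;    /* where this version lives */
    ToyTid t_ctid;  /* forward link; equals self for the latest version */
    bool   live;    /* false once the slot has been vacated */
} ToyTuple;

static bool tid_eq(ToyTid a, ToyTid b)
{
    return a.block == b.block && a.offset == b.offset;
}

/* Find the live tuple stored at tid, or NULL if there is none. */
static ToyTuple *toy_fetch(ToyTuple *heap, size_t n, ToyTid tid)
{
    for (size_t i = 0; i < n; i++)
        if (heap[i].live && tid_eq(heap[i].self, tid))
            return &heap[i];
    return NULL;
}

/* Follow t_ctid links to the latest version; NULL if a link dangles. */
static ToyTuple *toy_follow_chain(ToyTuple *heap, size_t n, ToyTid start)
{
    ToyTuple *tup = toy_fetch(heap, n, start);
    for (size_t steps = 0; tup != NULL && steps < n; steps++) {
        if (tid_eq(tup->t_ctid, tup->self))
            return tup;
        tup = toy_fetch(heap, n, tup->t_ctid);
    }
    return NULL;
}

/* Suspected behavior: relocate the tuple, leave its predecessor alone. */
static void toy_move_without_fixup(ToyTuple *tup, ToyTid dest)
{
    bool latest = tid_eq(tup->t_ctid, tup->self);
    tup->self = dest;
    if (latest)
        tup->t_ctid = dest;   /* keep the self-link of the latest version */
}

/* Same move, but also repoint any tuple that linked to the old location. */
static void toy_move_with_fixup(ToyTuple *heap, size_t n,
                                ToyTuple *tup, ToyTid dest)
{
    ToyTid old = tup->self;
    toy_move_without_fixup(tup, dest);
    for (size_t i = 0; i < n; i++)
        if (&heap[i] != tup && heap[i].live && tid_eq(heap[i].t_ctid, old))
            heap[i].t_ctid = dest;
}
```

In this model, a reader that starts at the older version and calls toy_follow_chain after toy_move_without_fixup gets NULL, which is the toy analogue of failing to find the row's newest version. After toy_move_with_fixup the same walk reaches the moved tuple.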

In short, t_ctid seems extremely unreliable. I have been trying to work
out a way that a bad t_ctid link could lead to the duplicate-tuple
reports we've been hearing lately, but so far I haven't seen one. I do
think it can lead to missed UPDATEs in read-committed mode, however.

regards, tom lane
