Re: problems with table corruption continued

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Brian Hirt" <bhirt(at)mobygames(dot)com>
Cc: Vadim Mikheev <vmikheev(at)sectorbase(dot)com>, "Postgres Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Brian A Hirt" <bhirt(at)berkhirt(dot)com>
Subject: Re: problems with table corruption continued
Date: 2001-12-18 18:48:57
Message-ID: 17433.1008701337@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Brian Hirt" <bhirt(at)mobygames(dot)com> writes:
> [ example case that creates a duplicate tuple ]

Got it. The problem arises in this section of vacuum.c, near the bottom
of repair_frag:

if (!(tuple.t_data->t_infomask & HEAP_XMIN_COMMITTED))
{
if ((TransactionId) tuple.t_data->t_cmin != myXID)
elog(ERROR, "Invalid XID in t_cmin (3)");
if (tuple.t_data->t_infomask & HEAP_MOVED_OFF)
{
itemid->lp_flags &= ~LP_USED;
num_tuples++;
}
else
elog(ERROR, "HEAP_MOVED_OFF was expected (2)");
}

This is trying to get rid of the original copy of a tuple that's been
moved to another page. The problem is that your index function causes a
table scan, which means that by the time control gets here, someone else
has looked at this tuple and marked it good --- so the initial test of
HEAP_XMIN_COMMITTED fails, and the tuple is never removed!

I would say that it's incorrect for vacuum.c to assume that
HEAP_XMIN_COMMITTED can't become set on HEAP_MOVED_OFF/HEAP_MOVED_IN
tuples during the course of vacuum's processing; after all, the xmin
definitely does refer to a committed xact, and we can't realistically
assume that we know what processing will be induced by user-defined
index functions. Vadim, what do you think? How should we fix this?

In the meantime, Brian, I think you ought to get rid of your index
CREATE INDEX "developer_aka_search_idx" on "developer_aka" using btree ( developer_aka_search_name ("developer_aka_id") "varchar_ops" );
I do not think this index is well-defined: an index function ought
to have the property that its output depends solely on its input and
cannot change over time. This function cannot make that claim.
(In 7.2, you can't even create the index unless you mark the function
iscachable, which is really a lie.) I'm not even real sure what you
expect the index to do for you...

I do not know whether this effect explains all the reports of duplicate
tuples we've seen in the last few weeks. We need to ask whether the
other complainants were using index functions that tried to do table
scans.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jason Earl 2001-12-18 18:53:04 Re: Concerns about this release
Previous Message Brian Hirt 2001-12-18 18:07:26 Re: problems with table corruption continued