Re: VACUUM/t_ctid bug (was Re: GiST concurrency commited)

From: Mario Weilguni <mweilguni(at)sime(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: VACUUM/t_ctid bug (was Re: GiST concurrency commited)
Date: 2005-08-30 10:51:44
Message-ID: 200508301251.45018.mweilguni@sime.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Am Dienstag, 30. August 2005 12:19 schrieb Teodor Sigaev:
> > Since 7.4 we have troubles with ltree (seldom corruption of buffer cache,
> > not on-disk), might this bug be somehow related to the ltree problem? 7.2
> > was rock-stable with ltree.
>
> Not sure. Fixed bug was (@ - contains operation):
>
> update wow set a = a || '{101}'::int[] where a @ '{1,2,3}';
> select a from wow where a @ '{1,2,3}' and not a @ '{101}';
>
> After update query select must not find any rows, but it did. The problem
> was in GiST code and so any GiST idexes was affected.
>
> Can you say more about your trouble?

We have queries that use ltree for sorting too, the sort looks like this:
order by subpath(ltreefield, 0, nlevel(ltreefield) - 1)

But concurrency leads to a bug, that results in an sql-error:
ERROR: invalid positions

Now we use locking to prevent concurrenct access on the most-used concurrent
part of the program, and the problem is extremly rare now, I had only 4
occurences in one year, but still happens (there are other access paths that
do not use locking, but they are rareley accessed).

It seems the ltree length parameter is set to 0 in the tuples, the content
itself is still there: Example:
Say the tuple was before treefield='1.2.3.4.5'
After the occurence of the error, I get: treefield='' (empty, but not null)

Using a tool Tom Lane told me to use, I checked it, and on-disk I had still
"1.2.3.4.5", but the length parameter of the ltree column was 0 (sorry, I was
wrong in my first mail, on-disk was broken too.)

Might this be somehow related to the intarray bugs?

Best regards,
Mario Weilguni

p.s.: I tried hard to create a self-contained test for tracking this down, but
failed.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2005-08-30 11:26:36 Re: SHMMAX seems entirely broken in OS X 10.4.2
Previous Message Teodor Sigaev 2005-08-30 10:19:38 Re: VACUUM/t_ctid bug (was Re: GiST concurrency commited)