Re: Index trouble with 8.3b4

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hannes Dorbath <light(at)theendofthetunnel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Index trouble with 8.3b4
Date: 2008-01-13 23:52:42
Message-ID: 10733.1200268362@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Oooh ... I can't be sure that this is what's biting you, but I
definitely see a bug that seems to match the symptoms. As the comments
in index.c point out, CREATE INDEX CONCURRENTLY works like this:

* validate_index() works by first gathering all the TIDs currently in the
* index, using a bulkdelete callback that just stores the TIDs and doesn't
* ever say "delete it". (This should be faster than a plain indexscan;
* also, not all index AMs support full-index indexscan.) Then we sort the
* TIDs, and finally scan the table doing a "merge join" against the TID list
* to see which tuples are missing from the index.

The scan is done using the regular heapscan code, which in 8.3 has been
modified to enable "synchronized scanning", which means it might start
from the middle of the table and wrap around. If that happens, the
"merge join" will get totally confused because it is expecting the
tuples to be returned in increasing ctid order. This will result in
misidentifying a bunch of TIDs as not being in the table, allowing
duplicate entries to be made in the index. And both of the misbehaviors
you originally showed can be explained by duplicate index entries
(actual or attempted).

Furthermore, the first duplicate TIDs to be entered will tend to be
low-numbered TIDs, which explains why you were consistently getting
GIN complaints about low-numbered TIDs, which I was having a hard time
thinking of a mechanism for otherwise.

I can now reproduce the failure: the trick is to get the syncscan start
pointer to not be on page zero. For example,

-- load up table
begin;
declare c cursor for select id from test;
fetch 10000 from c;
commit;
CREATE INDEX CONCURRENTLY cluster_test ON public.test USING gin (tsv);
ERROR: item pointer (0,1) already exists

I think it's okay for CREATE INDEX CONCURRENTLY to use bulk-read access
strategy (that is, seqscan using a limited number of buffers), but it
has to be able to force the scan to start at page zero. Right now,
heapam.c doesn't offer any API to control this, but we can certainly
add one.

I wonder whether there are any other places that are silently assuming
that heapscans start from page zero ...

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2008-01-14 00:15:21 Re: Index trouble with 8.3b4
Previous Message Satch Jones 2008-01-13 23:32:31 tsearch2 install on Fedora Core 5 problems

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-01-14 00:01:26 Re: Postgresql Materialized views
Previous Message Joshua D. Drake 2008-01-13 22:41:57 Re: Distinguishing autovacuum activity in pg_stat_activity