Skip site navigation (1) Skip section navigation (2)

CLUSTER loses nulls (was Re: [ADMIN] Still a bug in the VACUUM)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: a(dot)schmitz(at)cityweb(dot)de
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>,Teodor Sigaev <teodor(at)stack(dot)net>, pgsql-hackers(at)postgreSQL(dot)org
Subject: CLUSTER loses nulls (was Re: [ADMIN] Still a bug in the VACUUM)
Date: 2003-02-28 23:54:13
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-adminpgsql-hackers
I said:
> I was able to reproduce a problem as follows: run the tsearch regression
> test, then do "cluster wowidx on test_txtidx".  This appears to lose
> one row:

Ahh ... it took me way too long to realize what was happening.  The
problem is simply that GiST indexes do not index nulls (at least not in
the first column of an index).  So if you CLUSTER, you lose any rows
that contain NULLs in the indexed column --- they're not in the index,
so they're not seen by the indexscan that copies the data over to the
new table.

Having CLUSTER lose data is obviously not acceptable :-(.  I can see two
possible solutions:

* Make CLUSTER error out if the target index is not of an 'amindexnulls'
index AM.  This would amount to restricting CLUSTER to b-trees, which is

* If the index is not amindexnulls and the first target column is not
marked attnotnull, make an extra seqscan pass over the source table to
look for rows containing nulls.  Copy these rows separately.  This would
work but adds a good deal of overhead.

Approach #2 is even worse for functional indexes --- attnotnull is not
helpful.  We'd have to actually evaluate the function at every single
row to see if it yields NULL there.  Yech.

It occurs to me also that the same kind of pitfall exists for partial
indexes: cluster on a partial index, you lose.  However, I don't have
a problem with simply refusing to cluster on partial indexes.

Comments?  Anyone want to do the legwork to fix this?

			regards, tom lane

In response to

pgsql-hackers by date

Next:From: Rod TaylorDate: 2003-03-01 05:11:37
Subject: Re: Foreign key quandries
Previous:From: Oliver ElphickDate: 2003-02-28 23:14:08
Subject: Re: CHECK constraints in pg_dump

pgsql-admin by date

Next:From: Bruce YoungDate: 2003-03-01 00:07:38
Subject: Hiding passwords in PHP files
Previous:From: Robert TreatDate: 2003-02-28 22:17:28
Subject: Re: Cross-database references

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group