CLUSTER and synchronized scans and pg_dump et al

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: CLUSTER and synchronized scans and pg_dump et al
Date: 2008-01-27 15:02:17
Message-ID: 87odb7s45i.fsf@oxford.xeocode.com
Lists: pgsql-hackers


It occurred to me the other day that synchronized scans could play havoc with
clustered tables. When you dump and reload a table, even one that was recently
clustered, the dump could shuffle the records out of order if any other
sequential scans of that table are in progress at the time you dump it.

Now the records would still be effectively ordered for most purposes, but our
statistics can't detect that. Since the measured correlation would be poor, the
restored database would have markedly different statistics, showing virtually
no correlation on the clustered column.
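
To make the effect concrete, here is a rough Python sketch, not anything taken
from the PostgreSQL source. It assumes a synchronized scan can be modelled as a
scan that starts at some arbitrary row and wraps around to the beginning, and
it uses a plain Pearson correlation between physical position and key value as
a stand-in for what ANALYZE would report. The names dump_order and correlation
are made up for illustration.

```python
def dump_order(keys, start):
    """Rows as returned by a scan that begins at index `start` and wraps around.

    Simplified model of a synchronized scan joining another scan mid-table.
    """
    return keys[start:] + keys[:start]


def correlation(values):
    """Pearson correlation between physical position (0..n-1) and value.

    Stand-in for the planner's correlation statistic; the real one is
    computed by ANALYZE from a sample, which this sketch does not model.
    """
    n = len(values)
    mean_p = (n - 1) / 2
    mean_v = sum(values) / n
    cov = sum((p - mean_p) * (v - mean_v) for p, v in enumerate(values))
    var_p = sum((p - mean_p) ** 2 for p in range(n))
    var_v = sum((v - mean_v) ** 2 for v in values)
    return cov / (var_p * var_v) ** 0.5


# A freshly clustered table: physical order matches key order exactly.
clustered = list(range(1000))

# Dump with the scan starting at various points, then "reload" in dump order.
for start in (0, 100, 250, 500):
    reloaded = dump_order(clustered, start)
    print(start, round(correlation(reloaded), 3))
```

With a start of 0 the reloaded table keeps a correlation of 1.0. With any other
start point the reloaded table is still just two sorted runs, yet the computed
correlation falls well below 1, which is the discrepancy described above.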

Perhaps we should have some form of escape hatch for pg_dump to request real
physical order when dumping clustered tables.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!
