We have a very large table (150 million rows) whose rows are not in a
random order. Some common queries will have millions of results, which of
course makes them slow. For an interactive interface to the database, we
want to put a LIMIT on all queries so that they return quickly even when
there are millions of matching rows.
The problem we are seeing at the moment is that the statistics histograms
are actually often hurting us. Postgres sees that for some queries there
are so many results that it decides to do a sequential scan, rather than
using the index. Unfortunately, sometimes all of these millions of results
are at the end of the table, rather than being randomly distributed in the
table, so a sequential scan is the worst possible approach.
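For illustration only (the table and column names here are placeholders, not our actual schema), this is the kind of plan choice we are seeing, which can be inspected with EXPLAIN:

```sql
-- Hypothetical example: the planner estimates that many rows match,
-- so it picks a sequential scan even though a LIMIT is present.
-- If the matching rows happen to sit at the physical end of the table,
-- the scan reads almost the whole table before returning anything.
EXPLAIN
SELECT *
FROM big_table          -- placeholder name
WHERE category = 42     -- indexed column, placeholder name
LIMIT 100;
```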
To fix this, we could use an "order by" clause to force Postgres to use the
index, but this isn't easy: we are using Postgres through an ORM (Django,
in this case), and jury-rigging the generated SQL this way on a
per-query basis is awkward.
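As a sketch of the workaround (again with placeholder names), the query we would want the ORM to emit adds an ORDER BY on an indexed column, so the planner can walk the index in order and stop after the LIMIT is satisfied:

```sql
-- With ORDER BY on an indexed column plus a LIMIT, the planner can
-- read the index in order and stop after 100 rows, instead of
-- sequentially scanning the whole table.
SELECT *
FROM big_table              -- placeholder name
WHERE category = 42
ORDER BY created_at         -- placeholder for some indexed column
LIMIT 100;
```

In Django this would roughly correspond to appending .order_by('created_at') before slicing the queryset, but doing that selectively per query is the part that is hard to retrofit.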
Alternatively, we could randomize the physical ordering of the table rows,
but that seems like an awfully big hammer, and it would also prevent us from
clustering the table on a different index, one distinct from the index in
question but possibly correlated with it.
Is there a way of telling Postgres not to assume that the table rows are in
a random order, so that it won't make the mistake of doing a sequential
scan on a huge table when there is a small LIMIT?
pgsql-general by date