Quick Links

Re: What is the right way to deal with a table with rows that are not in a random order?

From:	Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To:	Douglas Alan <darkwater42(at)gmail(dot)com>
Cc:	Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-general(at)postgresql(dot)org
Subject:	Re: What is the right way to deal with a table with rows that are not in a random order?
Date:	2009-05-28 14:24:32
Message-ID:	dcc563d10905280724g5b51b4fbl6ff626e3114147b7@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Wed, May 27, 2009 at 8:43 PM, Douglas Alan <darkwater42(at)gmail(dot)com> wrote:
> On Wed, May 27, 2009 at 8:54 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>
>>
>> If you're putting a LIMIT on it, why does it return millions of results?
>
> It doesn't return millions of results with the LIMIT on it. It just does a
> sequential scan of the table and doesn't find any results until it gets to
> the last quarter of the table. Sequentially scanning through 3/4 of the huge
> table before it gets a single match takes a very long time.
>
> As I said, in my original post, Postgres's approach would be completely
> reasonable in this case, if the rows that it was looking for were sprinkled
> randomly throughout the table. But they're not in this case -- they're all
> at the end.
>
>> Can you pick out an interesting query and give some specifics, like:
>> * the query
>> * the EXPLAIN ANALYZE output (or EXPLAIN without ANALYZE if it takes too
>> long to even run once)
>> * EXPLAIN ANALYZE output if you force the index scan
>> * the statistics for the relevant columns, such as histogram and
>> correlation
>
> As I mentioned, the situation is very simple, and easy to understand what is
> going on. There's absolutely no mystery as to why Postgres is doing what
> it's doing. 25% of the table matches the query. Postgres knows this due to
> the statistics histogram for the column. Postgres is deciding to do a
> sequential scan because it knows that 25% of the rows match the query.
>
> Unfortunately, in this case, that's a poor approach.

Postgres has a very smart query planner, that has a lot of knobs you
can fiddle with to change how and when it changes from one plan to
another. It's possible by changing some of those that pgsql will use
a different query plan that is more efficient. Seeing things like
explain analyze might help someone make a suggestion. OTOH, if you've
got it all sussed out, then ignore the request for more information.

In response to

Re: What is the right way to deal with a table with rows that are not in a random order? at 2009-05-28 02:43:20 from Douglas Alan

Responses

Re: What is the right way to deal with a table with rows that are not in a random order? at 2009-05-28 14:41:23 from Scott Marlowe
Re: What is the right way to deal with a table with rows that are not in a random order? at 2009-05-28 19:12:58 from Douglas Alan

Browse pgsql-general by date

	From	Date	Subject
Next Message	Scott Marlowe	2009-05-28 14:28:04	Re: ubuntu 9.04 and auto-start
Previous Message	Marcos Davi Reis	2009-05-28 14:12:12	Debugger install