Skip site navigation (1) Skip section navigation (2)

Re: switchover between index and sequential scans

From: Abhijit Menon-Sen <ams(at)oryx(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: switchover between index and sequential scans
Date: 2008-07-03 11:45:50
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
Hi Greg.

At 2008-07-03 11:05:46 +0100, stark(at)enterprisedb(dot)com wrote:
> And increase the statistics target on header_fields and re-analyze?

Aha! Thanks for the tip. I just changed the default_statistics_target to
100 (from 10) and ANALYSEd header_fields and mailbox_messages, and now
it ALWAYS uses the index scan if I specify a LIMIT. That is,

    select count(*) from header_fields where message in
    (select message from mailbox_messages limit N)

always uses the index scan on header_fields_message_key, even when N is
equal to the number of rows in mailbox_messages (109410).

 Aggregate  (cost=30779.98..30779.99 rows=1 width=0) (actual time=175040.923..175040.926 rows=1 loops=1)
   ->  Nested Loop  (cost=3279.73..30760.93 rows=7617 width=0) (actual time=2114.426..169137.088 rows=1771029 loops=1)
         ->  HashAggregate  (cost=3279.73..3281.73 rows=200 width=4) (actual time=2076.662..2649.541 rows=109365 loops=1)
               ->  Limit  (cost=0.00..1912.10 rows=109410 width=4) (actual time=0.029..1386.128 rows=109410 loops=1)
                     ->  Seq Scan on mailbox_messages  (cost=0.00..1912.10 rows=109410 width=4) (actual time=0.022..744.190 rows=109410 loops=1)
         ->  Index Scan using header_fields_message_key on header_fields  (cost=0.00..136.92 rows=38 width=4) (actual time=0.678..1.416 rows=16 loops=109365)
               Index Cond: (header_fields.message = "outer".message)
 Total runtime: 175041.496 ms

Note the massive _under_estimation in the hash aggregate and the
nestloop. If I don't specify a limit, it'll use a seq scan again.

Very interesting.

-- ams

In response to

pgsql-performance by date

Next:From: PFCDate: 2008-07-03 12:56:27
Subject: Re: [QUESTION]Concurrent Access
Previous:From: Mokhtari AmineDate: 2008-07-03 10:17:47
Subject: cursor/Fetch mechanisms under postgreSQL

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group