Why dose the planner select one bad scan plan.

From: 静安寺 <asen_huang(at)qq(dot)com>
To: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Why dose the planner select one bad scan plan.
Date: 2010-11-10 09:37:29
Message-ID: tencent_0444078C7AA9F76A491E2841@qq.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-performance

I use the postgresql in default configuration and use inheritance way to create table.

My postgresql version is:

SELECT version();

version

--------------------------------------------------------------------------------

PostgreSQL 9.0.1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.5, 32-bit

(1 row)



Reboot the computer to avoid memory cache. And then get the following explain:

EXPLAIN ANALYZE SELECT authdomain,authuser,count(*),sum(SIZE) FROM tbltrafficlog WHERE (PROTOCOL in ('HTTP','HTTPS','FTP')) and (TIME >= '2010-10-01 00:00:00' AND TIME < '2010-11-01 00:00:00') GROUP BY authdomain,authuser order by count(*) DESC LIMIT 10 OFFSET 0;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Limit (cost=600830.83..600830.86 rows=10 width=19) (actual time=225034.470..225034.483 rows=10 loops=1)

-> Sort (cost=600830.83..600833.25 rows=968 width=19) (actual time=225034.469..225034.473 rows=10 loops=1)

Sort Key: (count(*))

Sort Method: top-N heapsort Memory: 17kB

-> HashAggregate (cost=600795.40..600809.92 rows=968 width=19) (actual time=225018.666..225019.522 rows=904 loops=1)

-> Append (cost=0.00..535281.08 rows=6551432 width=19) (actual time=4734.441..205514.878 rows=7776000 loops=1)

-> Seq Scan on tbltrafficlog (cost=0.00..11.50 rows=1 width=298) (actual time=0.001..0.001 rows=0 loops=1)

Filter: (("time" >= '2010-10-01 00:00:00'::timestamp without time zone) AND ("time" < '2010-11-01 00:00:00'::timestamp without time zone) AND ((protocol)::text = ANY ('{HTTP,HTTPS,FTP}'::text[])))

-> Bitmap Heap Scan on tbltrafficlog_20101001 tbltrafficlog (cost=4471.33..17819.25 rows=218129 width=19) (actual time=4734.437..6096.206 rows=259200 loops=1)

Recheck Cond: ((protocol)::text = ANY ('{HTTP,HTTPS,FTP}'::text[]))

Filter: (("time" >= '2010-10-01 00:00:00'::timestamp without time zone) AND ("time" < '2010-11-01 00:00:00'::timestamp without time zone))

-> Bitmap Index Scan on tbltrafficlog_20101001_protocol_idx (cost=0.00..4416.80 rows=218129 width=0) (actual time=4731.860..4731.860 rows=259200 loops=1)

Index Cond: ((protocol)::text = ANY ('{HTTP,HTTPS,FTP}'::text[]))



-> Bitmap Heap Scan on tbltrafficlog_20101030 tbltrafficlog (cost=4472.75..17824.12 rows=218313 width=19) (actual time=4685.536..6090.222 rows=259200 loops=1)

Recheck Cond: ((protocol)::text = ANY ('{HTTP,HTTPS,FTP}'::text[]))

Filter: (("time" >= '2010-10-01 00:00:00'::timestamp without time zone) AND ("time" < '2010-11-01 00:00:00'::timestamp without time zone))

-> Bitmap Index Scan on tbltrafficlog_20101030_protocol_idx (cost=0.00..4418.17 rows=218313 width=0) (actual time=4677.147..4677.147 rows=259200 loops=1)

Index Cond: ((protocol)::text = ANY ('{HTTP,HTTPS,FTP}'::text[]))

Total runtime: 225044.255 ms



Reboot the computer again. And then I close bitmap scan manually and get the following explain:

SET SET enable_bitmapscan TO off;

EXPLAIN ANALYZE SELECT authdomain,authuser,count(*),sum(SIZE) FROM tbltrafficlog WHERE (PROTOCOL in ('HTTP','HTTPS','FTP')) and (TIME >= '2010-10-01 00:00:00' AND TIME < '2010-11-01 00:00:00') GROUP BY authdomain,authuser order by count(*) DESC LIMIT 10 OFFSET 0;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=634901.26..634901.28 rows=10 width=19) (actual time=83805.465..83805.477 rows=10 loops=1)

-> Sort (cost=634901.26..634903.68 rows=968 width=19) (actual time=83805.463..83805.467 rows=10 loops=1)

Sort Key: (count(*))

Sort Method: top-N heapsort Memory: 17kB

-> HashAggregate (cost=634865.82..634880.34 rows=968 width=19) (actual time=83789.686..83790.540 rows=904 loops=1)

-> Append (cost=0.00..569351.50 rows=6551432 width=19) (actual time=0.010..64393.284 rows=7776000 loops=1)

-> Seq Scan on tbltrafficlog (cost=0.00..11.50 rows=1 width=298) (actual time=0.001..0.001 rows=0 loops=1)

Filter: (("time" >= '2010-10-01 00:00:00'::timestamp without time zone) AND ("time" < '2010-11-01 00:00:00'::timestamp without time zone) AND ((protocol)::text = ANY ('{HTTP,HTTPS,FTP}'::text[])))

-> Seq Scan on tbltrafficlog_20101001 tbltrafficlog (cost=0.00..18978.00 rows=218129 width=19) (actual time=0.008..1454.757 rows=259200 loops=1)

Filter: (("time" >= '2010-10-01 00:00:00'::timestamp without time zone) AND ("time" < '2010-11-01 00:00:00'::timestamp without time zone) AND ((protocol)::text = ANY ('{HTTP,HTTPS,FTP}'::text[])))



-> Seq Scan on tbltrafficlog_20101030 tbltrafficlog (cost=0.00..18978.00 rows=218313 width=19) (actual time=0.025..1483.817 rows=259200 loops=1)

Filter: (("time" >= '2010-10-01 00:00:00'::timestamp without time zone) AND ("time" < '2010-11-01 00:00:00'::timestamp without time zone) AND ((protocol)::text = ANY ('{HTTP,HTTPS,FTP}'::text[])))

Total runtime: 83813.808 ms



Okay, 225044.255ms VS 83813.808 ms, it obviously seems that the planner select one bad scan plan by default.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Jin 2010-11-10 10:08:33 BUG #5745: geometry bug?
Previous Message Arturas Mazeika 2010-11-10 06:37:16 Re: BUG #5735: pg_upgrade thinks that it did not start the old server

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Kirkwood 2010-11-10 09:42:39 Re: Array interface
Previous Message Mark Kirkwood 2010-11-10 09:10:39 Re: Array interface