Optimizer degradation since 8.0

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Optimizer degradation since 8.0
Date: 2006-08-21 12:11:54
Message-ID: 44E9A30A.8080305@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Table with one million rows:

wow=# \d _document83
Table "public._document83"
Column | Type | Modifiers
------------+-----------------------------+-----------
_idrref | bytea | not null
_marked | boolean | not null
_date_time | timestamp without time zone | not null
_number | character(10) | not null
_posted | boolean | not null
Indexes:
"_document83ng_pkey" PRIMARY KEY, btree (_idrref)
"_document83_bydocdate_tr" btree (_date_time, _idrref)
"qq" btree (_date_time)

Query:
SELECT
_Date_Time,
_IDRRef,
FROM
_Document83
WHERE
_Date_Time = '2006-06-21 11:24:56'::timestamp AND
_IDRRef > '\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea
OR _Date_Time > '2006-06-21 11:24:56'::timestamp
LIMIT 20;

Explain analyze in postgres 8.0:
Limit (cost=0.00..0.63 rows=20 width=44) (actual time=0.250..0.485 rows=20
loops=1)
-> Index Scan using qq, qq on _document83 (cost=0.00..6679.90 rows=211427
width=44) (actual time=0.238..0.416 rows=20 loops=1)
Index Cond: ((_date_time = '2006-06-21 11:24:56'::timestamp without
time zone) OR (_date_time > '2006-06-21 11:24:56'::timestamp without time zone))
Filter: (((_date_time = '2006-06-21 11:24:56'::timestamp without time
zone) AND (_idrref > '\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea)) OR
(_date_time > '2006-06-21 11:24:56'::timestamp without time zone))
Total runtime: 2.313 ms

Explain analyze in postgres 8.1:
Limit (cost=0.00..2.82 rows=20 width=44) (actual time=1448.897..1610.386
rows=20 loops=1)
-> Seq Scan on _document83 (cost=0.00..29729.04 rows=210782 width=44)
(actual time=1448.889..1610.314 rows=20 loops=1)
Filter: ((_date_time > '2006-06-21 11:24:56'::timestamp without time
zone) OR ((_date_time = '2006-06-21 11:24:56'::timestamp without time zone) AND
(_idrref > E'\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea)))
Total runtime: 1610.524 ms
(4 rows)

With ENABLE_SEQSCAN=OFF:
Limit (cost=1319.83..1321.23 rows=20 width=44) (actual time=193.261..193.382
rows=20 loops=1)
-> Bitmap Heap Scan on _document83 (cost=1319.83..16029.62 rows=210782
width=44) (actual time=193.253..193.314 rows=20 loops=1)
Recheck Cond: (((_date_time = '2006-06-21 11:24:56'::timestamp without
time zone) AND (_idrref > E'\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea))
OR (_date_time > '2006-06-21 11:24:56'::timestamp without time zone))
-> BitmapOr (cost=1319.83..1319.83 rows=210788 width=0) (actual
time=191.203..191.203 rows=0 loops=1)
-> Bitmap Index Scan on _document83_bydocdate_tr
(cost=0.00..2.18 rows=30 width=0) (actual time=2.470..2.470 rows=43 loops=1)
Index Cond: ((_date_time = '2006-06-21
11:24:56'::timestamp without time zone) AND (_idrref >
E'\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea))
-> Bitmap Index Scan on qq (cost=0.00..1317.65 rows=210758
width=0) (actual time=188.720..188.720 rows=275800 loops=1)
Index Cond: (_date_time > '2006-06-21 11:24:56'::timestamp
without time zone)
Total runtime: 193.872 ms

So, 8.0 is better at least in 100 times. Expected number of rows is close to
real value ( ~270000 ). Rewriting query with UNION ALL makes better performance
(about 1 ms):
Limit (cost=0.00..0.73 rows=20 width=44) (actual time=0.654..0.851 rows=20
loops=1)
-> Append (cost=0.00..7712.53 rows=210788 width=44) (actual
time=0.648..0.791 rows=20 loops=1)
-> Index Scan using qq on _document83 (cost=0.00..6.42 rows=30
width=44) (actual time=0.645..0.733 rows=20 loops=1)
Index Cond: (_date_time = '2006-06-21 11:24:56'::timestamp
without time zone)
Filter: (_idrref >
E'\\234?PPTP00\\021\\333\\000\\367\\004:v:'::bytea)
-> Index Scan using qq on _document83 (cost=0.00..5598.23
rows=210758 width=44) (never executed)
Index Cond: (_date_time > '2006-06-21 11:24:56'::timestamp
without time zone)
Total runtime: 1.059 ms
But it's not always possible to rewrite automatically generated query...

After adding 'order by', postgres uses index but plan becomes the same as
before, with seqscan=off.

Can I tweak something in 8.1 or it's a bug?

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2006-08-21 12:46:47 Re: Replication
Previous Message Magnus Hagander 2006-08-21 11:26:11 Re: Bug with initDB under windows 2003