Optimizer confusion?

From: Philip Warner <pjw(at)rhyme(dot)com(dot)au>
To: pgsql-hackers(at)hub(dot)org, pgsql-general(at)hub(dot)org
Subject: Optimizer confusion?
Date: 2000-08-12 05:27:26
Message-ID: 3.0.5.32.20000812152726.01f51210@mail.rhyme.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers


I have a table for which the SEQSCAN and INDEXSCAN estimates are the same
up to a point, after which the SEQSCAN estimates remain fixed, and the
indexscan estimates continue to grow. However, the actual speed of the
index scan is superior for a much greater period than the optimizer predicts.

The database has a table 'ping' with various fields including a 'pingtime
timestamp'; it also has a btree indexe on the date and has been 'vacuum
analyze'-ed. There are about 200000 rows and the data is evenly distributed
in 5 minute intervals.

These are the results from 'explain':

------ 1 ----
uptime=# explain select * from ping where pingtime>'1-aug-1999' and
pingtime<'1-aug-1999';
NOTICE: QUERY PLAN:

Index Scan using ping_ix1 on ping (cost=0.00..4.28 rows=1 width=52)
------

This seems fine, even if the query is bogus.

------ 2 ----
uptime=# explain select * from ping where pingtime>'1-aug-1999' and
pingtime<'2-aug-1999';
NOTICE: QUERY PLAN:

Index Scan using ping_ix1 on ping (cost=0.00..1679.29 rows=561 width=52)
------

Also looks OK.

------ 3 ----
uptime=# explain select * from ping where pingtime>'1-aug-1999' and
pingtime<'3-aug-1999';
NOTICE: QUERY PLAN:

Index Scan using ping_ix1 on ping (cost=0.00..3091.18 rows=1123 width=52)
------

This seems OK; the estimate is roughly double the previous, which is to be
expected, I think.

------- 5 ----
uptime=# explain select * from ping where pingtime>'1-aug-1999' and
pingtime<'5-aug-1999';
NOTICE: QUERY PLAN:

Index Scan using ping_ix1 on ping (cost=0.00..5386.70 rows=2245 width=52)
------

Again. this is OK, although I am a little surprised at the continuing
non-linearity of the estimates.

Now it starts getting very strange:

------- 5+a bit ----
uptime=# explain select * from ping where pingtime>'1-aug-1999' and
pingtime<'5-aug-1999 20:25';
NOTICE: QUERY PLAN:

Seq Scan on ping (cost=0.00..6208.68 rows=2723 width=52)
-------

OK so far, but look at the following (the costs are the same):

------- 3 Months ----
uptime=# explain select * from ping where pingtime>'1-aug-1999' and
pingtime<'1-nov-1999';
NOTICE: QUERY PLAN:

Seq Scan on ping (cost=0.00..6208.68 rows=51623 width=52)
-------

and

------- 5 + a YEAR ----
uptime=# explain select * from ping where pingtime>'1-aug-1999' and
pingtime<'5-aug-2000 20:25';
NOTICE: QUERY PLAN:

Seq Scan on ping (cost=0.00..6208.68 rows=208184 width=52)
------

Now what is also strange, is if I set ENABLE_SEQSCAN=OFF, then the
estimates up to '5+a bit' are the *same*, but the running time is
substantially better for index scan. In fact the running time is better for
index scans up to an interval of about three months. I presume there is
something wrong with the selectivify estimates for the index.

I really don't want to have the code call 'SET ENABLE_SEQSCAN=OFF/ON'
around this statement, since for a longer period, I do want a sequential
scan. And building my own 'query optimizer' which says 'if time diff > 3
months, then enable seqscan' seems like a very bad idea.

I would be interested to know (a) if there is any way I can influence the
optimizer choice when it considers using the index in question, and (b) if
the fixed seqscan cost estimate is a bug.

FWIW, the output of a 3 month period with ENABLE_SEQSCAN=OFF is:

-----
uptime=# set enable_seqscan=off;
uptime=# explain select * from ping where pingtime>'1-aug-1999' and
pingtime<'1-nov-1999';
NOTICE: QUERY PLAN:

Index Scan using ping_ix1 on ping (cost=0.00..27661.01 rows=51623 width=52)
-----

Any help, explanation, etc would be appreciated.

----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 0500 83 82 82 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp5.ai.mit.edu:11371 |/

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2000-08-12 06:15:07 Re: [HACKERS] Optimizer confusion?
Previous Message XWorkers 2000-08-12 01:00:58 turning of referential integrity

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2000-08-12 06:15:07 Re: [HACKERS] Optimizer confusion?
Previous Message Tom Lane 2000-08-12 00:35:03 Re: problem with float8 input format