planner with index scan cost way off actual cost, advices to tweak cost constants?

From: Guillaume Cottenceau <gc(at)mnc(dot)ch>
To: pgsql-performance(at)postgresql(dot)org
Subject: planner with index scan cost way off actual cost, advices to tweak cost constants?
Date: 2006-03-17 10:09:50
Message-ID: 87ek11gz69.fsf@meuh.mnc.lan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I have a problem with the postgres planner, which gives a cost to
index scan which is much higher than actual cost (worst case
considered, e.g. without any previous disk cache), and am posting
here for advices for tweaking cost constants. Because of this
problem, the planner typically chooses a seq scan when an index
scan would be more efficient, and I would like to correct this if
possible.

Reading the documentation and postgresql list archives, I have
run ANALYZE right before my tests, I have increased the
statistics target to 50 for the considered table; my problem is
that the index scan cost reported by EXPLAIN seems to be around
12.7 times higher that it should, a figure I suppose incompatible
(too large) for just random_page_cost and effective_cache_size
tweaks.

Structure of the table:

\d sent_messages
Table "public.sent_messages"
Column | Type | Modifiers
----------+--------------------------+----------------------------------------------------------------
uid | integer | not null default nextval('public.sent_messages_uid_seq'::text)
sender | character varying(25) |
receiver | character varying(25) |
action | character varying(25) |
cost | integer |
date | timestamp with time zone | not null default ('now'::text)::timestamp(6) with time zone
status | character varying(128) |
theme | character varying(25) |
operator | character varying(15) |
Indexes:
"sent_messages_pkey" primary key, btree (uid)
"idx_sent_msgs_date_theme_status" btree (date, theme, status)

What I did:

- SET default_statistics_target = 50

- VACUUM FULL ANALYZE VERBOSE sent_messages - copied so that you
can have a look at rows and pages taken up by relations

INFO: vacuuming "public.sent_messages"
INFO: "sent_messages": found 0 removable, 3692284 nonremovable row versions in 55207 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 103 to 177 bytes long.
There were 150468 unused item pointers.
Total free space (including removable row versions) is 2507320 bytes.
0 pages are or will become empty, including 0 at the end of the table.
2469 pages containing 262732 free bytes are potential move destinations.
CPU 0.57s/0.20u sec elapsed 11.27 sec.
INFO: index "sent_messages_pkey" now contains 3692284 row versions in 57473 pages
DETAIL: 0 index row versions were removed.
318 index pages have been deleted, 318 are currently reusable.
CPU 2.80s/1.27u sec elapsed 112.69 sec.
INFO: index "idx_sent_msgs_date_theme_status" now contains 3692284 row versions in 88057 pages
DETAIL: 0 index row versions were removed.
979 index pages have been deleted, 979 are currently reusable.
CPU 4.22s/1.51u sec elapsed 246.88 sec.
INFO: "sent_messages": moved 0 row versions, truncated 55207 to 55207 pages
DETAIL: CPU 1.87s/3.18u sec elapsed 42.71 sec.
INFO: vacuuming "pg_toast.pg_toast_77852470"
INFO: "pg_toast_77852470": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 0 to 0 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 0 bytes.
0 pages are or will become empty, including 0 at the end of the table.
0 pages containing 0 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "pg_toast_77852470_index" now contains 0 row versions in 1 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO: analyzing "public.sent_messages"
INFO: "sent_messages": 55207 pages, 15000 rows sampled, 3666236 estimated total rows

- select rows of the table with a range condition on "date", find
a range for which seq scan and index scan runtimes seem to be
very close (I use Linux, I cat a 2G file to /dev/null between
each request to flush disk cache, on a machine of 1G real RAM
and 1G of swap, so that this is the worst case tested for index
scan), notice that the cost used by the planner is 12.67 times
higher for index scan, at a position it should be around 1 so
that planner could make sensible choices:

EXPLAIN ANALYZE SELECT * FROM sent_messages WHERE date > '2005-09-01' AND date < '2005-09-19';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on sent_messages (cost=0.00..110591.26 rows=392066 width=78) (actual time=7513.205..13095.147 rows=393074 loops=1)
Filter: ((date > '2005-09-01 00:00:00+00'::timestamp with time zone) AND (date < '2005-09-19 00:00:00+00'::timestamp with time zone))
Total runtime: 14272.522 ms

SET enable_seqscan = false

EXPLAIN ANALYZE SELECT * FROM sent_messages WHERE date > '2005-09-01' AND date < '2005-09-19';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_sent_msgs_date_theme_status on sent_messages (cost=0.00..1402124.26 rows=392066 width=78) (actual time=142.638..12677.378 rows=393074 loops=1)
Index Cond: ((date > '2005-09-01 00:00:00+00'::timestamp with time zone) AND (date < '2005-09-19 00:00:00+00'::timestamp with time zone))
Total runtime: 13846.504 ms

Please notice that an index on the "date" column only would be
much more efficient for the considered request (and I have
confirmed this by creating and trying it), but I don't
necessarily would need this index if the existing index was
used. Of course real queries use smaller date ranges.

- I then tried to tweak random_page_cost and effective_cache_size
following advices from documentation:

SET random_page_cost = 2;
SET effective_cache_size = 10000;
EXPLAIN SELECT * FROM sent_messages WHERE date > '2005-09-01' AND date < '2005-09-19';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_sent_msgs_date_theme_status on sent_messages (cost=0.00..595894.94 rows=392066 width=78)
Index Cond: ((date > '2005-09-01 00:00:00+00'::timestamp with time zone) AND (date < '2005-09-19 00:00:00+00'::timestamp with time zone))

We can see that estimated index scan cost goes down but by a
factor of approx. 2.3 which is far from enough to "fix" it. I
am reluctant in changing way more the random_page_cost and
effective_cache_size values as I'm suspecting it might have
other (bad) consequences if it is too far away from reality
(even if Linux is known to aggressively cache), the application
being multithreaded (there is a warning about concurrent
queries using different indexes in documentation). But I
certainly could benefit from others' experience on this matter.

I apologize for this long email but I wanted to be sure I gave
enough information on the data and things I have tried to fix the
problem myself. If anyone can see what I am doing wrong, I would
be very interested in pointers.

Thanks in advance!

Btw, I use postgres 7.4.5 with -B 1000 -N 500 and all
postgresql.conf default values except timezone = 'UTC', on an
ext3 partition with data=ordered, and run Linux 2.6.12.

--
Guillaume Cottenceau

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Evgeny Gridasov 2006-03-17 12:24:48 Re: Background writer configuration
Previous Message Luke Lonergan 2006-03-17 07:24:00 Re: 1 TB of memory