Suboptimal execution plan for simple query

From: "Markus Wollny" <Markus(dot)Wollny(at)computec(dot)de>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Suboptimal execution plan for simple query
Date: 2008-11-12 15:15:23
Message-ID: 28011CD60FB1724DBA4442E38277F6260AF987A3@hermes.computec.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi!

We've got a table with the following definition:

CREATE TABLE image_relation
(
id integer,
article_id integer NOT NULL,
entity_id integer NOT NULL,
image_id integer NOT NULL,
subline text,
"position" integer,
article_headline text,
entity_name text,
entity_type_id integer,
entity_source text,
default_pic character varying(3) NOT NULL,
last_updated timestamp without time zone NOT NULL DEFAULT now(),
CONSTRAINT pkey_image_relation PRIMARY KEY (article_id, entity_id, image_id)
)
WITH (OIDS=FALSE);

There are simple btree indexes on article_id, default_pic, entity_id, id, image_id, last_updated and position. The table has about 723,000 rows, stats say table size is 135MB, toast tables are 184MB and index size was at a whopping 727MB - so I thought I might do some additional maintenance. After reindexing, I got index size down to 131MB. This however did not affect the planner choices in any way, as they and the resulting execution times stayed the same before and after table maintenance (reindex and subsequent vacuum analyze). Our PostgreSQL version is 8.2.4 (I am going to move on to the latest and greatest 8.3 in about two weeks).

Now I've got this simple query

SELECT image_id
FROM image_relation
WHERE entity_id = 69560::integer
ORDER BY last_updated DESC
LIMIT 1;

which currently runs for something around 600ms. Here's the explain analyze output:

"Limit (cost=0.00..144.78 rows=1 width=12) (actual time=599.745..599.747 rows=1 loops=1)"
" -> Index Scan Backward using idx_image_relation_last_updated on image_relation (cost=0.00..39525.70 rows=273 width=12) (actual time=599.741..599.741 rows=1 loops=1)"
" Filter: (entity_id = 69560)"
"Total runtime: 599.825 ms"

SELECT image_id
FROM image_relation
WHERE entity_id = 69560::integer;

only returns three rows. So I wonder why the planner chooses to use the index on last_updated instead of the index on entity_id; I found out that I can get it to reconsider and make a wiser choice by adding some seemingly superfluous statement to the WHERE clause (notice the AND... bit):

SELECT image_id
FROM image_relation
WHERE entity_id = 69560
AND entity_id = entity_id
ORDER BY last_updated DESC
LIMIT 1

"Limit (cost=881.82..881.82 rows=1 width=12) (actual time=0.097..0.099 rows=1 loops=1)"
" -> Sort (cost=881.82..881.82 rows=1 width=12) (actual time=0.094..0.094 rows=1 loops=1)"
" Sort Key: last_updated"
" -> Index Scan using idx_image_relation_entity_id on image_relation (cost=0.00..881.81 rows=1 width=12) (actual time=0.063..0.075 rows=3 loops=1)"
" Index Cond: (entity_id = 69560)"
" Filter: (entity_id = entity_id)"
"Total runtime: 0.128 ms"

That's much more like it. The table is being vacuumed on a regular basis by both a nightly cron and the autovacuum daemon.

Any ideas on what's going wrong here?

Kind regards

Markus

Computec Media AG
Sitz der Gesellschaft und Registergericht: Fürth (HRB 8818)
Vorstandsmitglieder: Johannes S. Gözalan (Vorsitzender) und Rainer Rosenbusch
Vorsitzender des Aufsichtsrates: Jürg Marquard
Umsatzsteuer-Identifikationsnummer: DE 812 575 276

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2008-11-12 15:19:23 Re: Table bloat and vacuum
Previous Message Jack Orenstein 2008-11-12 15:13:29 Table bloat and vacuum