Re: Query planner plans very inefficient plans

From: Sean Chittenden <sean(at)chittenden(dot)org>
To: Robert Wille <a2om6sy02(at)sneakemail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query planner plans very inefficient plans
Date: 2003-06-30 21:13:36
Message-ID: 20030630211336.GF96753@perrin.int.nxad.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> I have somewhere around 3M rows in the image table, and 37K rows in the
> ancestry table. The following is representative of some of the common
> queries I issue:
>
> select * from image natural join ancestry where ancestorid=1000000 and
> (state & 7::bigint) = 0::bigint;
>
> When I ask postgres to EXPLAIN it, I get the following:
>
> Merge Join (cost=81858.22..81900.60 rows=124 width=49)
> -> Sort (cost=81693.15..81693.15 rows=16288 width=41)
> -> Seq Scan on image (cost=0.00..80279.17 rows=16288 width=41)
> -> Sort (cost=165.06..165.06 rows=45 width=8)
> -> Index Scan using ancestry_ancestorid_key on ancestry
> (cost=0.00..163.83 rows=45 width=8)
>
> It appears to me that the query executes as follows:
>
> 1. Scan every row in the image table to find those where (state &
> 7::bigint) = 0::bigint
> 2. Sort the results
> 3. Use an index on ancestry to find rows where ancestorid=1000000
> 4. Sort the results
> 5. Join the two

FWIW, I use INTs as bit vectors for options in various applications
and have run into this in a few cases. In the database, I only care
about a few bits in the options INT, so what I did was create a
function for each of the bits that I care about and then a function
index. Between the two, I've managed to solve my performance
problems.

CREATE FUNCTION app_option_foo_is_set(INT)
RETURNS BOOL
IMMUTABLE
AS '
BEGIN
IF $1 & 7::INT THEN
RETURN TRUE;
ELSE
RETURN FALSE;
END IF;
END;
' LANGUAGE 'plpgsql';
CREATE INDEX app_option_foo_fidx ON app_option_tbl (app_option_foo_is_set(options));
VACUUM ANALYZE;

Just make sure that you set your function to be IMMUTABLE. -sc

PS It'd be slick if PostgreSQL would collapse adjacent booleans into a
bit in a byte: it'd save some apps a chunk of space. 32 options ==
32 bytes with the type BOOL, but if adjacent BOOLs were collapsed,
it'd only be 4 bytes on disk and maybe some page header data.

--
Sean Chittenden

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Toby Sargeant 2003-07-01 03:19:23 excessive disk access during query
Previous Message Tom Lane 2003-06-30 19:05:26 Re: Query planner plans very inefficient plans