Bad estimate on LIKE matching

From: "Magnus Hagander" <mha(at)sollentuna(dot)net>
To: "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Bad estimate on LIKE matching
Date: 2006-01-17 12:53:22
Message-ID: 6BCB9D8A16AC4241919521715F4D8BCE6C7EB2@algol.sollentuna.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I have a table, "path", which is:

pathid | integer | not null default
nextval('path_pathid_seq'::regclass)
path | text | not null
Indexes:
"path_pkey" PRIMARY KEY, btree (pathid)
"path_name_idx" btree (path)

The table contains approx 1.2 million rows, of which all are unique.
(both the path field and the naem field are unique, thought he
path_name_idx index is not a unique index)

On this table, I do a query like:
SELECT * FROM path WHERE path LIKE 'f:/userdirs/s/super_73/%'

The estimate for this query is comlpetely off, which I beleive is the
cause for a very bad selection of a query plan when it's used in a big
join (creating nestloops that ends up taking 15+ minutes to complete..).

Explain analyze gives:
QUERY PLAN
------------------------------------------------------------------------
-----------------------------------------------
Index Scan using path_name_idx on path (cost=0.00..3.24 rows=1
width=74) (actual time=0.035..0.442 rows=214 loops=1)
Index Cond: ((path >= 'f:/userdirs/s/super'::text) AND (path <
'f:/userdirs/s/supes'::text))
Filter: (path ~~ 'f:/userdirs/s/super_73%'::text)

No matter what I search on (when it's very selective), the estimate is
always 1 row, whereas the actual value is at least a couple of hundred.
If I try with say "f:/us", the difference is 377,759 estimated vs
562,459 returned, which is percentage-wise a lot less, but...

I have tried upping the statistics target up to 1000, with no changes.

Any way to teach the planner about this?

FYI, if I push the result of the select on path into a temp table, and
then join with that one in my main table, I get a hashjoin instead, and
query time is < 30 seconds instead of 15 minutes...

//Magnus

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Schuchardt 2006-01-17 14:44:15 Error working with Temporary Sequences in plpgsql in 8.1 (8.0 works fine)
Previous Message Andrew Dunstan 2006-01-17 12:42:39 Re: [HACKERS] source documentation tool doxygen