like & optimization

From: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
To: Postgresql General <pgsql-general(at)postgresql(dot)org>
Subject: like & optimization
Date: 2013-10-12 18:08:25
Message-ID: 49BBFE58-665D-4AA8-8658-5B23D4A14B8E@elevated-dev.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

PG 9.3, consider a table test like:

tz timestamp not null,
cola varchar not null,
colb varchar not null

2 compound indexes:

tz_cola on (tz, cola)
tz_colb on (tz, colb varchar_pattern_ops)

now a query, for some start & end timestamps:

select * from test where tz >= start and tz < end and colb like '%foobar%'

Assume that the tz restriction is somewhat selective, say 1% of the table, and the colb restriction is extremely selective, say less than 0.00001%.

It seems to me that the fastest way to resolve this query is to use the tz_colb index directly, scanning the range between tz >= start and tz < end for the colb condition.

But pg wants to use the pg_cola index to find all rows in the time range, then filter those rows for the colb condition. (FYI, cola contains only very small values, while colb's values are typically several times longer.)

Now if I tweak the time range, I can get it to seq scan the table for all conditions, or bitmap heap scan + re-check cond tz + filter colb + bitmap index scan tz_cola, but never use the tz_colb index...

Am I right about the fastest way to perform the search? Is there some way to get pg to do this, or would this require an enhancement?

Here's a sample query plan:

Index Scan using tz_cola on test (cost=0.56..355622.52 rows=23 width=106) (actual time=61.403..230.649 rows=4 loops=1)
Index Cond: ((tz >= '2013-04-01 06:00:00-05'::timestamp with time zone) AND (tz <= '2013-04-30 06:00:00-05'::timestamp with time zone))
Filter: ((colb)::text ~~ '%foobar%'::text)
Rows Removed by Filter: 261725
Total runtime: 230.689 ms

--
Scott Ribe
scott_ribe(at)elevated-dev(dot)com
http://www.elevated-dev.com/
(303) 722-0567 voice

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Vincent Veyron 2013-10-12 18:11:51 Re: Forms for entering data into postgresql
Previous Message Pavel Stehule 2013-10-12 17:34:52 Re: PSQL: argument exceeds the maximum length of 1024 bytes