Re: string casting for index usage

From: Michael Adler <adler(at)pobox(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: string casting for index usage
Date: 2004-03-19 22:22:17
Message-ID: 20040319222217.GA14902@pobox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Mar 18, 2004 at 03:39:12PM -0500, Tom Lane wrote:
> Michael Adler <adler(at)pobox(dot)com> writes:
> > In porting an application from v7.2 and v7.3, I noticed that a join on a varchar column and a text column was ignoring indices that were helpful in v7.2. When I explicitly cast the text to a varchar (or set ENABLE_SEQSCAN TO false) the index is scanned and it works as efficiently as in v7.2.
>
> Maybe you should be moving to 7.4, instead.

That's a fair suggestion, but it's not practical for our 75 sites, most without decent network access. If this is in fact addressed in newer releases, then my point is mostly inconsequential.

We use Debian stable (7.2.1-2woody4) and Debian testing (7.3.4-9).

> A desultory test didn't show any difference between 7.2.4 and 7.3.6
> in this respect, however. Perhaps you forgot to ANALYZE yet in the
> new database?

I have a test with sample data and queries to demonstrate what I'm seeing. I hope it is useful.

Having to do manual casts is not cruel and unusual, but it's not encouraging to see performance go down after an upgrade. If anyone has any clever solutions, let me know.

tables, data, and queries:
http://www.panix.com/~adler/manual-cast-for-index-scan.sql

my test output:
http://www.panix.com/~adler/manual-cast-for-index-scan_7.3.4-9.out
http://www.panix.com/~adler/manual-cast-for-index-scan_7.2.1-2woody4.out

(the times are not horrific in these specific examples, but the sequential scan makes them unscalable).

manual-cast-for-index-scan_7.3.4-9.out:

DROP TABLE t1;
DROP TABLE
DROP TABLE t2;
DROP TABLE
CREATE TABLE t1 (
key_col text,
grp text
);
CREATE TABLE
COPY t1 FROM stdin;
CREATE UNIQUE INDEX tempindex1 ON t1 USING btree (key_col);
CREATE INDEX
CREATE TABLE t2 (
item_num character varying(5),
key_col character varying(14)
);
CREATE TABLE
COPY t2 FROM stdin;
CREATE INDEX tempindex2 ON t2 USING btree (key_col);
CREATE INDEX
VACUUM ANALYZE;
VACUUM
SELECT version();
PostgreSQL 7.3.4 on i386-pc-linux-gnu, compiled by GCC i386-linux-gcc (GCC) 3.3.2 (Debian)

EXPLAIN ANALYZE SELECT item_num, t1.key_col FROM t1 LEFT JOIN t2 ON (t2.key_col = t1.key_col) WHERE grp = '24';
Nested Loop (cost=0.00..23803.27 rows=194 width=31) (actual time=20.95..1401.46 rows=69 loops=1)
Join Filter: (("inner".key_col)::text = "outer".key_col)
-> Seq Scan on t1 (cost=0.00..492.94 rows=194 width=18) (actual time=0.32..30.27 rows=69 loops=1)
Filter: (grp = '24'::text)
-> Seq Scan on t2 (cost=0.00..66.87 rows=4287 width=13) (actual time=0.01..12.06 rows=4287 loops=69)
Total runtime: 1401.73 msec

EXPLAIN ANALYZE SELECT item_num, t1.key_col FROM t1 LEFT JOIN t2 ON (t2.key_col::text = t1.key_col) WHERE grp = '24';
Nested Loop (cost=0.00..23803.27 rows=194 width=31) (actual time=20.27..1398.82 rows=69 loops=1)
Join Filter: (("inner".key_col)::text = "outer".key_col)
-> Seq Scan on t1 (cost=0.00..492.94 rows=194 width=18) (actual time=0.26..25.91 rows=69 loops=1)
Filter: (grp = '24'::text)
-> Seq Scan on t2 (cost=0.00..66.87 rows=4287 width=13) (actual time=0.01..12.02 rows=4287 loops=69)
Total runtime: 1399.08 msec

EXPLAIN ANALYZE SELECT item_num, t1.key_col FROM t1 LEFT JOIN t2 ON (t2.key_col = t1.key_col::varchar(24)) WHERE grp = '24';
Nested Loop (cost=0.00..4819.13 rows=194 width=31) (actual time=0.52..27.46 rows=69 loops=1)
-> Seq Scan on t1 (cost=0.00..492.94 rows=194 width=18) (actual time=0.27..25.94 rows=69 loops=1)
Filter: (grp = '24'::text)
-> Index Scan using tempindex2 on t2 (cost=0.00..22.17 rows=12 width=13) (actual time=0.01..0.01 rows=0 loops=69)
Index Cond: (t2.key_col = ("outer".key_col)::character varying(24))
Total runtime: 27.70 msec

manual-cast-for-index-scan_7.2.1-2woody4.out:

DROP TABLE t1;
DROP
DROP TABLE t2;
DROP
CREATE TABLE t1 (
key_col text,
grp text
);
CREATE
COPY t1 FROM stdin;
CREATE UNIQUE INDEX tempindex1 ON t1 USING btree (key_col);
CREATE
CREATE TABLE t2 (
item_num character varying(5),
key_col character varying(14)
);
CREATE
COPY t2 FROM stdin;
CREATE INDEX tempindex2 ON t2 USING btree (key_col);
CREATE
VACUUM ANALYZE;
VACUUM
SELECT version();
PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.95.4

EXPLAIN ANALYZE SELECT item_num, t1.key_col FROM t1 LEFT JOIN t2 ON (t2.key_col = t1.key_col) WHERE grp = '24';
psql:castedneed.sql:29127: NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..1405.88 rows=204 width=32) (actual time=0.46..40.60 rows=69 loops=1)
-> Seq Scan on t1 (cost=0.00..505.94 rows=204 width=18) (actual time=0.35..39.09 rows=69 loops=1)
-> Index Scan using tempindex2 on t2 (cost=0.00..4.27 rows=11 width=14) (actual time=0.01..0.01 rows=0 loops=69)
Total runtime: 40.81 msec

EXPLAIN
EXPLAIN ANALYZE SELECT item_num, t1.key_col FROM t1 LEFT JOIN t2 ON (t2.key_col::text = t1.key_col) WHERE grp = '24';
psql:castedneed.sql:29128: NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..1405.88 rows=204 width=32) (actual time=0.40..39.88 rows=69 loops=1)
-> Seq Scan on t1 (cost=0.00..505.94 rows=204 width=18) (actual time=0.35..38.44 rows=69 loops=1)
-> Index Scan using tempindex2 on t2 (cost=0.00..4.27 rows=11 width=14) (actual time=0.01..0.01 rows=0 loops=69)
Total runtime: 40.07 msec

EXPLAIN
EXPLAIN ANALYZE SELECT item_num, t1.key_col FROM t1 LEFT JOIN t2 ON (t2.key_col = t1.key_col::varchar(24)) WHERE grp = '24';
psql:castedneed.sql:29129: NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..1416.66 rows=4383 width=32) (actual time=0.40..41.59 rows=69 loops=1)
-> Seq Scan on t1 (cost=0.00..505.94 rows=204 width=18) (actual time=0.36..40.05 rows=69 loops=1)
-> Index Scan using tempindex2 on t2 (cost=0.00..4.30 rows=11 width=14) (actual time=0.01..0.01 rows=0 loops=69)
Total runtime: 41.78 msec

EXPLAIN

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2004-03-19 23:28:50 Re: string casting for index usage
Previous Message Tom Lane 2004-03-19 06:18:20 Re: vacuum performance