Re: select on index column,why PG still use seq scan?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Xiaoyulei <xiaoyulei(at)huawei(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, yetao <yetao1(at)huawei(dot)com>
Subject: Re: select on index column,why PG still use seq scan?
Date: 2014-08-18 03:49:42
Message-ID: 24243.1408333782@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Xiaoyulei <xiaoyulei(at)huawei(dot)com> writes:
> I create a table and insert some items.
> I create index on every column.
> And I execute select, I thought it should use index scan, but it is still seq scan. Why PG do not use index scan?

> create table v_org_info(
> org_no varchar2(8), org_nm varchar2(80),
> org_no_l1 varchar2(8), org_nm_l1 varchar2(80),
> org_no_l2 varchar2(8), org_nm_l2 varchar2(80)
> );

There is no "varchar2" type in Postgres. I tried this example with
"varchar" in place of that, but when I got to

> POSTGRES=# explain analyze select a.org_nm from v_org_info a where a.org_no = 1000;

I got

ERROR: operator does not exist: character varying = integer
LINE 1: ...ze select a.org_nm from v_org_info a where a.org_no = 1000;
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.

which is certainly what I *should* get. I changed it to

explain analyze select a.org_nm from v_org_info a where a.org_no = '1000';

and then I got

Bitmap Heap Scan on v_org_info a (cost=4.49..74.90 rows=27 width=58) (actual t
ime=0.044..0.044 rows=1 loops=1)
Recheck Cond: ((org_no)::text = '1000'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on idx_v_org_info_org_no (cost=0.00..4.48 rows=27 widt
h=0) (actual time=0.020..0.020 rows=1 loops=1)
Index Cond: ((org_no)::text = '1000'::text)
Planning time: 0.481 ms
Execution time: 0.104 ms

which is OK, but after "ANALYZE v_org_info" I got

Index Scan using idx_v_org_info_org_no on v_org_info a (cost=0.29..8.30 rows=1
width=5) (actual time=0.019..0.020 rows=1 loops=1)
Index Cond: ((org_no)::text = '1000'::text)
Planning time: 0.372 ms
Execution time: 0.060 ms

which is better.

> Seq Scan on V_ORG_INFO A (cost=0.00..189.97 rows=9 width=178, batch_size=100) (actual time=0.930..18.034 rows=1 loops=1)
> Filter: (INT4IN(VARCHAROUT(ORG_NO)) = 1000)
> Rows Removed by Filter: 19999
> Total runtime: 18.099 ms
> (4 rows)

TBH, this looks like some incompetently hacked-up variant of Postgres;
certainly no version ever shipped by the core project would have done
this. It looks like somebody tried to make cross-type comparisons work by
inserting conversion operations, but they did it in such a way that the
conversions were applied to the column not the constant. An index on
org_no isn't going to help you for a query on INT4IN(VARCHAROUT(ORG_NO)).
(And I wonder why exactly the names are printing as upper case here ...)

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Dave Owens 2014-08-18 21:01:30 query against pg_locks leads to large memory alloc
Previous Message Xiaoyulei 2014-08-18 03:19:22 select on index column,why PG still use seq scan?