View performance with implicit cast

From: Tomasz Szypowski <tomasz(dot)szypowski(at)asseco(dot)pl>
To: "pgsql-sql(at)lists(dot)postgresql(dot)org" <pgsql-sql(at)lists(dot)postgresql(dot)org>
Subject: View performance with implicit cast
Date: 2025-01-02 20:02:05
Message-ID: DU0PR04MB94192D3922E05C8E0A18F5F399142@DU0PR04MB9419.eurprd04.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hello

I have got an example, in which PostgreSQL could speed up:

First, create and populate structures:

create table test1 (date_1 timestamp);
insert into test1
select generate_series( '2020-01-01', '2025-01-01', interval '1 minute');
CREATE INDEX idx_test1_date_1 on test1(date_1);

create table test2 (date_2 date);
insert into test2
select generate_series( '1900-01-01', '2200-01-01', interval '1 day');
CREATE INDEX idx_test2_date_2 on test2(date_2);

Then create a view:
create or replace view v_test as
select date_1 from test1
union all
select date_2 from test2

and now run the query:
explain analyze
select * from v_test
where (date_1>='2024-12-09') and (date_1<'2025-01-01');

You would see:
Subquery Scan on v_test (cost=0.00..95710.28 rows=13702 width=8) (actual time=440.998..472.586 rows=33143 loops=1)
Filter: ((v_test.date_1 >= '2024-12-09 00:00:00'::timestamp without time zone) AND (v_test.date_1 < '2025-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 2707311
-> Append (cost=0.00..54603.47 rows=2740454 width=8) (actual time=0.038..332.601 rows=2740454 loops=1)
-> Seq Scan on test1 (cost=0.00..37950.81 rows=2630881 width=8) (actual time=0.035..165.235 rows=2630881 loops=1)
-> Subquery Scan on "*SELECT* 2" (cost=0.00..2950.39 rows=109573 width=8) (actual time=0.025..16.285 rows=109573 loops=1)
-> Seq Scan on test2 (cost=0.00..1580.73 rows=109573 width=4) (actual time=0.022..4.974 rows=109573 loops=1)
Planning Time: 0.263 ms
Execution Time: 473.598 ms

I can imagine, that test2 cannot be searched by index due to the implicite cast of the column date -> timestamp, but why the query doesn`t use index idx_test1_date_1? Furthermore it could use as well index idx_test2_date_2

Now let`s modify the view`s definition only to add explicit cast:
create or replace view v_test as
select date_1 from test1
union all
select date_2::timestamp from test2

Run the query again:
explain analyze
select * from v_test
where (date_1>='2024-12-09') and (date_1<'2025-01-01');

Append (cost=0.43..3812.87 rows=31903 width=8) (actual time=0.024..17.287 rows=33143 loops=1)
-> Index Only Scan using idx_test1_date_1 on test1 (cost=0.43..975.53 rows=31355 width=8) (actual time=0.023..5.500 rows=33120 loops=1)
Index Cond: ((date_1 >= '2024-12-09 00:00:00'::timestamp without time zone) AND (date_1 < '2025-01-01 00:00:00'::timestamp without time zone))
Heap Fetches: 0
-> Seq Scan on test2 (cost=0.00..2677.83 rows=548 width=8) (actual time=3.744..8.610 rows=23 loops=1)
Filter: (((date_2)::timestamp without time zone >= '2024-12-09 00:00:00'::timestamp without time zone) AND ((date_2)::timestamp without time zone < '2025-01-01 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 109550
Planning Time: 0.230 ms
Execution Time: 18.937 ms

So

1. Why the index is being used only when the column types are the same?
2. Why th other indexes are not being used?

Maybe you can help me to rewrite the query

Regards
Thomas Szypowski

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2025-01-04 18:37:29 Re: View performance with implicit cast
Previous Message shammat 2024-12-20 15:25:39 Re: Abitity to identify the current iteration in a recursive SELECT (feature request)