Missing constant propagation in planner on hash quals causes join slowdown

From: Hans Buschmann <buschmann(at)nidsa(dot)net>
To: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Missing constant propagation in planner on hash quals causes join slowdown
Date: 2019-10-18 15:40:34
Message-ID: 1571413123735.26467@nidsa.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Hackers,

By optimising our application I stumbled over the join quals used very often in our application.
In general this concerns datasets, which are subdivided into chunks, like years, seasons (here half a year), multiple tenants in OLTP system etc.
In these cases many tables are joined only to data of the same chunk, identified always by a separating column in the table (here: xx_season).

(I tested it on PG 12.0 on Windows 64 bit, but similar results on older stable releases and other OS)

Here is the test case, also in the attached file:
(I choose to join 2 tables with 2 seasons (2 and 3) of about 1 million of records for every season. I put some randomness in the table creation to simulate the normal situation in OLTP systems)

----------------------------------- Source start

drop table if exists tmaster;

create table tmaster (
id_t1 integer,
t1_season integer,
t1_id_t2 integer,
t1_value integer,
t1_cdescr varchar,
primary key (id_t1)
);

--

select setseed (0.34512);

insert into tmaster
select
inum
,iseason
,row_number () over () as irow
,irandom
,'TXT: '||irandom::varchar
from (
select
inum::integer
,((inum>>20)+2)::integer as iseason
,inum::integer + (500000*random())::integer as irandom
from generate_series (1,(1<<21)) as inum
order by irandom
)qg
;

alter table tmaster add constraint uk_master_season_id unique (t1_season,id_t1);

drop table if exists tfact;

create table tfact (
id_t2 integer,
t2_season integer,
t2_value integer,
t2_cdescr varchar,
primary key (id_t2)
);

--

select setseed (-0.76543);

insert into tfact
select
qg.*
,'FKT: '||irandom::varchar
from (
select
inum::integer
,((inum>>20)+2)::integer as iseason
,inum::integer + (500000*random())::integer as irandom
from generate_series (1,(1<<21)) as inum
order by irandom
)qg
;

alter table tfact add constraint uk_fact_season_id unique (t2_season,id_t2);

-----------------

-- slower:

explain (analyze, verbose, costs, settings, buffers)
select *
from tmaster
left join tfact on id_t2=t1_id_t2 and t2_season=t1_season
where t1_season=3
;

-- faster by setting a constant in left join on condition:

explain (analyze, verbose, costs, settings, buffers)
select *
from tmaster
left join tfact on id_t2=t1_id_t2 and t2_season=3 --t1_season
where t1_season=3
;

----------------------------------- Source end

The results for the first query:

explain (analyze, verbose, costs, settings, buffers)
select *
from tmaster
left join tfact on id_t2=t1_id_t2 and t2_season=t1_season
where t1_season=3
;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=53436.01..111610.15 rows=1046129 width=52) (actual time=822.784..2476.573 rows=1048576 loops=1)
Output: tmaster.id_t1, tmaster.t1_season, tmaster.t1_id_t2, tmaster.t1_value, tmaster.t1_cdescr, tfact.id_t2, tfact.t2_season, tfact.t2_value, tfact.t2_cdescr
Inner Unique: true
Hash Cond: ((tmaster.t1_season = tfact.t2_season) AND (tmaster.t1_id_t2 = tfact.id_t2))
Buffers: shared hit=2102193, temp read=10442 written=10442
-> Index Scan using uk_master_season_id on public.tmaster (cost=0.43..32263.38 rows=1046129 width=28) (actual time=0.008..565.222 rows=1048576 loops=1)
Output: tmaster.id_t1, tmaster.t1_season, tmaster.t1_id_t2, tmaster.t1_value, tmaster.t1_cdescr
Index Cond: (tmaster.t1_season = 3)
Buffers: shared hit=1051086
-> Hash (cost=31668.49..31668.49 rows=1043473 width=24) (actual time=820.960..820.961 rows=1048576 loops=1)
Output: tfact.id_t2, tfact.t2_season, tfact.t2_value, tfact.t2_cdescr
Buckets: 524288 (originally 524288) Batches: 4 (originally 2) Memory Usage: 28673kB
Buffers: shared hit=1051107, temp written=4316
-> Index Scan using uk_fact_season_id on public.tfact (cost=0.43..31668.49 rows=1043473 width=24) (actual time=0.024..598.648 rows=1048576 loops=1)
Output: tfact.id_t2, tfact.t2_season, tfact.t2_value, tfact.t2_cdescr
Index Cond: (tfact.t2_season = 3)
Buffers: shared hit=1051107
Settings: effective_cache_size = '8GB', random_page_cost = '1', temp_buffers = '32MB', work_mem = '32MB'
Planning Time: 0.627 ms
Execution Time: 2502.702 ms
(20 rows)

and for the second one:

explain (analyze, verbose, costs, settings, buffers)
select *
from tmaster
left join tfact on id_t2=t1_id_t2 and t2_season=3 --t1_season
where t1_season=3
;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=50827.33..106255.38 rows=1046129 width=52) (actual time=758.086..2313.175 rows=1048576 loops=1)
Output: tmaster.id_t1, tmaster.t1_season, tmaster.t1_id_t2, tmaster.t1_value, tmaster.t1_cdescr, tfact.id_t2, tfact.t2_season, tfact.t2_value, tfact.t2_cdescr
Inner Unique: true
Hash Cond: (tmaster.t1_id_t2 = tfact.id_t2)
Buffers: shared hit=2102193, temp read=9024 written=9024
-> Index Scan using uk_master_season_id on public.tmaster (cost=0.43..32263.38 rows=1046129 width=28) (actual time=0.009..549.793 rows=1048576 loops=1)
Output: tmaster.id_t1, tmaster.t1_season, tmaster.t1_id_t2, tmaster.t1_value, tmaster.t1_cdescr
Index Cond: (tmaster.t1_season = 3)
Buffers: shared hit=1051086
-> Hash (cost=31668.49..31668.49 rows=1043473 width=24) (actual time=756.125..756.125 rows=1048576 loops=1)
Output: tfact.id_t2, tfact.t2_season, tfact.t2_value, tfact.t2_cdescr
Buckets: 524288 Batches: 4 Memory Usage: 18711kB
Buffers: shared hit=1051107, temp written=4317
-> Index Scan using uk_fact_season_id on public.tfact (cost=0.43..31668.49 rows=1043473 width=24) (actual time=0.025..584.652 rows=1048576 loops=1)
Output: tfact.id_t2, tfact.t2_season, tfact.t2_value, tfact.t2_cdescr
Index Cond: (tfact.t2_season = 3)
Buffers: shared hit=1051107
Settings: effective_cache_size = '8GB', random_page_cost = '1', temp_buffers = '32MB', work_mem = '32MB'
Planning Time: 0.290 ms
Execution Time: 2339.651 ms
(20 rows)

By replacing the =t1_season with =3 the query took about 160 ms less or about 7 percent.

Both queries are logically equivalent. The planner correctly identifies the Index Cond: (tfact.t2_season = 3) for selecting from the index uk_fact_season_id.
But in the slower query the outer hash condition still hashes with the column t1_season and t2_season as in
Hash Cond: ((tmaster.t1_season = tfact.t2_season) AND (tmaster.t1_id_t2 = tfact.id_t2)).
This can only be detected with explain analyze verbose when the hash cond are shown.

The first query notation with and t2_season=t1_season is much more natural, as it requires only one numerical constant to get good query Speed (often many fact tables are joined).

The inclusion of the xx_season quals reduces the processed dataset and helps also when the seasons columns are used for list partitioning of all the involved tables.
When omitting it, the whole fact table will be joined.

To me it seems that the "constantness" is not propagated to all equivalence columns and not considered in hash joining.

Unfortunely I am not in the position to write a patch, so I would appreciate any help to get this optimization realized.

Much thanks in advance

Hans Buschmann

Attachment Content-Type Size
jointest.sql text/plain 1.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2019-10-18 16:15:02 Re: Can you please tell us how set this prefetch attribute in following lines.
Previous Message Tomas Vondra 2019-10-18 15:00:54 Re: Bug about drop index concurrently