From: | Zhenghua Lyu <zlyu(at)vmware(dot)com> |
---|---|
To: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | distribute_restrictinfo_to_rels if restrictinfo contains volatile functions |
Date: | 2020-07-10 04:23:04 |
Message-ID: | SN6PR05MB455934DF8807E6A672E1A1DEB5650@SN6PR05MB4559.namprd05.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi all,
consider the following SQL:
================================================================================================
gpadmin=# explain (verbose, costs off)
select * from t,
(select a from generate_series(1, 1)a)x,
(select a from generate_series(1, 1)a)y
where ((x.a+y.a)/4.0) > random();
QUERY PLAN
----------------------------------------------------------------------------------------
Nested Loop
Output: t.a, t.b, a.a, a_1.a
-> Nested Loop
Output: a.a, a_1.a
Join Filter: (((((a.a + a_1.a))::numeric / 4.0))::double precision > random())
-> Function Scan on pg_catalog.generate_series a
Output: a.a
Function Call: generate_series(1, 1)
-> Function Scan on pg_catalog.generate_series a_1
Output: a_1.a
Function Call: generate_series(1, 1)
-> Seq Scan on public.t
Output: t.a, t.b
(13 rows)
================================================================================================
The where clause is "pushed down to the x,y" because it only references these two relations.
The original query tree's join tree is like:
FromExpr []
[fromlist]
RangeTblRef [rtindex=1]
RangeTblRef [rtindex=4]
RangeTblRef [rtindex=5]
[quals]
OpExpr [opno=674 opfuncid=297 opresulttype=16 opretset=false]
FuncExpr [funcid=1746 funcresulttype=701 funcretset=false funcvariadic=false
funcformat=COERCE_IMPLICIT_CAST]
OpExpr [opno=1761 opfuncid=1727 opresulttype=1700 opretset=false]
FuncExpr [funcid=1740 funcresulttype=1700 funcretset=false funcvariadic=false
funcformat=COERCE_IMPLICIT_CAST]
OpExpr [opno=551 opfuncid=177 opresulttype=23 opretset=false]
Var [varno=4 varattno=1 vartype=23 varnoold=4 varoattno=1]
Var [varno=5 varattno=1 vartype=23 varnoold=5 varoattno=1]
Const [consttype=1700 constlen=-1 constvalue=94908966309104 constisnull=false
constbyval=false]
FuncExpr [funcid=1598 funcresulttype=701 funcretset=false funcvariadic=false
funcformat=COERCE_EXPLICIT_CALL]
It seems the semantics it wants to express is: filter after join all the tables.
Thus maybe a plan like
Nested Loop
Join Filter: (((((a.a + a_1.a))::numeric / 4.0))::double precision > random())
-> Nested Loop
-> Function Scan on generate_series a
-> Function Scan on generate_series a_1
-> Seq Scan on t (cost=0.00..32.60 rows=2260 width=8)
May also be reasonable because it is just the direct translation from the original query tree.
The above plans may have different property:
* the first one, if we push down, can only produce 2 results: 0 rows, or 10 rows. No third possibility
* the second one, will output 0 ~ 10 rows with equal probability.
I am wondering if we should consider volatile functions in restrictinfo when try to distribute_restrictinfo_to_rels?
Best,
Zhenghua Lyu
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2020-07-10 04:28:28 | Re: Added tab completion for the missing options in copy statement |
Previous Message | Ajin Cherian | 2020-07-10 03:51:08 | Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions |