Oversight in reparameterize_path_by_child leading to executor crash

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Oversight in reparameterize_path_by_child leading to executor crash
Date: 2023-08-01 08:44:27
Message-ID: CAMbWs496+N=UAjOc=rcD3P7B6oJe4rZw08e_TZRUsWbPxZW3Tw@mail.gmail.com
Lists: pgsql-hackers

For paths of type 'T_Path', reparameterize_path_by_child just makes a
flat copy and does not adjust expressions that contain lateral
references. This causes problems for partitionwise join. As an
example, consider

regression=# explain (costs off)
select * from prt1 t1 join lateral
(select * from prt1 t2 TABLESAMPLE SYSTEM (t1.a)) s
on t1.a = s.a;
                QUERY PLAN
-----------------------------------------
 Append
   ->  Nested Loop
         ->  Seq Scan on prt1_p1 t1_1
         ->  Sample Scan on prt1_p1 t2_1
               Sampling: system (t1.a)
               Filter: (t1_1.a = a)
   ->  Nested Loop
         ->  Seq Scan on prt1_p2 t1_2
         ->  Sample Scan on prt1_p2 t2_2
               Sampling: system (t1.a)
               Filter: (t1_2.a = a)
   ->  Nested Loop
         ->  Seq Scan on prt1_p3 t1_3
         ->  Sample Scan on prt1_p3 t2_3
               Sampling: system (t1.a)
               Filter: (t1_3.a = a)
(16 rows)

Note that the lateral references in the sampling info are not
reparameterized correctly. They are supposed to reference the child
relations, but as we can see from the plan they still reference the
top parent relation. Running this plan leads to an executor crash.

regression=# explain (analyze, costs off)
select * from prt1 t1 join lateral
(select * from prt1 t2 TABLESAMPLE SYSTEM (t1.a)) s
on t1.a = s.a;
server closed the connection unexpectedly

In this case what we need to do is adjust the TableSampleClause to
refer to the correct child relations. We can do that with the help of
adjust_appendrel_attrs_multilevel(). One problem is that the
TableSampleClause is stored in the RangeTblEntry, and it does not seem
like good practice to alter the RangeTblEntry at this point. What if we
end up choosing the non-partitionwise-join path as the cheapest one? In
that case altering the RangeTblEntry here would cause the opposite
problem: the lateral references in the TableSampleClause should refer
to the top parent relation, but they would refer to the child
relations.

So what I'm thinking is that maybe we can add a new path type, named
SampleScanPath, that carries its own TableSampleClause. Then we can
safely reparameterize the TableSampleClause as needed for each
SampleScanPath. That's what the attached patch does.
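
To illustrate the reasoning above, here is a minimal sketch in plain C.
It is not taken from the patch and does not use the real planner
structs: SampleClause, Rte, ScanPath, translate_relid() and the other
names are hypothetical stand-ins, and a lateral reference is reduced to
a single relid. It only shows why a flat copy keeps pointing at the
parent, and how a per-path clause can be translated without touching
the shared range table entry.

```c
#include <assert.h>

/* All types and names below are hypothetical stand-ins for this sketch;
 * they are not the real PostgreSQL structs or functions. */

/* The TABLESAMPLE argument, reduced to the relid its lateral Var refers to. */
typedef struct SampleClause { int lateral_relid; } SampleClause;

/* One range table entry, shared by every path built for the relation. */
typedef struct Rte { SampleClause tablesample; } Rte;

typedef struct ScanPath {
    const Rte   *rte;            /* shared, never modified per path */
    int          required_outer; /* relid this path is parameterized by */
    int          has_own_clause; /* 1 if 'clause' overrides the RTE's clause */
    SampleClause clause;         /* per-path copy (the idea being sketched) */
} ScanPath;

/* Stand-in for translating a parent reference into a child reference. */
static int translate_relid(int relid, int parent, int child)
{
    return relid == parent ? child : relid;
}

/* Which relation would the sampling expression reference in the plan? */
static int effective_ref(const ScanPath *p)
{
    return p->has_own_clause ? p->clause.lateral_relid
                             : p->rte->tablesample.lateral_relid;
}

/* Flat copy: only the parameterization changes, the clause is untouched. */
static ScanPath flat_copy_reparameterize(const ScanPath *p, int child)
{
    ScanPath np = *p;
    np.required_outer = child;
    return np;
}

/* Per-path clause: copy the clause into the new path and translate it. */
static ScanPath per_path_reparameterize(const ScanPath *p, int parent, int child)
{
    ScanPath np = *p;
    np.required_outer = child;
    np.has_own_clause = 1;
    np.clause.lateral_relid = translate_relid(effective_ref(p), parent, child);
    return np;
}

/* Runs the scenario; returns 0 if the sketch behaves as described. */
static int run_sketch(void)
{
    enum { PARENT = 1, CHILD = 4 };   /* arbitrary relids for illustration */
    Rte rte = { { PARENT } };
    ScanPath base = { &rte, PARENT, 0, { 0 } };

    ScanPath flat = flat_copy_reparameterize(&base, CHILD);
    ScanPath fixed = per_path_reparameterize(&base, PARENT, CHILD);

    /* Flat copy: parameterized by the child, still referencing the parent. */
    assert(flat.required_outer == CHILD && effective_ref(&flat) == PARENT);
    /* Per-path clause: references the child. */
    assert(effective_ref(&fixed) == CHILD);
    /* The original path and the shared RTE still reference the parent. */
    assert(effective_ref(&base) == PARENT);
    assert(rte.tablesample.lateral_relid == PARENT);
    return 0;
}
```

Calling run_sketch() from a main() exercises both cases. The point of
the per-path copy in this model is that the original path keeps seeing
the parent reference, so either path can still be chosen as the
cheapest one.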

There are some other plan types that do not have a separate path type
but may have lateral references in expressions stored in the
RangeTblEntry, such as FunctionScan, TableFuncScan and ValuesScan. But
it's not clear to me whether they have the same problem.

Thanks
Richard

Attachment: v1-0001-Fix-reparameterize_path_by_child-for-SampleScan.patch (application/octet-stream, 16.3 KB)
