Re: A new strategy for pull-up correlated ANY_SUBLINK

From: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>
To: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, vignesh C <vignesh21(at)gmail(dot)com>, Richard Guo <guofenglinux(at)gmail(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: A new strategy for pull-up correlated ANY_SUBLINK
Date: 2023-10-13 08:39:41
Message-ID: 00de47dd-7a38-4225-a47e-65a8ba6de8d3@yandex.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 13.10.2023 10:04, Andy Fan wrote:
>
> It seems to me that the expressions "=" and "IN" are equivalent
> here due to the fact that the aggregated subquery returns only one
> value, and the result with the "IN" operation can be considered as
> the intersection of elements on the left and right. In this query,
> we have some kind of set on the left, among which there will be
> found or not only one element on the right.
>
>
> Yes, they are equivalent at the final result, but there are some
> differences at the execution level.  the '=' case will be transformed
> to a Subplan whose subPlanType is EXPR_SUBLINK, so if there
> is more than 1 rows is returned in the subplan, error will be raised.
>
> select * from tenk1 where
>   ten =  (select ten from tenk1 i where i.two = tenk1.two );
>
> ERROR:  more than one row returned by a subquery used as an expression
>
> However the IN case would not.
> select * from tenk1 where
>   ten =  (select ten from tenk1 i where i.two = tenk1.two ) is OK.
>
> I think the test case you added is not related to this feature. the
> difference is there even without the patch.  so I kept the code
> you changed, but not for the test  case.
Yes, I understand and agree with you that we should delete the last
queries, except to one.

The query below have a different result compared to master, and it is
correct.

Without your patch:

explain (costs off)
+SELECT * FROM tenk1 A LEFT JOIN tenk2 B
ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Nested Loop Left Join
   ->  Seq Scan on tenk1 a
   ->  Materialize
         ->  Seq Scan on tenk2 b
               Filter: (SubPlan 2)
               SubPlan 2
                 ->  Result
                       InitPlan 1 (returns $1)
                         ->  Limit
                               ->  Index Scan using tenk2_hundred on
tenk2 c
                                     Index Cond: (hundred IS NOT NULL)
                                     Filter: (odd = b.odd)
(12 rows)

After your patch:

postgres=# explain (costs off)
SELECT * FROM tenk1 A LEFT JOIN tenk2 B
ON B.hundred in (SELECT min(c.hundred) FROM tenk2 C WHERE c.odd = b.odd);

                           QUERY PLAN
--------------------------------------------------------------
  Nested Loop Left Join
    ->  Seq Scan on tenk1 a
    ->  Materialize
          ->  Nested Loop
                ->  Seq Scan on tenk2 b
*->  Subquery Scan on "ANY_subquery"
                      Filter: (b.hundred = "ANY_subquery".min)*
                      ->  Aggregate
                            ->  Seq Scan on tenk2 c
                                  Filter: (odd = b.odd)
(10 rows)

>
>> I took the liberty of adding this to your patch and added
>> myself as reviewer, if you don't mind.
>>
>> Sure, the patch after your modification looks better than the
>> original.
>> I'm not sure how the test case around "because of got one row" is
>> relevant to the current changes.  After we reach to some agreement
>> on the above discussion, I think v4 is good for committer to review!
>
> Thank you!) I am ready to discuss it.
>
> Actually I meant to discuss the "Unfortunately, I found a request..",
> looks
> we have reached an agreement there:)
>
Yes, we have)

--
Regards,
Alena Rybakina

Attachment Content-Type Size
pull-up.diff text/x-patch 2.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message a.rybakina 2023-10-13 08:56:37 Re: Removing unneeded self joins
Previous Message shveta malik 2023-10-13 08:35:34 Re: Synchronizing slots from primary to standby