Re: Parameterized aggregate subquery (was: Pull up aggregate subquery)

From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Yeb Havinga <yebhavinga(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parameterized aggregate subquery (was: Pull up aggregate subquery)
Date: 2011-07-02 07:48:05
Message-ID: CAP7QgmnueoSExZwozsP9HE3JtDUzCUxQjyBTES2ksibsuYLj3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2011/6/29 Yeb Havinga <yebhavinga(at)gmail(dot)com>:
>
> On 2011-06-17 09:54, Hitoshi Harada wrote:
>>
>> While reviewing the gist/box patch, I found some planner APIs that can
>> replace parts in my patch. Also, comments in includes wasn't updated
>> appropriately. Revised patch attached.
>
> Hello Hitoshi-san,
>
> I read your latest patch implementing parameterizing subquery scans.

Attached is revised version.

> 1)
> In the email from june 9 with the patch You wrote: "While IndexScan
> is simple since its information like costs are well known by the base
> relation, SubqueryScan should re-plan its Query to gain that, which is
> expensive."
>
> Initial concerns I had were caused by misinterpreting 'replanning' as: for
> each outer tuple, replan the subquery (it sounds a bit like 'ReScan').
> Though the general comments in the patch are helpful, I think it would be an
> improvement to describe why subqueries are planned more than once, i.e.
> something like
> "best_inner_subqueryscan
>   Try to find a better subqueryscan path and its associated plan for each
> join clause that can be pushed down, in addition to the path that is already
> calculated by set_subquery_pathlist."

I changed comments around set_subquery_pathlist and best_inner_subqueryscan.

> 2)
> Since 'subquery_is_pushdown_safe' is invariant under join clause push down,
> it might be possible to have it called only once in set_subquery_pathlist,
> i.e. by only attaching rel->preprocessed_subquery if the
> subquery_is_pushdown_safe.

I modified as you suggested.

> 3)
> /*
>  * set_subquery_pathlist
>  *        Build the (single) access path for a subquery RTE
>  */
> This unchanged comment is still correct, but 'the (single) access path'
> might have become a bit misleading now there are multiple paths possible,
> though not by set_subquery_pathlist.

As noted #1.

> 4) somewhere in the patch
> s/regsitered/registered/

Fixed.

> 5) Regression tests are missing; I think at this point they'd aid in
> speeding up development/test cycles.

I'm still thinking about it. I can add complex test but the concept of
regression test focuses on small pieces of simple cases. I don't want
take pg_regress much more than before.

> 6) Before patching Postgres, I could execute the test with the size_l and
> size_m tables, after patching against current git HEAD (patch without
> errors), I get the following error when running the example query:
> ERROR:  plan should not reference subplan's variable
>
> I get the same error with the version from june 9 with current git HEAD.

Fixed. Some variable initializing was wrong.

> 7) Since both set_subquery_pathlist and best_inner_subqueryscan push down
> clauses, as well as add a path and subplan, could a generalized function be
> made to support both, to reduce duplicate code?

No touch as answered before.

Although I still need to think about suitable regression test case,
the patch itself can be reviewed again. You may want to try some
additional tests as you imagine after finding my test case gets
quicker.

Thanks,

--
Hitoshi Harada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hitoshi Harada 2011-07-02 08:02:07 Re: Parameterized aggregate subquery (was: Pull up aggregate subquery)
Previous Message Kohei KaiGai 2011-07-02 06:35:48 Re: SECURITY LABEL on shared database object