Re: Optimize common expressions in projection evaluation

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peifeng Qiu <pgsql(at)qiupf(dot)dev>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Optimize common expressions in projection evaluation
Date: 2022-12-05 04:36:44
Message-ID: CAFj8pRD=+Ou5-bzkySG=2tJHC+izy0JQBSS5TtrSn6ubm6y7ig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

po 5. 12. 2022 v 5:28 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:

> Peifeng Qiu <pgsql(at)qiupf(dot)dev> writes:
> >> the need for this code seems not that great. But as to the code itself
> I'm unable to properly judge.
>
> > A simplified version of my use case is like this:
> > CREATE FOREIGN TABLE ft(rawdata json);
> > INSERT INTO tbl SELECT (convert_func(rawdata)).* FROM ft;
>
> It might be worth noting that the code as we got it from Berkeley
> could do this scenario without multiple evaluations of convert_func().
> Memory is foggy, but I believe it involved essentially a two-level
> targetlist. Unfortunately, the scheme was impossibly baroque and
> buggy, so we eventually ripped it out altogether in favor of the
> multiple-evaluation behavior you see today. I think that commit
> 62e29fe2e might have been what ripped it out, but I'm not quite
> sure. It's about the right time-frame, anyway.
>
> I mention this because trying to reverse-engineer this situation
> in execExpr seems seriously ugly and inefficient, even assuming
> you can make it non-buggy. The right solution has to involve never
> expanding foo().* into duplicate function calls in the first place,
> which is the way it used to be. Maybe if you dug around in those
> twenty-year-old changes you could get some inspiration.
>
> I tend to agree with David that LATERAL offers a good-enough
> solution in most cases ... but it is annoying that we accept
> this syntax and then pessimize it.
>

I agree, so there is a perfect solution like don't use .*, but on second
hand any supported syntax should be optimized well or we should raise some
warning about negative performance impact.

Today there are a lot of baroque technologies in the stack so it is hard to
remember all good practices and it is hard for newbies to take this
knowledge. We should reduce possible performance traps when it is possible.

Regards

Pavel

>
> regards, tom lane
>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2022-12-05 04:39:37 Re: Bug in row_number() optimization
Previous Message houzj.fnst@fujitsu.com 2022-12-05 04:29:30 RE: Perform streaming logical transactions by background workers and parallel apply