Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
Date: 2011-03-10 22:08:41
Message-ID: 24322.1299794921@sss.pgh.pa.us
Lists: pgsql-hackers

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> On Thu, Mar 10, 2011 at 10:34:00AM -0500, Tom Lane wrote:
>> I suspect this is probably not a good idea because of the added cost in
>> select_common_collation: aside from probably needing more syscache
>> lookups, there's a potential for worse-than-linear cost behavior if we
>> have to repeatedly dig through a deep expression tree to find out
>> collations.

> Two things can make a difference here:

> - If you knew which operators/functions cared about the collation, the
> cost could be manageable. We don't, so...

Yeah, the possibility of skipping select_common_collation altogether for
most operators is pretty attractive. Maybe we'll get to that before
we're done, but I don't want to assume it'll be done for 9.1.

> - ISTM that in theory any algorithm that is defined by recursion at
> each node should be calculable via a single pass of the tree by
> something like parse_expr. That's essentially what the variables are
> doing in the Expr nodes, though whether you need one or two is
> of course another question.

We could do that if we were willing to go back and fill in the collation
fields after the whole expression tree is built. If you want to fill in
at the time the FuncExpr/OpExpr is first built, then you will get O(N^2)
behavior from repeated calculations in a deep tree if you don't cache
the results for the lower levels. Which is what the output-collation
fields would do for us.
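
To make the cost difference concrete, here is a minimal sketch in Python. It is not PostgreSQL's parser code: the Node class, its leaf_collation and out_collation fields, and the "first non-null argument wins" rule are hypothetical stand-ins for FuncExpr/OpExpr and select_common_collation. It builds a chain of nested one-argument calls and counts how many nodes get examined when each new node's collation is derived at build time, with and without a cached result on the child.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical expression node (stand-in for FuncExpr/OpExpr)."""
    args: List["Node"] = field(default_factory=list)
    leaf_collation: Optional[str] = None  # only meaningful on leaves
    out_collation: Optional[str] = None   # cached result collation


def derive_uncached(node: Node) -> Tuple[Optional[str], int]:
    """Return (collation, nodes_visited), re-walking the whole subtree.

    The combination rule (first non-null argument wins) is a simplification
    assumed for illustration only.
    """
    if not node.args:
        return node.leaf_collation, 1
    collation, visited = None, 1
    for arg in node.args:
        arg_collation, arg_visited = derive_uncached(arg)
        visited += arg_visited
        if collation is None:
            collation = arg_collation
    return collation, visited


def build_chain(depth: int, use_cache: bool) -> Tuple[Node, int]:
    """Build f(f(...f(x)...)) bottom-up, deriving each collation at build time."""
    node = Node(leaf_collation="de_DE", out_collation="de_DE")
    total_visits = 0
    for _ in range(depth):
        if use_cache:
            # Read the child's stored output collation: one node examined.
            collation, visited = node.out_collation, 1
        else:
            # Dig through everything below the new node again.
            collation, visited = derive_uncached(node)
        total_visits += visited
        node = Node(args=[node], out_collation=collation)
    return node, total_visits


if __name__ == "__main__":
    for use_cache in (False, True):
        _, visits = build_chain(100, use_cache)
        print("cached" if use_cache else "uncached", visits)
```

In this model the uncached build of a depth-N chain examines 1 + 2 + ... + N nodes, while the cached build examines N.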

A post-pass is not out of the question, but it's enough unlike
everything else the parser does that I'm not too thrilled about it.
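
For comparison, a post-pass over an already-built tree could look roughly like the following Python sketch. Again this is an illustration under assumptions, not the parser's actual design: Expr, assign_collations, and the "first non-null argument wins" rule are hypothetical names and simplifications. Each node is visited exactly once, in post-order, so the cost is linear in the size of the tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Expr:
    """Hypothetical expression node built without any collation filled in."""
    args: List["Expr"] = field(default_factory=list)
    leaf_collation: Optional[str] = None  # only meaningful on leaves
    out_collation: Optional[str] = None   # filled in by the post-pass


def assign_collations(root: Expr) -> int:
    """Fill out_collation on every node in one post-order pass.

    Returns the number of nodes processed. An explicit stack is used so that
    deep trees do not hit Python's recursion limit.
    """
    visits = 0
    stack = [(root, False)]
    while stack:
        node, children_done = stack.pop()
        if not children_done:
            # Revisit this node after all of its arguments are processed.
            stack.append((node, True))
            for arg in node.args:
                stack.append((arg, False))
            continue
        visits += 1
        if not node.args:
            node.out_collation = node.leaf_collation
        else:
            # Simplified rule (an assumption): first non-null argument wins.
            node.out_collation = next(
                (a.out_collation for a in node.args
                 if a.out_collation is not None),
                None,
            )
    return visits


if __name__ == "__main__":
    tree = Expr(leaf_collation="de_DE")
    for _ in range(100):
        tree = Expr(args=[tree])
    print(assign_collations(tree), tree.out_collation)
```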

Also, there's the issue that started the whole discussion, which is that
sometimes we *do* need to know, post-parse-analysis, what the result
collation of an expression tree is. See CREATE VIEW. If that's the
*only* thing that ever needed to know it, I wouldn't mind accepting a
double calculation of the collation for CREATE VIEW ... but somehow it
doesn't seem real likely that no other uses for the information will
emerge, and some of them might be more performance-critical.

regards, tom lane
