Re: Pull up aggregate subquery

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Pull up aggregate subquery
Date: 2011-05-24 03:08:40
Message-ID: BANLkTinsKqGLt3h56DumkVJxDFyBOoOQzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 23, 2011 at 10:47 PM, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> wrote:
> That's true. But if the planning cost is an only issue, why not adding
> new GUC for user to choose if they prefer it or not? Of course if we
> have some method to predict which way to go before proving both ways,
> it's great. Do you have some blue picture on it?

I don't really like the idea of adding a GUC for this, unless we
convince ourselves that nothing else is sensible. I mean, that leads
to conversations like this:

Newbie: My query is slow.
Hacker: Turn on enable_magic_pixie_dust and it'll get better.
Newbie: Oh, yeah, that is better. Why isn't this turned on by default, anyway?
Hacker: Well, on pathological queries, it makes planning take way too
long, so we think it's not really safe to enable it by default.
Newbie: Wait... I thought you just told me to enable it. It's not safe?
Hacker: Well, sort of. I mean, uh... hey, look, an elephant!

I think that if you're interested in working on this, the first step
would be to get it to work at all, even if it's a bit inefficient.
Even though in theory you could get exponential planning time growth
if you have an aggregate inside of an aggregate inside of an aggregate
inside of an aggregate, very few people are going to do that. If it
turns out to be a problem, we can dream up some suitable defense; as I
think about it a little more, I don't think that would be terribly
difficult to work out. The hard part is probably to get this to work
in the first place.

> In addition, I wonder if the "generalized nestloop" pattern can do
> whole outer scan before proving inner? I mean, in my example if the
> outer scan qualified more than one tuple I'm afraid that inner Agg and
> its underlying SeqScan are re-scanned more than one. That will bloat
> performance if the Agg is expensive. It would be great if we can outer
> scan can do scan to the end and stores param variables in something
> like array and prove once inner Agg, so that we can ensure the Agg is
> computed once. To do that, do we need more work on executor?

That sounds to me like way too much complexity for the first
go-around. We should either push the agg down or not; it sounds like
you're proposing some kind of half-way state. That might not be
useful in practice, and it will certainly be a lot more work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-05-24 03:10:13 Re: [BUGS] BUG #6034: pg_upgrade fails when it should not.
Previous Message Robert Haas 2011-05-24 02:58:54 Re: Pull up aggregate subquery