Re: [PATCH] Negative Transition Aggregate Functions (WIP)

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Date: 2014-04-05 07:38:55
Message-ID: CAEZATCUMuofd+H1JL3Ao9ey6++ERz1kTTh4UATxkFnG-otrhSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4 April 2014 11:56, Florian Pflug <fgp(at)phlo(dot)org> wrote:
>
>> On 04.04.2014, at 09:40, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>>
>> I'm not sure how much additional work is required to sort this out,
>> but to me it looks more realistic to target 9.5 than 9.4, so at this
>> point I tend to think that the patch ought to be marked as returned
>> with feedback.
>

Just doing the first optimisation I recommended (which I
pessimistically referred to as a "micro-optimisation") actually gives
up to a 4x performance boost relative to the current patch for the
queries above:

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 10 FOLLOWING)
FROM generate_series(1,20000) g(n);
-- 20ms (was 33ms)

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 100 FOLLOWING)
FROM generate_series(1,20000) g(n);
-- 54ms (was 173ms)

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 1000 FOLLOWING)
FROM generate_series(1,20000) g(n);
-- 368ms (was 1467ms)

SELECT SUM(n::int) OVER (ROWS BETWEEN CURRENT ROW AND 10000 FOLLOWING)
FROM generate_series(1,20000) g(n);
-- 1866ms (was 7709ms)

See the attached patch, which applies on top of yours but is actually
independent of it.

IMO, this doesn't go far enough though. We should be aiming to
eliminate the O(n^2) behaviour completely and have all the above
queries take roughly the same amount of time.

> I think the patch is worthwhile, even without this additional optimization. In fact, If the optimization was part of the patch, there would probably be calls to factor it out, on the ground that the patch is already rather large.
>
> I don't see what bumping the whole thing to 9.5 buys, compared do applying what we have now, and optimizing in 9.5 further.
>

The problem with the current state of the patch is that we're going to
a lot of effort to add this new inverse aggregate function,
documenting it's benefits and revealing via EXPLAIN how it can reduce
by several orders of magnitude the number of transition function
calls, but then not giving a commensurate performance boost unless the
window is UNBOUNDED FOLLOWING. That's going to be awkward to explain
to users, and so releasing it in this state feels a little half-baked
to me.

Regards,
Dean

Attachment Content-Type Size
window_gettupleslot.patch text/x-diff 1.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2014-04-05 07:55:01 Re: [PATCH] Negative Transition Aggregate Functions (WIP)
Previous Message Amit Kapila 2014-04-05 06:52:07 Re: [bug fix] pg_ctl always uses the same event source