Re: WIP: Faster Expression Processing v4

From: Douglas Doole <dougdoole(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Faster Expression Processing v4
Date: 2017-03-14 22:03:45
Message-ID: CADE5jYL7a8_b2LtwTJJk4ALLAKfCN1o9PWY3jS6D0e+3niOX8g@mail.gmail.com
Lists: pgsql-hackers

Andres, sorry I haven't had a chance to look at this great stuff you've
been doing. I've wanted to get to it, but work keeps getting in the way. ;-)

I do have one observation based on my experiments with your first version
of the code. In my tests, I found that expression init becomes a lot more
expensive in this new model. (That's neither a surprise, nor a concern.) In
particular, the function ExprEvalPushStep() is quite hot. In my code I made
the following changes:

* Declare ExprEvalPushStep() "inline".
* Remove the "if (es->steps_alloc == 0)" condition from ExprEvalPushStep().
* In ExecInitExpr(), add:
    state->steps_alloc = 16;
    state->steps = palloc(sizeof(ExprEvalStep) * state->steps_alloc);

I found that this cut the cost of initializing the expression by about 20%.
(Of course, that was on version 1 of your code, so the benefit may well be
different now.)
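
To make the three changes concrete, here is a minimal standalone sketch of the idea. It is not the patch's code: the struct layouts and the names expr_state_init() and expr_eval_push_step() are hypothetical stand-ins, and malloc()/realloc() take the place of palloc()/repalloc() so that it compiles outside the backend. The point is only that the array is allocated once up front, so the inlined push path needs a single "is it full?" test and no "is it allocated yet?" test.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real structs. */
typedef struct ExprEvalStep
{
    int         opcode;
} ExprEvalStep;

typedef struct ExprState
{
    ExprEvalStep *steps;        /* array of steps */
    int         steps_len;      /* number of steps in use */
    int         steps_alloc;    /* number of steps allocated */
} ExprState;

/* Done once at expression init: pre-allocate room for 16 steps. */
static void
expr_state_init(ExprState *state)
{
    state->steps_len = 0;
    state->steps_alloc = 16;
    state->steps = malloc(sizeof(ExprEvalStep) * state->steps_alloc);
    if (state->steps == NULL)
        abort();
}

/*
 * Hot path: declared inline, and with no "steps_alloc == 0" branch,
 * because expr_state_init() guarantees the array already exists.
 */
static inline void
expr_eval_push_step(ExprState *state, const ExprEvalStep *s)
{
    if (state->steps_len == state->steps_alloc)
    {
        /* Array is full: double it. */
        state->steps_alloc *= 2;
        state->steps = realloc(state->steps,
                               sizeof(ExprEvalStep) * state->steps_alloc);
        if (state->steps == NULL)
            abort();
    }

    assert(state->steps_len < state->steps_alloc);
    state->steps[state->steps_len++] = *s;
}
```

The initial size of 16 is the value I used above; expressions with more steps than that still work, they just pay for a realloc.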

On Tue, Mar 14, 2017 at 11:51 AM Andres Freund <andres(at)anarazel(dot)de> wrote:

> > Hmm. Could we make the instructions variable size? It would allow packing
> > the small instructions even more tight, and we wouldn't need to obsess over
> > a particular maximum size for more complicated instructions.
>
> That makes jumps a lot more complicated. I'd experimented with it and
> given it up as "not worth it".

Back when I was at IBM, I spent a lot of time doing stuff like this. If you
want to commit with the fixed size arrays, I'm happy to volunteer to look
at packing it tighter as a follow-on piece of work. (It was already on my
list of things to try anyhow.)

> If we were to try to do so, we'd also
> not like storing the pointer and enum variants both, since it'd again
> would reduce the density.
>

From my experience, it's worth the small loss in density to carry around
both the pointer and the enum - it makes debugging so much easier.
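
As a rough illustration of what I mean, here is a toy step struct that carries both. Everything in it is hypothetical (the Step layout, the StepOp values, the helper names), and it assumes dispatch through a function pointer purely to keep the example short; the patch may well dispatch differently. The interpreter loop only ever touches the pointer, while the enum is there so a debugger or a trace message can say which step it is looking at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy accumulator machine. */
typedef enum StepOp
{
    STEP_CONST,                 /* acc = arg */
    STEP_ADD                    /* acc = acc + arg */
} StepOp;

struct Step;
typedef int (*StepFn) (const struct Step *step, int acc);

typedef struct Step
{
    StepFn      fn;             /* what the interpreter actually calls */
    StepOp      op;             /* redundant with fn; kept for debugging */
    int         arg;
} Step;

static int
step_const(const Step *step, int acc)
{
    (void) acc;                 /* previous value is discarded */
    return step->arg;
}

static int
step_add(const Step *step, int acc)
{
    return acc + step->arg;
}

/* Debugging aid: readable name from the enum, not from the pointer. */
static const char *
step_op_name(StepOp op)
{
    switch (op)
    {
        case STEP_CONST:
            return "CONST";
        case STEP_ADD:
            return "ADD";
    }
    return "???";
}

/* Execution uses only the pointer; the enum is never consulted here. */
static int
run_steps(const Step *steps, int nsteps)
{
    int         acc = 0;

    for (int i = 0; i < nsteps; i++)
    {
        assert(steps[i].fn != NULL);
        acc = steps[i].fn(&steps[i], acc);
    }
    return acc;
}
```

Without the enum, all you see in a debugger is a raw address in each step; with it, you can print step_op_name(steps[i].op) and read the program directly.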

- Doug
Salesforce
