Re: WIP: Faster Expression Processing v4

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Faster Expression Processing v4
Date: 2017-03-24 18:27:42
Message-ID: 20170324182742.3r4qndpugtx6ga3t@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2017-03-24 11:26:27 -0400, Tom Lane wrote:
> Another modest proposal:
>
> I'm not really sold on the approach of using EEOP_FETCHSOME opcodes to
> trigger initial tupleslot de-forming. Certainly we want to have a single
> slot_getsomeattrs call per source slot, but as-is, we need a separate
> traversal over the expression tree just to precompute the max attribute
> number needed. That can't be good for expression compile speed, and
> it introduces nontrivial bug risks because the code that does that
> is completely decoupled from the code that emits the EEOP_VAR opcodes
> (which are what's really relying on the de-forming to have happened).

Hm. We had the separate traversal for projections for a long while, and
I don't think there've been a a lot of changes to the extraction of the
last attribute number. I'm very doubtful that the cost of traversing
the expression twice is meaningful in comparison to the other costs.

> What do you think about a design like this:
>
> * Drop the FETCHSOME opcodes.
>
> * Add fields to struct ExprState that will hold the maximum inner,
> outer, and scan attribute numbers needed.
>
> * ExecInitExpr initializes those fields to zero, and then during
> ExecInitExprRec, whenever we generate an EEOP_VAR opcode, we do e.g.
>
> state->last_inner_attno = Max(state->last_inner_attno,
> variable->varattno);
>
> * ExecInitExprSlots, get_last_attnums_walker, etc all go away.
>
> * In the startup segment of ExecInterpExpr, add
>
> if (state->last_inner_attno > 0)
> slot_getsomeattrs(innerslot, state->last_inner_attno);
> if (state->last_outer_attno > 0)
> slot_getsomeattrs(outerslot, state->last_outer_attno);
> if (state->last_scan_attno > 0)
> slot_getsomeattrs(scanslot, state->last_scan_attno);
>
> This would be a little different from the existing code as far as runtime
> branch-prediction behavior goes, but it's not apparent to me that it'd be
> any worse.

I'd be suprised if it weren't.

I'm not super strongly against this setup, but I fail to see the benefit
of whacking this around. I've benchmarked the previous/current setup
fairly extensively, and I'd rather not redo that. In contrast to the
other changes you've talked about, this definitely is in the hot-path...

> Also, for JIT purposes it'd still be entirely possible to compile the
> slot_getsomeattrs calls in-line; you'd just be looking to a different
> part of the ExprState struct to find out what to do.

Yea, we could do that.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Banck 2017-03-24 18:32:25 Re: Create replication slot in pg_basebackup if requested and not yet present
Previous Message Thomas Munro 2017-03-24 18:27:40 Re: pg_serial early wraparound