Re: WIP: Faster Expression Processing v4

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Faster Expression Processing v4
Date: 2017-03-25 16:22:15
Message-ID: 5768.1490458935@sss.pgh.pa.us
Lists: pgsql-hackers

More random musing ... have you considered making the jump-target fields
in expressions be relative rather than absolute indexes? That is,
EEO_JUMP would look like

        op += (stepno); \
        EEO_DISPATCH(); \

instead of

        op = &state->steps[stepno]; \
        EEO_DISPATCH(); \

I have not carried out a full patch to make this work, but just making
that one change and examining the generated assembly code looks promising.
Instead of this

        movslq  40(%r14), %r8
        salq    $6, %r8
        addq    24(%rbx), %r8
        movq    %r8, %r14
        jmp     *(%r8)

we get this

        movslq  40(%r14), %rax
        salq    $6, %rax
        addq    %rax, %r14
        jmp     *(%r14)

which certainly looks like it ought to be faster. Also, the real reason
I got interested in this at all is that with relative jumps, groups of
steps would be position-independent within the steps array, which would
enable some compile-time tricks that seem impractical with the current
definition.
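
To illustrate the position-independence point, here is a minimal sketch of a toy step interpreter whose jump field is an offset from the current step. Everything in it (the Step type, the opcodes, emit_group, demo_relative_jumps) is hypothetical and far simpler than the real executor code; it only shows that a group of steps with relative jumps can be copied to any position in the array without patching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature step type; NOT PostgreSQL's ExprEvalStep. */
typedef enum { OP_ADD, OP_JUMP_IF_NEG, OP_DONE } OpCode;

typedef struct Step
{
    OpCode opcode;
    int    arg;     /* OP_ADD: addend; OP_JUMP_IF_NEG: offset in steps */
} Step;

/* Run steps starting at "op" until OP_DONE; returns the accumulator. */
static int
run_relative(const Step *op, int acc)
{
    for (;;)
    {
        switch (op->opcode)
        {
            case OP_ADD:
                acc += op->arg;
                op++;
                break;
            case OP_JUMP_IF_NEG:
                /* relative jump: target is measured from the current step */
                if (acc < 0)
                    op += op->arg;
                else
                    op++;
                break;
            case OP_DONE:
                return acc;
        }
    }
}

/* A reusable group: add 100 unless the accumulator is negative, then add 1. */
static const Step group[] = {
    {OP_JUMP_IF_NEG, 2},    /* skip the next step */
    {OP_ADD, 100},
    {OP_ADD, 1},
};
#define GROUP_LEN (sizeof(group) / sizeof(group[0]))

/* Copy the group, byte for byte, to index "at"; returns the next free index. */
static size_t
emit_group(Step *steps, size_t at, size_t capacity)
{
    assert(at + GROUP_LEN <= capacity);
    memcpy(&steps[at], group, sizeof(group));
    return at + GROUP_LEN;
}

/* Build a program containing the same group at two positions, and run it. */
int
demo_relative_jumps(int start)
{
    Step   steps[8];
    size_t n = 0;

    steps[n++] = (Step){OP_ADD, 0};
    n = emit_group(steps, n, 8);    /* group at index 1 */
    n = emit_group(steps, n, 8);    /* identical bytes at index 4 */
    steps[n++] = (Step){OP_DONE, 0};
    return run_relative(steps, start);
}
```

With absolute indexes, the second copy of the group would still point at the first copy's target unless its jump field were rewritten when it was emitted.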

BTW, now that I've spent a bit of time looking at the generated assembly
code, I'm kind of disinclined to believe any arguments about how we have
better control over branch prediction with the jump-threading
implementation. At least with current gcc (6.3.1 on Fedora 25) at -O2,
what I see is multiple places jumping to the same indirect jump
instruction :-(. It's not a total disaster: as best I can tell, all the
uses of EEO_JUMP remain distinct. But gcc has chosen to implement about
40 of the 71 uses of EEO_NEXT by jumping to the same couple of
instructions that increment the "op" register and then do an indirect
jump :-(.
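
For readers unfamiliar with the pattern under discussion, the following is a minimal sketch of jump-threaded dispatch using the GCC/Clang "labels as values" extension. It is an assumption-laden toy: the opcodes, run_threaded and demo_threaded are hypothetical, and it looks the handler up in a table indexed by opcode rather than storing the handler address in the step as the code under review does. The source places a separate indirect goto at the end of each handler; whether the compiler keeps them separate in the generated code is exactly the point raised above.

```c
#include <assert.h>

/* Hypothetical opcodes, for illustration only. */
enum { OP_INC, OP_DOUBLE, OP_DONE, OP_COUNT };

/* Requires GCC or Clang: uses &&label and "goto *expr". */
int
run_threaded(const int *op, int acc)
{
    static const void *const dispatch[OP_COUNT] = {
        &&do_inc, &&do_double, &&do_done
    };

/* Each expansion of DISPATCH() is its own indirect jump in the source. */
#define DISPATCH() \
    do { \
        assert(*op >= 0 && *op < OP_COUNT); \
        goto *dispatch[*op]; \
    } while (0)
#define NEXT() \
    do { \
        op++; \
        DISPATCH(); \
    } while (0)

    DISPATCH();

do_inc:
    acc += 1;
    NEXT();

do_double:
    acc *= 2;
    NEXT();

do_done:
    return acc;

#undef NEXT
#undef DISPATCH
}

/* Run a fixed program: increment, double, increment. */
int
demo_threaded(int start)
{
    static const int program[] = {OP_INC, OP_DOUBLE, OP_INC, OP_DONE};

    return run_threaded(program, start);
}
```

Compiling something like this with -O2 -S and counting the indirect jmp instructions is one way to check how many dispatch sites survive optimization with a given compiler version.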

So it seems that we're at the mercy of gcc's whims as to which instruction
dispatches will be distinguishable to the hardware; which casts a very
dark shadow over any benchmarking-based arguments that X is better than Y
for branch prediction purposes. Compiler version differences are likely
to matter a lot more than anything we do.

regards, tom lane
