Re: Arrays of domains

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Arrays of domains
Date: 2017-09-28 21:44:43
Message-ID: 5970.1506635083@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com> writes:
> On 09/28/2017 01:11 PM, Tom Lane wrote:
>>> I wonder if we need to do any benchmarking to assure ourselves that the
>>> changes to ArrayCoerceExpr don't have a significant performance impact?

>> That would likely be a good idea, though I'm not very sure what or
>> how to benchmark.

> Some case where we only expect the current code to produce a single
> ArrayCoerceExpr, I guess. say doing text[] -> int[] ?

I spent some time looking into this. I settled on int4[] -> int8[]
as the appropriate case to test, because int48() is about as cheap
a cast function as we have. Q1 is the base case without a cast:

select count(x) from
(select array[i,i,i,i,i,i,i,i,i,i] as x
from generate_series(1,10000000) i) ss;

Q2 is same with a cast added:

select count(x::int8[]) from
(select array[i,i,i,i,i,i,i,i,i,i] as x
from generate_series(1,10000000) i) ss;

Q3 and Q4 are the same thing, but testing 100-element instead of
10-element arrays:

select count(x) from
(select array[
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i
] as x
from generate_series(1,10000000) i) ss;

select count(x::int8[]) from
(select array[
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i,
i,i,i,i,i,i,i,i,i,i
] as x
from generate_series(1,10000000) i) ss;

I get these query timings in a non-cassert build:

HEAD Patch

Q1 5453.235 ms 5440.876 ms
Q2 9340.670 ms 10191.194 ms
Q3 19078.457 ms 18967.279 ms
Q4 48196.338 ms 58547.531 ms

(Timings are reproducible to a few percent.)

So that's a bit disappointing; the per-element overhead is clearly
noticeably more than before. However, poking into it with "perf"
gives some grounds for optimism; Q4's hotspots with the patch are

Children Self Samples Command Shared Object Symbol
+ 33.44% 33.35% 81314 postmaster postgres [.] ExecInterpExpr
+ 21.88% 21.83% 53223 postmaster postgres [.] array_map
+ 15.19% 15.15% 36944 postmaster postgres [.] CopyArrayEls
+ 14.63% 14.60% 35585 postmaster postgres [.] ArrayCastAndSet
+ 6.07% 6.06% 14765 postmaster postgres [.] construct_md_array
+ 1.80% 1.79% 4370 postmaster postgres [.] palloc0
+ 0.77% 0.77% 1883 postmaster postgres [.] AllocSetAlloc
+ 0.75% 0.74% 1815 postmaster postgres [.] int48
+ 0.52% 0.52% 1276 postmaster postgres [.] advance_aggregates

Surely we could get the amount of time spent in ExecInterpExpr down.

One idea is to make a dedicated evalfunc for the case where the
expression is just EEOP_CASE_TESTVAL + EEOP_FUNCEXPR[_STRICT],
similar to the existing fast-path routines (ExecJust*). I've not
actually tried to do that, but a reasonable guess is that it'd about
halve that overhead, putting this case back on a par with the HEAD code.
Also, I'd imagine that Andres' planned work on JIT-compiled expressions
would put this back on par with HEAD, if not better, for installations
using that.

Also I believe that Andres has plans to revamp the CaseTestExpr mechanism,
which might shave a few cycles as well. Right now there's an extra copy
of each array datum + isnull value, because array_map sticks those into
one pair of variables and then the EEOP_CASE_TESTVAL opcode just moves
them somewhere else. It's reasonable to hope that we could redesign that
so that array_map sticks the values straight into where they're needed,
and then we need only the FUNCEXPR opcode, which'd be a great candidate
for an ExecJust* fast-path.

Assuming that that's going to happen for v11, I'm inclined to leave the
optimization problem until the dust settles around CaseTestExpr.
Does anyone feel that this can't be committed before that's addressed?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2017-09-28 21:47:30 Re: [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
Previous Message Alvaro Herrera 2017-09-28 21:39:59 Re: [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple