Re: Using 128-bit integers for sum, avg and statistics aggregates

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-16 10:04:13
Message-ID: CAApHDvqtQxah3SAs6OeajYXYyMMkzDO5h6z6yDUM7=fekEw7aQ@mail.gmail.com
Lists: pgsql-hackers

On 14 November 2014 at 13:57, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
>
> On 11/13/2014 03:38 AM, Alvaro Herrera wrote:
>
>> configure is a generated file. If your patch touches it but not
>> configure.in, there is a problem.
>>
>
> Thanks for pointing it out, I have now fixed it.
>
Hi Andreas,

These are some very promising performance increases.

I've done a quick pass of reading the patch. I currently don't have a
system with a 128-bit int type, but I'm working on that.

Just a couple of things that could do with being fixed:

This fragment needs to be fixed to put the braces on new lines:
if (state) {
numstate.N = state->N;
int16_to_numericvar(state->sumX, &numstate.sumX);
int16_to_numericvar(state->sumX2, &numstate.sumX2);
} else {
numstate.N = 0;
}
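
i.e. the same fragment with the braces moved onto their own lines, per
the usual project style:

if (state)
{
	numstate.N = state->N;
	int16_to_numericvar(state->sumX, &numstate.sumX);
	int16_to_numericvar(state->sumX2, &numstate.sumX2);
}
else
{
	numstate.N = 0;
}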

It also looks like your OIDs have been nabbed by some jsonb stuff:

DETAIL: Key (oid)=(3267) is duplicated.

I'm also wondering why in numeric_int16_sum() you're doing:

#else
return numeric_sum(fcinfo);
#endif

but you're not doing return int8_accum() in the #else part of
int8_avg_accum(). The same goes for int8_accum_inv() and
int8_avg_accum_inv(), though perhaps you're doing it here because of the
elog() showing the wrong function name. Although that's a pretty much
"shouldn't ever happen" case that mightn't be worth worrying about.

Also, since I don't currently have a machine with a working int128, I
decided to benchmark master vs patched to see if there was any
performance regression due to numeric_int16_sum() calling numeric_sum().
I'm a bit confused by the results, though, as there seems to be quite a
good increase in performance with the patch; I'd have expected no change.

CREATE TABLE t (value bigint not null);
insert into t select a.a from generate_series(1,5000000) a(a);
vacuum;

int128_bench.sql has select sum(value) from t;

Master:
D:\Postgres\installb\bin>pgbench.exe -f d:\int128_bench.sql -n -T 120
postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 92
latency average: 1304.348 ms
tps = 0.762531 (including connections establishing)
tps = 0.762642 (excluding connections establishing)

Patched:
D:\Postgres\install\bin>pgbench.exe -f d:\int128_bench.sql -n -T 120
postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 99
latency average: 1212.121 ms
tps = 0.818067 (including connections establishing)
tps = 0.818199 (excluding connections establishing)

postgresql.conf is the same in both instances.
I've yet to discover why this is any faster.

Regards

David Rowley
