[WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-25 14:38:41
Message-ID: 544BB5F1.50709@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

There was recently talk about if we should start using 128-bit integers
(where available) to speed up the aggregate functions over integers
which uses numeric for their internal state. So I hacked together a
patch for this to see what the performance gain would be.

Previous thread:
http://www.postgresql.org/message-id/20141017182500.GF2075@alap3.anarazel.de

What the patch does is switching from using numerics in the aggregate
state to int128 and then convert the type from the 128-bit integer in
the final function.

The functions where we can make use of int128 states are:

- sum(int8)
- avg(int8)
- var_*(int2)
- var_*(int4)
- stdev_*(int2)
- stdev_*(int4)

The initial benchmark results look very promising. When summing 10
million int8 I get a speedup of ~2.5x and similarly for var_samp() on 10
million int4 I see a speed up of ~3.7x. To me this indicates that it is
worth the extra code. What do you say? Is this worth implementing?

The current patch still requires work. I have not written the detection
of int128 support yet, and the patch needs code cleanup (for example: I
used an int16_ prefix on the added functions, suggestions for better
names are welcome). I also need to decide on what estimate to use for
the size of that state.

The patch should work and pass make check on platforms where __int128_t
is supported.

The simple benchmarks:

CREATE TABLE test_int8 AS SELECT x::int8 FROM generate_series(1,
10000000) x;

Before:

# SELECT sum(x) FROM test_int8;
sum
----------------
50000005000000
(1 row)

Time: 2521.217 ms

After:

# SELECT sum(x) FROM test_int8;
sum
----------------
50000005000000
(1 row)

Time: 1022.811 ms

CREATE TABLE test_int4 AS SELECT x::int4 FROM generate_series(1,
10000000) x;

Before:

# SELECT var_samp(x) FROM test_int4;
var_samp
--------------------
8333334166666.6667
(1 row)

Time: 3808.546 ms

After:

# SELECT var_samp(x) FROM test_int4;
var_samp
--------------------
8333334166666.6667
(1 row)

Time: 1033.243 ms

Andreas

Attachment Content-Type Size
int128-agg-v1.patch text/x-patch 29.3 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-10-25 15:12:42 Re: INSERT ... ON CONFLICT {UPDATE | IGNORE}
Previous Message Ali Akbar 2014-10-25 13:58:01 Re: Function array_agg(array)