Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates

From: Arthur Silva <arthurprs(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 13:05:11
Message-ID: CAO_YK0X2FYVwfJ7Cv+xwk2CTH36dnyxu2VHbEEEuBg0UnxCwGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Oct 25, 2014 at 12:38 PM, Andreas Karlsson <andreas(at)proxel(dot)se>
wrote:

> Hi,
>
> There was recently talk about if we should start using 128-bit integers
> (where available) to speed up the aggregate functions over integers which
> uses numeric for their internal state. So I hacked together a patch for
> this to see what the performance gain would be.
>
> Previous thread: http://www.postgresql.org/message-id/20141017182500.
> GF2075(at)alap3(dot)anarazel(dot)de
>
> What the patch does is switching from using numerics in the aggregate
> state to int128 and then convert the type from the 128-bit integer in the
> final function.
>
> The functions where we can make use of int128 states are:
>
> - sum(int8)
> - avg(int8)
> - var_*(int2)
> - var_*(int4)
> - stdev_*(int2)
> - stdev_*(int4)
>
> The initial benchmark results look very promising. When summing 10 million
> int8 I get a speedup of ~2.5x and similarly for var_samp() on 10 million
> int4 I see a speed up of ~3.7x. To me this indicates that it is worth the
> extra code. What do you say? Is this worth implementing?
>
> The current patch still requires work. I have not written the detection of
> int128 support yet, and the patch needs code cleanup (for example: I used
> an int16_ prefix on the added functions, suggestions for better names are
> welcome). I also need to decide on what estimate to use for the size of
> that state.
>
> The patch should work and pass make check on platforms where __int128_t is
> supported.
>
> The simple benchmarks:
>
> CREATE TABLE test_int8 AS SELECT x::int8 FROM generate_series(1, 10000000)
> x;
>
> Before:
>
> # SELECT sum(x) FROM test_int8;
> sum
> ----------------
> 50000005000000
> (1 row)
>
> Time: 2521.217 ms
>
> After:
>
> # SELECT sum(x) FROM test_int8;
> sum
> ----------------
> 50000005000000
> (1 row)
>
> Time: 1022.811 ms
>
> CREATE TABLE test_int4 AS SELECT x::int4 FROM generate_series(1, 10000000)
> x;
>
> Before:
>
> # SELECT var_samp(x) FROM test_int4;
> var_samp
> --------------------
> 8333334166666.6667
> (1 row)
>
> Time: 3808.546 ms
>
> After:
>
> # SELECT var_samp(x) FROM test_int4;
> var_samp
> --------------------
> 8333334166666.6667
> (1 row)
>
> Time: 1033.243 ms
>
> Andreas
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
These are some nice improvements.

As far as I'm aware int128 types are supported on every major compiler when
compiling for 64bit platforms. Right?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2014-10-28 13:05:20 Re: alter user/role CURRENT_USER
Previous Message Simon Riggs 2014-10-28 12:47:22 Re: New Event Trigger: table_rewrite