Re: Using 128-bit integers for sum, avg and statistics aggregates

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-26 12:57:33
Message-ID: CAApHDvp9ZUC5fUur-_5nJ7VPUYe_G+pcUTb73pvw0k0rH6xgzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 24 December 2014 at 16:04, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:

On 12/16/2014 11:04 AM, David Rowley wrote:> These are some very promising
> performance increases.
>
>>
>> I've done a quick pass of reading the patch. I currently don't have a
>> system with a 128bit int type, but I'm working on that.
>>
>
> Sorry for taking some time to get back. I have been busy before Christmas.
> A new version of the patch is attached.
>
>
Ok I've had another look at this, and I think the only things that I have
to say have been mentioned already:

1. Do we need to keep the 128 byte aggregate state size for machines
without 128 bit ints? This has been reduced to 48 bytes in the patch, which
is in favour code being compiled with a compiler which has 128 bit ints. I
kind of think that we need to keep the 128 byte estimates for compilers
that don't support int128, but I'd like to hear any counter arguments.

2. References to int16 meaning 16 bytes. I'm really in two minds about
this, it's quite nice to keep the natural flow, int4, int8, int16, but I
can't help think that this will confuse someone one day. I think it'll be a
long time before it confused anyone if we called it int128 instead, but I'm
not that excited about seeing it renamed either. I'd like to hear what
others have to say... Is there a chance that some sql standard in the
distant future will have HUGEINT and we might regret not getting the
internal names nailed down?

I also checked the performance of some windowing function calls, since
these call the final function for each row, I thought I'd better make sure
there was no regression as the final function must perform a conversion
from int128 to numeric for each row.

It seems there's still an increase in performance:

Setup:
create table bi_win (i bigint primary key);
insert into bi_win select x.x from generate_series(1,10000) x(x);
vacuum analyze;

Query:
select sum(i) over () from bi_win;

** Master
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 6567
latency average: 9.137 ms
tps = 109.445841 (including connections establishing)
tps = 109.456941 (excluding connections establishing)

** Patched
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 7841
latency average: 7.652 ms
tps = 130.670253 (including connections establishing)
tps = 130.675743 (excluding connections establishing)

Setup:
create table i_win (i int primary key);
insert into i_win select x.x from generate_series(1,10000) x(x);
vacuum analyze;

Query:
select stddev(i) over () from i_win;

** Master
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 5084
latency average: 11.802 ms
tps = 84.730362 (including connections establishing)
tps = 84.735693 (excluding connections establishing)

** Patched
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 7557
latency average: 7.940 ms
tps = 125.934787 (including connections establishing)
tps = 125.943176 (excluding connections establishing)

Regards

David Rowley

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2014-12-26 15:23:37 Re: BUG #12330: ACID is broken for unique constraints
Previous Message Magnus Hagander 2014-12-26 11:21:48 Re: Additional role attributes && superuser review