Re: Allowing printf("%m") only where it actually works

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Allowing printf("%m") only where it actually works
Date: 2018-09-26 17:46:45
Message-ID: 20180926174645.nsyj77lx2mvtz4kx@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2018-09-24 13:18:47 -0400, Tom Lane wrote:
> 0002 changes things so that we always use our snprintf, removing all the
> configure logic associated with that.

In the commit message you wrote:

> Preliminary performance testing suggests that as it stands, snprintf.c is
> faster than the native printf functions for some tasks on some platforms,
> and slower for other cases. A pending patch will improve that, though
> cases with floating-point conversions will doubtless remain slower unless
> we want to put a *lot* of effort into that. Still, we've not observed
> that *printf is really a performance bottleneck for most workloads, so
> I doubt this matters much.

I severely doubt the last sentence. I've *many* times seen *printf be a
significant bottleneck. In particular just about any pg_dump of a
database that has large tables with even just a single float column is
commonly bottlenecked on float -> string conversion.

A trivial bad benchmark:

CREATE TABLE somefloats(id serial, data1 float8, data2 float8, data3 float8);
INSERT INTO somefloats(data1, data2, data3) SELECT random(), random(), random() FROM generate_series(1, 10000000);
VACUUM FREEZE somefloats;

postgres[12850][1]=# \dt+ somefloats
                        List of relations
┌────────┬────────────┬───────┬────────┬────────┬─────────────┐
│ Schema │    Name    │ Type  │ Owner  │  Size  │ Description │
├────────┼────────────┼───────┼────────┼────────┼─────────────┤
│ public │ somefloats │ table │ andres │ 575 MB │             │
└────────┴────────────┴───────┴────────┴────────┴─────────────┘

96bf88d52711ad3a0a4cc2d1d9cb0e2acab85e63:

COPY somefloats TO '/dev/null';
COPY 10000000
Time: 24575.770 ms (00:24.576)

96bf88d52711ad3a0a4cc2d1d9cb0e2acab85e63^:

COPY somefloats TO '/dev/null';
COPY 10000000
Time: 12877.037 ms (00:12.877)

In other words, we regress COPY performance by about 2x (24.6 s vs.
12.9 s). And one int and three floats isn't a particularly insane table
layout.
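
To look at the float -> string cost in isolation, outside of COPY, something
like the following rough sketch could be used. It is not code from the patch:
the helper names format_doubles() and time_format_doubles() are made up for
illustration, and "%.15g" is only assumed to be representative of the default
float8 output precision. Building it once against the platform's snprintf and
once against src/port/snprintf.c should show how much of the COPY difference
comes from the conversion itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: format n doubles with snprintf("%.15g") and return
 * the total number of characters produced, so that the compiler cannot
 * optimize the conversions away.
 */
static size_t
format_doubles(const double *vals, size_t n)
{
    char        buf[64];
    size_t      total = 0;

    for (size_t i = 0; i < n; i++)
    {
        int         len = snprintf(buf, sizeof(buf), "%.15g", vals[i]);

        /* 15 significant digits plus sign and exponent always fit in buf */
        assert(len > 0 && (size_t) len < sizeof(buf));
        total += (size_t) len;
    }
    return total;
}

/*
 * Hypothetical helper: return the CPU seconds spent in format_doubles().
 * If total_out is not NULL, the character count is stored there.
 */
static double
time_format_doubles(const double *vals, size_t n, size_t *total_out)
{
    clock_t     start = clock();
    size_t      total = format_doubles(vals, n);
    clock_t     end = clock();

    if (total_out != NULL)
        *total_out = total;
    return (double) (end - start) / CLOCKS_PER_SEC;
}
```

The table above has 10 million rows with three float8 columns each, so
calling time_format_doubles() on an array of 30 million values in [0, 1)
would roughly match the number of conversions the COPY has to do.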

I'm not saying we shouldn't default to our printf - in fact I think we're
probably past due to use a faster float->string conversion than we
portably get from the OS - but I don't think we can default to our
sprintf without doing something about the float conversion performance.

Greetings,

Andres Freund
