Re: WIP: Make timestamptz_out less slow.

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Make timestamptz_out less slow.
Date: 2015-09-03 04:28:40
Message-ID: CAKJS1f9rv8PxGyGMrO4vAK63suURX+oeV7-5VS7KV+iDe6G6JQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3 September 2015 at 05:10, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2015-08-09 12:47:53 +1200, David Rowley wrote:
> > I took a bit of weekend time to finish this one off. Patch attached.
> >
> > A quick test shows a pretty good performance increase:
> >
> > create table ts (ts timestamp not null);
> > insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
> > 00:00:00', '1 sec'::interval);
> > vacuum freeze ts;
> >
> > Master:
> > david=# copy ts to 'l:/ts.sql';
> > COPY 31536001
> > Time: 20444.468 ms
> >
> > Patched:
> > david=# copy ts to 'l:/ts.sql';
> > COPY 31536001
> > Time: 10947.097 ms
>
> Yes, that's pretty cool.
>
> > There's probably a bit more to squeeze out of this.
> > 1. EncodeDateTime() could return the length of the string to allow
> callers
> > to use pnstrdup() instead of pstrdup(), which would save the strlen()
> call.
> > 2. Have pg_int2str_zeropad() and pg_int2str() skip appending the NUL char
> > and leave this up to the calling function.
> > 3. Make something to replace the sprintf(str, " %.*s", MAXTZLEN, tzn);
> call.
> >
> > Perhaps 1 and 3 are worth looking into, but I think 2 is likely too error
> > prone for the small gain we'd get from it.
>

I see the logic there, but a few weekends ago I modified the patch to
implement number 2. This is perhaps one optimization that would be hard to
change later since, if we suddenly change pg_int2str() to not append the
NUL, then it might break any 3rd party code that's started using them. The
more I think about it the more I think allowing those to skip appending the
NUL is ok. They're very useful functions for incrementally building strings.

Attached is a patch to implement this, which performs as follows
postgres=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 10665.297 ms

However, this version of the patch does change many existing functions in
datetime.c so that they no longer append the NUL char... The good news is
that these are all static functions, so nothing else should be using them.

I'm happy to leave 1 and 3 for another rainy weekend one day.

> > Also I was not too sure on if pg_int2str() was too similar to pg_ltoa().
> > Perhaps pg_ltoa() should just be modified to return the end of string?
>
> I don't think the benefits are worth breaking pg_ltoa interface.
>
>
Ok let's leave pg_ltoa() alone

> > /*
> > - * Append sections and fractional seconds (if any) at *cp.
> > + * Append seconds and fractional seconds (if any) at *cp.
> > * precision is the max number of fraction digits, fillzeros says to
> > * pad to two integral-seconds digits.
> > * Note that any sign is stripped from the input seconds values.
> > + * Note 'precision' must not be a negative number.
> > */
> > -static void
> > +static char *
> > AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool
> fillzeros)
> > {
> > +#ifdef HAVE_INT64_TIMESTAMP
>
> Wouldn't it be better to just normalize fsec to an integer in that case?
> Afaics that's the only remaining reason for the alternative path?
>
> A special case would need to exist somewhere, unless we just modified
timestamp2tm() to multiply fsec by 1 million when HAVE_INT64_TIMESTAMP is
not defined, but that changes the function signature.

> > +/*
> > + * pg_int2str
> > + * Converts 'value' into a decimal string representation of
> the number.
> > + *
> > + * Caller must ensure that 'str' points to enough memory to hold the
> result
> > + * (at least 12 bytes, counting a leading sign and trailing NUL).
> > + * Return value is a pointer to the new NUL terminated end of string.
> > + */
> > +char *
> > +pg_int2str(char *str, int32 value)
> > +{
> > + char *start;
> > + char *end;
> > +
> > + /*
> > + * Handle negative numbers in a special way. We can't just append
> a '-'
> > + * prefix and reverse the sign as on two's complement machines
> negative
> > + * numbers can be 1 further from 0 than positive numbers, we do it
> this way
> > + * so we properly handle the smallest possible value.
> > + */
> > + if (value < 0)
> > + {
>
> We could alternatively handle this by special-casing INT_MIN, probably
> resulting in a bit less duplicated code.
>

Those special cases seem to do some weird stuff to the performance
characteristics of those functions. I find my method to handle negative
numbers to produce much faster code.

I experimented with finding the fastest way to convert an int to a string
at the weekend, I did happen to be testing with int64's but likely int32
will be the same.

This is the time taken to convert 30 into "30" 2 billion times.

The closest implementation I'm using in the patch is pg_int64tostr() one.
The pg_int64tostr_memcpy() only performs better with bigger numbers /
longer strings.

david(at)ubuntu:~/C$ gcc5.2 tostring.c -o tostring -O2
david(at)ubuntu:~/C$ ./tostring
pg_int64tostr_memcpy in 13.653252 seconds
pg_int64tostr in 2.042616 seconds
pg_int64tostr_new in 2.056688 seconds
pg_lltoa in 13.604653 seconds

david(at)ubuntu:~/C$ clang tostring.c -o tostring_clang -O2
david(at)ubuntu:~/C$ ./tostring_clang
pg_int64tostr_memcpy in 0.000004 seconds
pg_int64tostr in 2.063335 seconds
pg_int64tostr_new in 2.102068 seconds
pg_lltoa in 3.424630 seconds

clang obviously saw right through my pg_int64tostr_memcpy() and optimized
something I'd not intended it to.
Notice how badly pg_lltoa() performs in comparison, especially with gcc.

I'd actually be inclined to consider changing pg_lltoa() to remove the
special case and implement it the same as I've proposed we do in
pg_int2str(). That's not for this patch though, and would likely need
tested with some other compilers.

>
> > /*
> > * Per-opclass comparison functions for new btrees. These are
> > diff --git a/src/tools/msvc/build.pl b/src/tools/msvc/build.pl
> > index e107d41..1e2dd62 100644
> > --- a/src/tools/msvc/build.pl
> > +++ b/src/tools/msvc/build.pl
> > @@ -61,7 +61,7 @@ elsif ($buildwhat)
> > }
> > else
> > {
> > - system("msbuild pgsql.sln /verbosity:detailed
> /p:Configuration=$bconf");
> > + system("msbuild pgsql.sln /verbosity:quiet
> /p:Configuration=$bconf");
> > }
>
> Uh? I assume that's an acciental change?
>
>
Oops, yes.

Attachment Content-Type Size
tostring.c text/x-csrc 4.7 KB
timestamp_out_speedup_2015-09-03_f79f03b.patch application/octet-stream 22.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2015-09-03 04:43:02 Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore
Previous Message Fujii Masao 2015-09-03 03:45:34 Re: GIN pending clean up is not interruptable