Re: WIP: Make timestamptz_out less slow.

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Make timestamptz_out less slow.
Date: 2015-11-05 00:20:53
Message-ID: CAM3SWZRCHQ7xNZe3oehahSDGKVtgTn+vD9xE4MkLMmasPihaCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 11, 2015 at 9:00 AM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> Attached is also timestamp_delta.patch which changes pg_ltostr() to use the
> INT_MIN special case that pg_ltoa also uses. I didn't make a similar change
> to pg_ltostr_zeropad() as I'd need to add even more special code to prepend
> the correct number of zero before the 2147483648. This zero padding reason
> was why I originally came up with the alternative way to handle the negative
> numbers. I had just thought that it was best to keep both functions as
> similar as possible.

I've started looking at this -- "timestamp_out_speedup_2015-09-12.patch".

> I've not done anything yet to remove the #ifdef HAVE_INT64_TIMESTAMP from
> AppendSeconds(). The only way I can think to handle this is to just make
> fsec_t unconditionally an int64 (since with float datetimes the precision is
> 10 decimal digits after the dec point, this does not fit in int32). I'd go
> off and do this, but this means I need to write an int64 version of
> pg_ltostr(). Should I do it this way?

Have you thought about *just* having an int64 pg_ltostr()? Are you
aware of an appreciable overhead from having int32 callers just rely
on a promotion to int64?

In general, years are 1900-based in Postgres:

struct pg_tm
{
...
int tm_mday;
int tm_mon; /* origin 0, not 1 */
int tm_year; /* relative to 1900 */
...
};

While it's not your patch's fault, it is not noted by EncodeDateTime()
and EncodeDateOnly() and others that there pg_tm structs are not
1900-based. This is in contrast to multiple functions within
formatting.c, nabstime.c, and timestamp.c (some of which are clients
or clients of clients for this new code). I think that the rule has
been undermined so much that it doesn't make sense to indicate
exceptions directly, though. I think pg_tm should have comments saying
that there are many cases where tm_year is not relative to 1900.

I have a few minor nitpicks about the patch.

No need for "negative number" comment here -- just use
"Assert(precision >= 0)" code:

+ * Note 'precision' must not be a negative number.
+ * Note callers should assume cp will not be NUL terminated.
*/
-static void
+static char *
AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
{
+#ifdef HAVE_INT64_TIMESTAMP
+

I would just use a period/full stop here:

+ * Note str is not NUL terminated, callers are responsible for NUL terminating
+ * str themselves.

Don't think you need so many "Note:" specifiers here:

+ * Note: Callers should ensure that 'padding' is above zero.
+ * Note: This function is optimized for the case where the number is not too
+ * big to fit inside of the specified padding.
+ * Note: Caller must ensure that 'str' points to enough memory to hold the
+ result (at least 12 bytes, counting a leading sign and trailing NUL,
+ or padding + 1 bytes, whichever is larger).
+ */
+char *
+pg_ltostr_zeropad(char *str, int32 value, int32 padding)

Not so sure about this, within the new pg_ltostr_zeropad() function:

+ /*
+ * Handle negative numbers in a special way. We can't just append a '-'
+ * prefix and reverse the sign as on two's complement machines negative
+ * numbers can be 1 further from 0 than positive numbers, we do it this way
+ * so we properly handle the smallest possible value.
+ */
+ if (num < 0)
+ {
...
+ }
+ else
+ {
...
+ }

Maybe it's better to just special case INT_MIN and the do an Abs()?
Actually, would any likely caller of pg_ltostr_zeropad() actually care
if INT_MIN was a case that you cannot handle, and assert against? You
could perhaps reasonably make it the caller's problem. Haven't made up
my mind if your timestamp_delta.patch is better than that; cannot
immediately see a problem with putting it on the caller.

More generally, maybe routines like EncodeDateTime() ought now to use
the Abs() macro.

Those are all of my nitpicks. :-)

> I also did some tests on server grade hardware, and the performance increase
> is looking a bit better.
>
> create table ts (ts timestamp not null);
> insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
> 00:00:00', '1 sec'::interval);
> vacuum freeze ts;

On my laptop, I saw a 2.4X - 2.6X speedup for this case. That seems
pretty nice to me.

Will make another pass over this soon.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2015-11-05 01:02:16 Re: Foreign join pushdown vs EvalPlanQual
Previous Message Joshua D. Drake 2015-11-05 00:18:44 Re: Request: pg_cancel_backend variant that handles 'idle in transaction' sessions