Re: [PATCH] Improve geometric types

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: emre(at)hasegeli(dot)com
Cc: a(dot)alekseev(at)postgrespro(dot)ru, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Improve geometric types
Date: 2017-09-14 07:33:48
Message-ID: 20170914.163348.59273173.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Tue, 12 Sep 2017 19:30:44 +0200, Emre Hasegeli <emre(at)hasegeli(dot)com> wrote in <CAE2gYzxDRvNdhrWQa7ym423uvHLWJXkqx=BoJePYMdrKPR1Zhg(at)mail(dot)gmail(dot)com>
> > Hello, sorry to late for the party, but may I comment on this?
>
> Thank you for picking this up again.
>
> > The first patch reconstructs the operators in layers. These
> > functions are called very frequently when used. Some function are
> > already inlined in float.h but some static functions in float.h
> > also can be and are better be inlined. Some of *_internal,
> > point_construct, line_calculate_point and so on are the
> > candidates.
>
> They are static functions. I though compiler can decide to inline
> them. Do you think adding "inline" to the function signatures are
> necessary?

C++ surely make just static functions inlined but I'm not sure C
compiler does that. "static" is a scope modifier and "inline" is
not (what kind of modifier is it?). In regard to gcc,

https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gcc/Inline.html#Inline

> By declaring a function inline, you can direct GCC to make
> calls to that function faster. <snip> You can also direct GCC
> to try to integrate all “simple enough” functions into their
> callers with the option -finline-functions.

I didn't find that postgresql uses the options so I think we
shouldn't expect that they are inlined automatically.

> > You removed some DirectFunctionCall to the functions within the
> > same file but other functions remain in the style,
> > ex. poly_center or on_sl. The function called from the former
> > seems large enough but the latter function calls a so small
> > function that it could be inlined. Would you like to make some
> > additional functions use C call (instead of DirectFunctionCall)
> > and inlining them?
>
> I tried to minimise my changes to make reviewing easier. I can make
> "_internal" functions for the remaining DirectFunctionCall()s, if you
> find it necessary.

Ok, it would be another patch even if we need that.

> > This is not a fault of this patch, but some functions like on_pb
> > seems missing comment to describe what it is. Would you like to
> > add some?
>
> I will add some on the next version.
>
> > In the second patch, the additional include fmgrprotos.h in
> > btree_gin.c seems needless.
>
> It must be something I missed on rebase. I will remove it.
>
> > Some float[48] features were macros
> > so that they share the same expressions between float4 and
> > float8. They still seems sharing perfectly the same expressions
> > in float.h. Is there any reason for converting them into typed
> > inline functions?
>
> Kevin Grittner suggested using inline functions instead of macros.
> They are easier to use compared to macros, and avoid double-evaluation
> hazards.

I recall a bit about the double-evaluation hazards. I think the
functions needs a comment describing the reasons so that anyone
kind won't try to merge them into a macro again.

> > In float.h, MAXDOUBLEWIDTH is redueced from 500 to 128, but the
> > exponent of double is up to 308 so it doesn't seem sufficient. On
> > the other hand we won't use non-scientific notation for extremely
> > large numbers and it requires (perhaps) up to 26 bytes in the
> > case. In the soruce code, most of them uses "%e" and one of them
> > uses '%g". %e always takes the format of
> > "-1.(17digits)e+308".. So it would be less than 26
> > characters.
> >
> > =# set extra_float_digits to 3;
> > =# select -1.221423424320453e308::float8;
> > ?column?
> > ---------------------------
> > -1.22142342432045302e+308
> >
> > man printf: (linux)
> >> Style e is used if the exponent from its conversion is less than
> >> -4 or greater than or equal to the precision.
> >
> > So we should be safe to have a buffer with 26 byte length and 500
> > bytes will apparently too large and even 128 will be too loose in
> > most cases. So how about something like the following?
> >
> > #define MINDOUBLEWIDTH 32
>
> Should it be same for float4 and float8?

Strictly they are different each other but float8 needs 26 bytes
(additional 1 byte is the sign for expnent, which I forgot.) and
float4 needs 15 bytes so it could be reduced to 32 and 16
respectively.

| select -1.11111111111111111111111111e-38::float4;
| -1.11111113e-38
| select -1.11111111111111111111111111e-4::float4;
| -0.000111111112
| select -1.11111111111111111111111111e-5::float4;
| -1.11111112e-05

On the other hand float4 is anyway converted into double in
float4out and I'm not sure that the extension doesn't adds
spurious digits. Therefore I think it would be reasonable that
"%g" formatter requires up to 27 bytes (26 + terminating zero) so
it should use MINDOUBLEWIDTH regardless of the precision of the
value given to the formatter.

Addition to that if we are deliberately using %g in float4out, we
can use flot8out_internal instead of most of the stuff of
float4out.

| Datum
| float4out(PG_FUNCTION_ARGS)
| {
| float8 num = PG_GETARG_FLOAT4(0);
|
| PG_RETURN_CSTRING(float8out_internal(num));
| }

> > ...
> > float4out(at)float(dot)c<modified>:
> >> int ndig = FLT_DIG + extra_float_digits;
> >>
> >> if (ndig < 1)
> >> ndig = 1;
> >>
> >> len = snprintf(ascii, MINDOUBLEWIDTH + 1, "%+.*g", ndig, num);
> >> if (len > MINDOUBLEWIDTH + 1)
> >> {
> >> ascii = (char *) repalloc(ascii, len);
> >> if (snprintf(ascii, len, "%+.*e", ndig, num) > len)
> >> error(ERROR, "something wrong happens...");
> >> }
> >
> > I don't think the if part can be used so there would be no
> > performance degradation, I believe.
>
> Wouldn't this change the output of the float datatypes? That would be
> a backwards incompatible change.

It just reduces the unused part after terminator \0, and we won't
get incompatible result.

> > I'd like to pause here.
>
> I will submit new versions after your are done with your review.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2017-09-14 07:53:49 Re: Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b
Previous Message Tsunakawa, Takayuki 2017-09-14 07:23:14 Re: [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur