From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|

To: | emre(at)hasegeli(dot)com |

Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, alvherre(at)alvh(dot)no-ip(dot)org, robertmhaas(at)gmail(dot)com, a(dot)alekseev(at)postgrespro(dot)ru, pgsql-hackers(at)postgresql(dot)org |

Subject: | Re: [HACKERS] [PATCH] Improve geometric types |

Date: | 2018-01-12 08:16:56 |

Message-ID: | 20180112.171656.119920237.horiguchi.kyotaro@lab.ntt.co.jp |

Views: | Raw Message | Whole Thread | Download mbox |

Thread: | |

Lists: | pgsql-hackers |

Hello, sorry for the absense.

(Sorry for the long but should be hard-to-read article..)

At Thu, 4 Jan 2018 14:53:47 +0100, Emre Hasegeli <emre(at)hasegeli(dot)com> wrote in <CAE2gYzz0hRfC-M8KDV4Aytj+6k6cvoiGqVvEjVyHTbtraf=YBQ(at)mail(dot)gmail(dot)com>

> Rebased versions are attached.

Thank you for the new patch.

I'm looking 0001 patch. This does many things at once but seems

hard to split into step-by-step patches. So I reviewed this from

the viewpoint that how each of the external function is

changed. (Attatchment 1).

I think that this patch is not intending to change any behaviors

of the external functions, but intending make it mathematically

reasonable. So some behavioral changes are ineviablly

requried. The problem here is what is the most reasonable (and

acceptable) behavior.

The comments below are just the starting of a discussion about

some aspects of design. I'm sorry that this discussion might be

changing the way to go largily, but I'd like to hear what you

think about two topics.

I found seemingly inconsistent handling of NaN.

- Old box_same assumed NaN != NaN, but new assumes NaN ==

NaN. I'm not sure how the difference is significant out of the

context of sorting, though...

- box_overlap()'s behavior has not been changed. As the result

box_same and box_overlap now behaves differently concerning

NaN handling.

- line_construct_pts has been changed so that generates

{NaN,-1,NaN} for two identical points (old code generated

{-1,0,0}) but I think the right behavior here is erroring out.

(as line_in behaves.)

I feel that it'd better to define how to handle non-numbers

before refactoring. I prefer to just denying NaN as a part of

the geometric types, or any input containing NaN just yields a

result filled with NaNs.

The next point is reasonability of the algorithm and consistency

among related functions.

- line_parallel considers the EPSILON(ugaa) with different scale

from the old code, but both don't seem reasonable.. It might

not be the time to touch it, but if we changed the behavior,

I'd like to choose reasonable one. At least the tolerance of

parallelity should be taken by rotation, not by slope.

I think that one reasonable way to do this is taking the distance

between the far ends of two direction (unit) vectors.

Assuming Ax + By + C = 0, the direction vecotr dv(x, y) for the

line is:

n = sqrt(A^2 + B^2), dv = (B / n, -A / n).

(and the normal vector nv = (A / n, B / n))

The vector needs to be restricted within upper or any two

successive quadrants so that it is usable for this objective. So

let's restrict A to be negative value for now. Something like the

follwoing would be an answer.

void line_direction_vector(Point *result, LINE *l)

{

float8 A = l->A;

float8 B = l->B;

float8 n;

n = HYPOT(A, B);

/* keep the result vector within the 1st-2nd quadrants */

if (A > 0)

{

A = -A;

B = -B;

}

point_construct(result, B / n, -A / n);

}

bool line_parallel(LINE *l1, LINE *l2)

{

Point d1, d2;

line_direction_vector(&d1, l1);

line_direction_vector(&d2, l2);

return FPzero(point_dt(&d1, &d2));

}

Also perpendicularity is detected by comparing the direction

vector of one line and the normal vector of another in the same

manner.

void line_normal_vector(Point *result, LINE *l)

{

float8 A = l->A;

float8 B = l->B;

float8 n;

/* keep the result vector within the 1st-2nd quadrants */

n = HYPOT(A, B);

if (B < 0)

{

A = -A;

B = -B;

}

point_construct(result, A / n, B / n);

}

bool line_perp(LINE *l1, LINE *l2)

{

Point d1, d2;

line_direction_vector(&d1, l1);

line_normal_vector(&d2, l2);

return FPzero(point_dt(&d1, &d2));

}

In relation to this, equality of two lines is also nuisance.

From the definitions, two equal lines are parallel. In

contraposition, two nonparallel lines cannot be equal. This

relationship is represented by the following expression:

is_equal =: line_parallel(l1, l2) && FPzero(line_distance(l1, l2))

But the line_distance returns zero in most cases even if it is

considered "parallel" with considering tolerance. This means that

There's almost no two lines that are parallel but not equal (that

is, the truely parallel lines)... sigh.

So we might should calculate the distance in different (not

purely mathematical/geometrical) way. For example, if we can

assume the geometric objects are placed mainly around the origin,

we could take the distance between the points nearest to origin

on both lines. (In other words, between the foots of

perpendicular lines from the origin onto the both lines). Of

couse this is not ideal but...

The same discussion also affects line_interpt_line or other

similar functions.

At last, just a simple comment.

- point_eq_point() assumes that NaN == NaN. This is an inherited

behavior from old float[48]_cmp_internal() but it's not a

common treat. point_eq_point() needs any comment about the

special definition as float[48]_cmp_internal has.

regards,

--

Kyotaro Horiguchi

NTT Open Source Software Center

- Re: [HACKERS] [PATCH] Improve geometric types at 2018-01-04 13:53:47 from Emre Hasegeli

- Re: [HACKERS] [PATCH] Improve geometric types at 2018-01-14 12:20:57 from Emre Hasegeli

From | Date | Subject | |
---|---|---|---|

Next Message | Marina Polyakova | 2018-01-12 08:37:12 | Re: master make check fails on Solaris 10 |

Previous Message | Fabien COELHO | 2018-01-12 07:53:33 | Re: bytea bitwise logical operations implementation (xor / and / or / not) |