From: | Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> |
---|---|

To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |

Cc: | pgsql-hackers(at)postgresql(dot)org |

Subject: | Re: [PATCH]: fix bug in SP-GiST box_ops |

Date: | 2017-01-30 18:12:31 |

Message-ID: | 4450e7a6-01e7-0fb2-a01e-98fb5405d217@postgrespro.ru |

Views: | Raw Message | Whole Thread | Download mbox | Resend email |

Thread: | |

Lists: | pgsql-hackers |

On 30.01.2017 12:04, Kyotaro HORIGUCHI wrote:

> Hello,

>

> At Mon, 30 Jan 2017 07:12:03 +0300, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> wrote

> in <9ea5b157-478c-8874-bc9b-050076b7d042(at)postgrespro(dot)ru>:

>> Working on the tests for new SP-GiST opclasses for polygons and

>> circles, I've found a bug in the SP-GiST box_ops (added in 9.6): some operators

>> (&<, &>, $<|, |&>) have wrong tests in spg_box_quad_inner_consistent().

>> This obviously leads to incorrect results of a SP-GiST index scan (see tests

>> in the attached patch, their results were taken from a sequential scan).

>

> Your problem is not necessarily evident for others. Perhaps you

> have to provide an explanation and/or a test case that describes

> what is wrong for you so that others can get a clue for this

> problem. Simpler test is better.

The problem is that there are mixed low/high values for x/y axes. For example,

overLeft4D() compares x-RangeBox rect_box->range_box_x with y-Range &query->right,

and also lower2D() here must use query->high instead of query->low.

The corresponding test is very simple: insert 10000 nearly arbitrary boxes and

see the difference between results of sequential scan and results of index scan.

regression=# CREATE TEMPORARY TABLE quad_box_tbl (b box);

regression=# INSERT INTO quad_box_tbl

SELECT box(point(x * 10, y * 10), point(x * 10 + 5, y * 10 + 5))

FROM generate_series(1, 100) x,

generate_series(1, 100) y;

regression=# CREATE INDEX quad_box_tbl_idx ON quad_box_tbl USING spgist(b);

regression=# SET enable_seqscan = ON;

regression=# SET enable_indexscan = OFF;

regression=# SET enable_bitmapscan = OFF;

regression=# SELECT count(*) FROM quad_box_tbl WHERE b &< box '((100,200),(300,500))';

count

-------

2900

(1 row)

regression=# SELECT count(*) FROM quad_box_tbl WHERE b &> box '((100,200),(300,500))';

count

-------

9100

(1 row)

regression=# SELECT count(*) FROM quad_box_tbl WHERE b &<| box '((100,200),(300,500))';

count

-------

4900

(1 row)

regression=# SELECT count(*) FROM quad_box_tbl WHERE b |&> box '((100,200),(300,500))';

count

-------

8100

(1 row)

regression=# SET enable_seqscan = OFF;

regression=# SET enable_indexscan = ON;

regression=# SET enable_bitmapscan = ON;

regression=# SELECT count(*) FROM quad_box_tbl WHERE b &< box '((100,200),(300,500))';

count

-------

2430

(1 row)

regression=# SELECT count(*) FROM quad_box_tbl WHERE b &> box '((100,200),(300,500))';

count

-------

6208

(1 row)

regression=# SELECT count(*) FROM quad_box_tbl WHERE b &<| box '((100,200),(300,500))';

count

-------

1048

(1 row)

regression=# SELECT count(*) FROM quad_box_tbl WHERE b |&> box '((100,200),(300,500))';

count

-------

5084

(1 row)

> The test,

>

> | +INSERT INTO quad_box_tbl

> | + SELECT i, box(point(x, y), point(x + w, y + h))

> | + FROM (SELECT i,

> | + random() * 1000 as x, random() * 1000 as y,

> | + random() * 20 as w, random() * 20 as h

>

> is inserting quad_boxes generated using random numbers then,

>

> | +SELECT count(*) FROM quad_box_tbl WHERE b << box '((100,200),(300,500))';

> | + count

> | +-------

> | + 891

>

> counting them in this way is unstable. Even though this were

> stable due to a fixed initial, this would be unacceptable, I

> think. This kind of test should use nonrandom numbers.

I have replaced random data in this test with stable uniformly distributed data.

I would also like to use existing box data from rect.data that is loaded to

slow_emp4000 table in copy.sql test, but box.sql test is executed before copy.sql.

> Even though I don't understand this in depth, the following

> change seems somewhat wrong in direction. Changing the second

> argument type seems breaking the basis of the design.

>

> | -lower2D(RangeBox *range_box, Range *query)

> | +lower2D(RangeBox *range_box, double query)

Maybe. I also thought that these changes are quite doubtful, so I decided to

replace lowerEqual2D()/higherEqual2D() with overlower2D()/overhigher2D() and

not to touch lower2D()/higher2D().

--

Nikita Glukhov

Postgres Professional:http://www.postgrespro.com

The Russian Postgres Company

Attachment | Content-Type | Size |
---|---|---|

fix-bug-in-SP-GiST-box_ops-v02.patch | text/x-patch | 7.0 KB |

- Re: [PATCH]: fix bug in SP-GiST box_ops at 2017-01-30 09:04:11 from Kyotaro HORIGUCHI

- Re: [PATCH]: fix bug in SP-GiST box_ops at 2017-01-31 10:04:56 from Kyotaro HORIGUCHI

From | Date | Subject | |
---|---|---|---|

Next Message | Tom Lane | 2017-01-30 18:17:42 | Re: Improvements in psql hooks for variables |

Previous Message | Alvaro Herrera | 2017-01-30 18:10:58 | Re: pgsql: test_pg_dump TAP test whitespace cleanup |