Re: SP-GiST bug.

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SP-GiST bug.
Date: 2014-05-29 14:52:09
Message-ID: 53874999.3010303@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> I don't think that checkAllTheSame is broken, but it probably would be
> after this patch --- ISTM you're breaking the corner case described in the
> header comment for checkAllTheSame, where we need to force splitting of a
> set of existing tuples that the opclass can't distinguish.
INHO, this patch will not break - because it forces (in corner case) to create a
node for new tuple but exclude it from list to insert. On next iteration we will
find a needed node with empty list.

Comment:
* If we know that the leaf tuples wouldn't all fit on one page, then we
* exclude the last tuple (which is the incoming new tuple that forced a split)
* from the check to see if more than one node is used. The reason for this
* is that if the existing tuples are put into only one chain, then even if
* we move them all to an empty page, there would still not be room for the
* new tuple, so we'd get into an infinite loop of picksplit attempts.

If we reserve a node for new tuple then on next iteration we will not fall into
picksplit again - we will have an empty list. So new tuple will fall into
another page.

BTW, look for following example:
create table xxx (p point);

insert into xxx select '(1,1)'::point from generate_series(1, 226) as v;
insert into xxx values ('(3,3)'::point);

create index xxxidx on xxx using spgist (p quad_point_ops);

easy to see that the single node is allTheSame although underlying leafs are
different. Due to allTheSame search logic scans are correct but inefficient. And
patch fixes it too.

> What actually is broken, I think, is the spgist text opclass, because it's
> using a zero node label for two different situations. The scenario we're
> looking at here is:

Agree for different situations, but disagree that's broken :)

> It seems like we have to
> distinguish the case of zero-length string from that of a dummy label
> for a pushed-down allTheSame tuple.

Actually, we do this: if allTheSame is true then nodes have a dummy labels.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2014-05-29 15:23:55 Re: json/jsonb inconsistence - 2
Previous Message Tom Lane 2014-05-29 14:05:03 Re: [PATCH] Replacement for OSSP-UUID for Linux and BSD