Re: issues with range types, btree_gist and constraints

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Subject: Re: issues with range types, btree_gist and constraints
Date: 2013-02-05 23:09:36
Message-ID: 11942.1360105776@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tomas Vondra <tv(at)fuzzy(dot)cz> writes:
> I've managed to further simplify the test-case, and I've verified that
> it's reproducible on current 9.2 and 9.3 branches.

I'm not sure why you're getting unstable behavior --- it's pretty stable
for me, at least in a debug build. What I'm finding is that gistsplit.c
appears to contain multiple bugs. If you just insert the first 271 rows
of sample-1.csv, the 271'st row results in a gist page split. The split
results in two pages having these root-page downlink entries:

Item 2 -- Length: 88 Offset: 8000 (0x1f40) Flags: NORMAL
Block Id: 2 linp Index: 65535 Size: 88
Has Nulls: 0 Has Varwidths: 1

1f40: 00000200 ffff5840 93900000 00373637 (dot)(dot)(dot)(dot)(dot)(dot)X(at)(dot)(dot)(dot)(dot)(dot)767
1f50: 61626536 31313961 30313834 34366263 abe6119a018446bc
1f60: 64373632 37666638 61346632 64900000 d7627ff8a4f2d...
1f70: 00626465 61613234 38323966 31626365 .bdeaa24829f1bce
1f80: 36343561 39313463 38386665 61613739 645a914c88feaa79
1f90: 340d440f 00001800 4.D.....

(ie, range 767abe6119a018446bcd7627ff8a4f2d..bdeaa24829f1bce645a914c88feaa794)

Item 3 -- Length: 72 Offset: 7928 (0x1ef8) Flags: NORMAL
Block Id: 3 linp Index: 65535 Size: 72
Has Nulls: 0 Has Varwidths: 1

1ef8: 00000300 ffff4840 73900000 00626538 (dot)(dot)(dot)(dot)(dot)(dot)H(at)s(dot)(dot)(dot)(dot)be8
1f08: 30333261 33656138 31326565 62323838 032a3ea812eeb288
1f18: 64663361 65636161 30666361 34500000 df3aecaa0fca4P..
1f28: 0053696d 706c6554 65737453 7472696e .SimpleTestStrin
1f38: 670d440f 00001800 g.D.....

(ie, range be8032a3ea812eeb288df3aecaa0fca4..SimpleTestString)

which looks fine .. but the leaf entry for SimpleTestString went into
the first of these two pages, which is 100% wrong.

The split produced by contrib/btree_gist is perfectly fine, and it
assigns SimpleTestString to the right-hand page as expected. The
bug is being introduced in lines 435..480 of gistsplit.c, which
is code that doesn't fire unless we're working on a non-last
column of the index (which is why you couldn't reproduce this in
a 1-column index). It seems that

(1) gistfindgroup decides that SimpleTestString is "equiv" to something.
It's not too clear what; for sure there is no other leaf key of the same
value. The code in gistfindgroup looks a bit like it ought to think
that all the items are "equiv" to one of the union datums or the other,
but that's not the case --- no other item gets marked. (After further
investigation it seems that gbt_var_penalty is returning a negative
penalty at the critical spot here, which might have something to
do with it --- the business with the tmp[4] array seems many degrees
away from trustworthy.)

(2) cleanupOffsets removes the item for SimpleTestString from
sv->spl_right[], evidently because it's "equiv".

(3) placeOne() sticks the entry into the spl_left[] array, which is
an absolutely horrid decision: it means the left-side page will now
need a range spanning the right-side page's range, when a moment
ago they had been disjoint.

(4) gistunionsubkey(), which apparently is supposed to fix the union
datums to reflect this rearrangement, does no such thing because it
only processes the index columns to the right of the one where the
damage has been done. (It's not clear to me that it could ever
be allowed to skip any columns, but that's what it's doing.)

If I don't hear a commitment from Teodor to fix this, I'm going to
rip out pretty much all of this logic as being under-documented
and overly-buggy.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2013-02-05 23:38:42 Re: split rm_name and rm_desc out of rmgr.c
Previous Message Tomas Vondra 2013-02-05 22:31:21 Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system