Re: GIST and TOAST

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Teodor Sigaev" <teodor(at)sigaev(dot)ru>
Cc: <andrew(at)supernews(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIST and TOAST
Date: 2007-03-06 17:48:34
Message-ID: 87k5xu9sb1.fsf@stark.xeocode.com
Lists: pgsql-hackers


"Teodor Sigaev" <teodor(at)sigaev(dot)ru> writes:

>> And it's not clear _int_gist.c has been running with toasted array values for
>> years because it's limited to arrays of 100 integers (or perhaps 200 integers,
>> there's a factor of 2 in the test). That's not enough to trigger toasting
>> unless there are other large columns in the same table.
>
> That was an intentional limitation to prevent indexing of huge arrays.
> The gist__int_ops compression method is oriented toward small arrays and
> isn't effective on big ones.

Right, so it's possible nobody has seen any toasted arrays with _int_gist.c,
since they never get very large. It looks like index_form_tuple will never
compress anything under 512 bytes, so I guess it's safe currently.
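
As a rough sketch of the size arithmetic behind that guess: the snippet below estimates the raw size of a one-dimensional int4 array and compares it with the 512-byte figure mentioned above. The overhead constants (a 16-byte fixed array header plus 8 bytes for one dimension and one lower bound, no null bitmap) and the names int4_array_size and exceeds_threshold are assumptions for illustration, not taken from the source. It also measures only the raw array, not whatever key the opclass compress method actually stores.

```c
#include <stddef.h>

/* Assumed sizes for illustration; not taken from the message. */
enum
{
    ARRAY_FIXED_HEADER = 16,   /* length word, ndim, dataoffset, elemtype */
    PER_DIM_OVERHEAD = 8,      /* one dimension plus one lower bound */
    INT4_SIZE = 4,
    COMPRESS_THRESHOLD = 512   /* the "512b" figure from the message */
};

/* Estimated size in bytes of a 1-D int4 array with no nulls. */
static size_t int4_array_size(size_t nelems)
{
    return ARRAY_FIXED_HEADER + PER_DIM_OVERHEAD + nelems * INT4_SIZE;
}

/* Nonzero if the estimated size is above the threshold. */
static int exceeds_threshold(size_t nelems)
{
    return int4_array_size(nelems) > COMPRESS_THRESHOLD;
}
```

Under these assumptions a 100-element array comes to 424 bytes and stays below the threshold, while a 200-element array comes to 824 bytes and does not, so the conclusion depends on which reading of the limit is correct.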

>> I do know that with packed varlenas I get a crash in g_int_union among other
>> places. I can't tell where the datum came from originally and how it ended up
>> stored in packed format.
> Can you provide your patch (in current state) and test suite? Or backtrace at least.

It doesn't actually crash; it just fails CHECKARRVALID. I added an assertion
in there to make it generate a core dump.

You can download the core dump and binary from

http://community.enterprisedb.com/varlena/core._int
http://community.enterprisedb.com/varlena/postgres._int

The last patch (without the assertion) is at:

http://community.enterprisedb.com/varlena/patch-varvarlena-14.patch.gz

What I'm seeing is this:

(gdb) f 3
#3  0xb7fd924b in inner_int_union (a=0x84e41f4, b=0xb64220d0)
    at _int_tool.c:81
81          CHECKARRVALID(b);

The array is actually garbage:

(gdb) p *b
$2 = {vl_len_ = 141, ndim = 0, dataoffset = 5888, elemtype = 0}

What's going on is that the va_1byte header is 141, which is 0x80 | 13. So the
datum is actually only 13 bytes long: a 1-byte header followed by a 12-byte
array:

(gdb) p *(varattrib*)b
$3 = {va_1byte = {va_header = 141 '\215', va_data = ""}, va_external = {
    va_header = 141 '\215', va_padding = "\000\000", va_rawsize = 0,
    va_extsize = 5888, va_valueid = 0, va_toastrelid = 0}, va_compressed = {
    va_header = 141, va_rawsize = 0, va_data = ""}, va_4byte = {
    va_header = 141, va_data = ""}}
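
To make the misreading concrete, here is a minimal sketch that decodes the header byte as described above (high bit as the 1-byte-header flag, low 7 bits as the total length) and then reads the array fields at both possible offsets. The byte values in raw_bytes are reconstructed from the gdb output on the assumption of a little-endian host; they are not copied from the core dump. All helper names (has_1byte_header, packed_total_len, read_le32, decode_fields, ArrayFields) are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative helpers only; none of these names come from the patch. */

/* In the encoding described above, the high bit flags a 1-byte header. */
static int has_1byte_header(uint8_t first_byte)
{
    return (first_byte & 0x80) != 0;
}

/* Low 7 bits: total datum length in bytes, header included. */
static unsigned packed_total_len(uint8_t first_byte)
{
    assert(has_1byte_header(first_byte));
    return first_byte & 0x7F;
}

/* Read a little-endian 32-bit value from a possibly unaligned address. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
           ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

typedef struct
{
    uint32_t ndim;
    uint32_t dataoffset;
    uint32_t elemtype;
} ArrayFields;

/* Decode the three array fields that follow a header of header_len bytes. */
static ArrayFields decode_fields(const uint8_t *buf, size_t header_len)
{
    ArrayFields f;

    f.ndim = read_le32(buf + header_len);
    f.dataoffset = read_le32(buf + header_len + 4);
    f.elemtype = read_le32(buf + header_len + 8);
    return f;
}

/* Bytes at b, reconstructed from the gdb output (assumes little-endian). */
static const uint8_t raw_bytes[16] = {
    0x8D,                    /* 1-byte header: 0x80 | 13 */
    0x00, 0x00, 0x00, 0x00,  /* ndim */
    0x00, 0x00, 0x00, 0x00,  /* dataoffset */
    0x17, 0x00, 0x00, 0x00,  /* elemtype */
    0x00, 0x00, 0x00         /* memory past the datum, zero in this dump */
};
```

Calling decode_fields(raw_bytes, 4) reproduces the garbage view from gdb (ndim 0, dataoffset 5888, elemtype 0), and the nonzero dataoffset is what trips the assertion. Calling decode_fields(raw_bytes, 1) gives ndim 0, dataoffset 0 and elemtype 23, the OID of int4, which would be consistent with a valid empty int4 array stored behind a 1-byte header.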

(gdb) bt
#0  0xb7e6a947 in raise () from /lib/tls/libc.so.6
#1  0xb7e6c0c9 in abort () from /lib/tls/libc.so.6
#2  0x082fec97 in ExceptionalCondition (
    conditionName=0xb7fdd3b9 "!(!((b)->dataoffset != 0))",
    errorType=0xb7fdd371 "FailedAssertion",
    fileName=0xb7fdd347 "_int_tool.c", lineNumber=81) at assert.c:51
#3  0xb7fd924b in inner_int_union (a=0x84e41f4, b=0xb64220d0)
    at _int_tool.c:81
#4  0xb7fd547d in g_int_picksplit (fcinfo=0xbf9e43f0) at _int_gist.c:403
#5  0x08304d9c in FunctionCall2 (flinfo=0xbf9e5a94, arg1=139342312,
    arg2=3214821160) at fmgr.c:1154
#6  0x08094fd3 in gistUserPicksplit (r=0xb6078d4c, entryvec=0x84e31e8,
    attno=0, v=0xbf9e4728, itup=0x84e2ddc, len=142, giststate=0xbf9e4b94)
    at gistsplit.c:306
#7  0x08095deb in gistSplitByKey (r=0xb6078d4c, page=0xb6420220 "",
    itup=0x84e2ddc, len=142, giststate=0xbf9e4b94, v=0xbf9e4728,
    entryvec=0x84e31e8, attno=0) at gistsplit.c:548
#8  0x080874bd in gistSplit (r=0xb6078d4c, page=0xb6420220 "",
    itup=0x84e2ddc, len=142, giststate=0xbf9e4b94) at gist.c:943
#9  0x080850fa in gistplacetopage (state=0xbf9e49e0, giststate=0xbf9e4b94)
    at gist.c:329
#10 0x080871eb in gistmakedeal (state=0xbf9e49e0, giststate=0xbf9e4b94)
    at gist.c:873
#11 0x08084f21 in gistdoinsert (r=0xb6078d4c, itup=0x84e2ce4, freespace=819,
    giststate=0xbf9e4b94) at gist.c:278
#12 0x08084cf5 in gistbuildCallback (index=0xb6078d4c, htup=0x84c8c30,
    values=0xbf9e4a98, isnull=0xbf9e4a78 "", tupleIsAlive=1 '\001',
    state=0xbf9e4b94) at gist.c:201
#13 0x080fc81f in IndexBuildHeapScan (heapRelation=0xb60d6860,
    indexRelation=0xb6078d4c, indexInfo=0x84cd620,
    callback=0x8084c27 <gistbuildCallback>, callback_state=0xbf9e4b94)
    at index.c:1548
#14 0x08084bdd in gistbuild (fcinfo=0xbf9e60e8) at gist.c:150
#15 0x08305630 in OidFunctionCall3 (functionId=782, arg1=3054332000,
    arg2=3053948236, arg3=139253280) at fmgr.c:1460
#16 0x080fc363 in index_build (heapRelation=0xb60d6860,
    indexRelation=0xb6078d4c, indexInfo=0x84cd620, isprimary=0 '\0')
    at index.c:1296
#17 0x080fb86a in index_create (heapRelationId=21361,
    indexRelationName=0x84a531c "text_idx", indexRelationId=21366,
    indexInfo=0x84cd620, accessMethodObjectId=783, tableSpaceId=0,
    classObjectId=0x84cd60c, coloptions=0x84cd6ac, reloptions=0,
    isprimary=0 '\0', isconstraint=0 '\0', allow_system_table_mods=0 '\0',
    skip_build=0 '\0', concurrent=0 '\0') at index.c:794
#18 0x0815f3e4 in DefineIndex (heapRelation=0x84a5354,
    indexRelationName=0x84a531c "text_idx", indexRelationId=0,
    accessMethodName=0x84a5380 "gist", tableSpaceName=0x0,
    attributeList=0x84a5448, predicate=0x0, rangetable=0x0, options=0x0,
    unique=0 '\0', primary=0 '\0', isconstraint=0 '\0',
    is_alter_table=0 '\0', check_rights=1 '\001', skip_build=0 '\0',
    quiet=0 '\0', concurrent=0 '\0') at indexcmds.c:452
#19 0x0825dcea in ProcessUtility (parsetree=0x84a5464, params=0x0,
    dest=0x84a54e0, completionTag=0xbf9e687e "") at utility.c:797
#20 0x0825bef9 in PortalRunUtility (portal=0x84ca2ec, utilityStmt=0x84a5464,
    dest=0x84a54e0, completionTag=0xbf9e687e "") at pquery.c:1176
#21 0x0825c04b in PortalRunMulti (portal=0x84ca2ec, dest=0x84a54e0,
    altdest=0x84a54e0, completionTag=0xbf9e687e "") at pquery.c:1263
#22 0x0825b786 in PortalRun (portal=0x84ca2ec, count=2147483647,
    dest=0x84a54e0, altdest=0x84a54e0, completionTag=0xbf9e687e "")
    at pquery.c:814
#23 0x0825604d in exec_simple_query (
    query_string=0x84a4fec "CREATE INDEX text_idx on test__int using gist ( a gist__int_ops );") at postgres.c:953
#24 0x08259c2f in PostgresMain (argc=4, argv=0x844df58,
    username=0x844de44 "stark") at postgres.c:3434
#25 0x08223d2c in BackendRun (port=0x8461550) at postmaster.c:2974
#26 0x082232b9 in BackendStartup (port=0x8461550) at postmaster.c:2601
#27 0x08220d82 in ServerLoop () at postmaster.c:1214
#28 0x08220728 in PostmasterMain (argc=3, argv=0x844a340) at postmaster.c:967
#29 0x081c426b in main (argc=3, argv=0x844a340) at main.c:188

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
