Re: Curious buildfarm failures (fwd)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk>, pgsql-hackers(at)postgreSQL(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Curious buildfarm failures (fwd)
Date: 2013-01-16 01:34:52
Message-ID: 20130116013451.GF3089@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-01-16 02:13:26 +0100, Andres Freund wrote:
> On 2013-01-15 19:56:52 -0500, Tom Lane wrote:
> > Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > > FWIW it's also triggerable if two other function calls are placed inside
> > > the above if() (I tried fprintf(stderr, "argh") and kill(0, 0)).
> >
> > [ confused... ] You mean replacing the abort() in the elog macro with
> > one of these functions? Or something else?
>
> I mean replacing the elog(ERROR, "ForwardFsyncRequest must...") with any
> two function calls inside a do/while(0). I just tried to place some
> random functions there instead of the elog to make sure it's unrelated,
> and it still triggers the problem even before the elog commit. The
> assembler output of that function changes wildly with tiny changes and I
> don't understand IA-64 at all (does anybody?), so I don't see anything
> we can do there.
>
> > > It seems the change just made an existing issue visible.
> > > No idea what to do about it.
> >
> > Pretty clearly a compiler bug at this point. Since there doesn't seem
> > to be a clean workaround (no, I don't want to expand the struct
> > assignment manually), and anyway we can't be sure that the bug doesn't
> > also manifest in other places, recommending Sergey update his compiler
> > seems like the thing to do.
>
> Yea. Don't have a better suggestion.
>
> > At this point I'm more interested in his report in
> > <alpine(dot)LRH(dot)2(dot)03(dot)1301152012220(dot)773(at)ast(dot)cam(dot)ac(dot)uk> about
> > the Assert at spgdoinsert.c:1222 failing. That's pretty new code, so
> > more likely to have a genuine bug, and I wonder if it's related to
> > the spgist issue in <50EBF992(dot)2000704(at)qunar(dot)com> ...
>
> Yes, it looks more like it could be something real. There are
> suspiciously many other failing tests though (misc, with) that don't seem
> to be related to the spgist crash.
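
For readers following the quoted experiment, here is a minimal sketch of what "two function calls inside a do/while(0)" in place of the elog could look like. The macro name, the function name, and the condition are hypothetical stand-ins for illustration; this is not the actual PostgreSQL code or the exact patch that was tested.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical replacement for the elog(ERROR, ...) call site: two unrelated
 * function calls wrapped in do/while(0), so the macro behaves like a single
 * statement after an if().
 */
#define TWO_CALLS_INSTEAD_OF_ELOG() \
    do { \
        fprintf(stderr, "argh\n"); \
        kill(0, 0);     /* signal 0 only checks permissions, sends nothing */ \
    } while (0)

/*
 * Hypothetical function shaped like the call site under discussion.
 * Returns 0 if the "should not happen" branch was taken, 1 otherwise.
 */
static int
forward_request_sketch(int in_wrong_process)
{
    if (in_wrong_process)
        TWO_CALLS_INSTEAD_OF_ELOG();

    return in_wrong_process ? 0 : 1;
}
```

The point of the experiment is only that the branch body contains two calls; which functions are called does not seem to matter for triggering the miscompilation.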

#3 0x4000000000b5c710 in ExceptionalCondition (
conditionName=0x4000000000c76d50 "!(( ((void) ((bool) ((! assert_enabled) || ! (!(((bool) (((const void*)(&nodes[n]->t_tid) != ((void *)0)) && ((&nodes[n]->t_tid)->ip_posid != 0))))) || (ExceptionalCondition(\"!(((bool) (((const void*)"...,
errorType=0x4000000000c4c5a0 "FailedAssertion", fileName=0x4000000000c75d30 "spgdoinsert.c", lineNumber=1222) at assert.c:54
#4 0x40000000001a6320 in doPickSplit (index=0x600000000007ff48, state=0x3, current=0x60000ffffff7a700, parent=0x4, newLeafTuple=0x6,
level=512360, isNulls=64 '@', isNew=12 '\f') at spgdoinsert.c:1222
#5 0x40000000001a12d0 in spgdoinsert (index=0x2000000009856028, state=0x60000ffffff7a9d0, heapPtr=0x60000000001e6e7c,
datum=6917546619826579712, isnull=0 '\0') at spgdoinsert.c:1996
#6 0x4000000000195870 in spginsert (fcinfo=0x60000ffffff7a9d0) at spginsert.c:222
#7 0x4000000000b77dd0 in FunctionCall6Coll (flinfo=0x6000000000102018, collation=0, arg1=2305843009373429800, arg2=6917546619826580944,
arg3=6917546619826581200, arg4=6917529027643076220, arg5=2305843009373166576, arg6=0) at fmgr.c:1439
#8 0x4000000000148b70 in index_insert (indexRelation=0x2000000009856028, values=0x60000ffffff7add0, isnull=0x60000ffffff7aed0 "",
heap_t_ctid=0x60000000001e6e7c, heapRelation=0x2000000009815bf0, checkUnique=UNIQUE_CHECK_NO) at indexam.c:216
#9 0x40000000004e99f0 in ExecInsertIndexTuples (slot=0x60000000001e55c0, tupleid=0x60000000001e6e7c, estate=0x60000000001e4f18)
at execUtils.c:1088
#10 0x4000000000516710 in ExecModifyTable (node=0x0) at nodeModifyTable.c:249
#11 0x40000000004c6350 in $$1$3_0$TAG$0ca$0$3 () at execProcnode.c:377
#12 0x40000000004bba00 in ExecutorRun (queryDesc=0x60000000001e4fb0, direction=NoMovementScanDirection, count=0) at execMain.c:1400
#13 0x40000000008493f0 in PortalRunMulti (portal=0x60000000000ff7f8, isTopLevel=-26 '\346', dest=0x60000000001ef658,
altdest=0x60000000001ef658, completionTag=0x60000ffffff7b2d0 "") at pquery.c:185
#14 0x4000000000848d20 in _setjmp_lpad_PortalRun_1$0$13 () at pquery.c:814
#15 0x4000000000840c60 in exec_simple_query (
query_string=0x600000000018d4f8 "insert into test_range_spgist select 'empty'::int4range from generate_series(1,500) g;")
at postgres.c:1048
#16 0x40000000008370a0 in _setjmp_lpad_PostgresMain_0$0$51 () at postgres.c:3969
#17 0x4000000000720240 in BackendStartup (port=0x60000000000fc950) at postmaster.c:3989
#18 0x400000000071dc80 in ServerLoop () at postmaster.c:1575
#19 0x400000000071a700 in PostmasterMain (argc=9, argv=0x60000000000dc300) at postmaster.c:1244
#20 0x40000000005796d0 in main (argc=9, argv=0x60000000000dc010) at main.c:197

#4 0x40000000001a6320 in doPickSplit (index=0x600000000007ff48, state=0x3, current=0x60000ffffff7a700, parent=0x4, newLeafTuple=0x6,
level=512360, isNulls=64 '@', isNew=12 '\f') at spgdoinsert.c:1222
1222 Assert(ItemPointerGetBlockNumber(&nodes[n]->t_tid) == leafBlock);
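
As a reading aid for the condition string in frame #3, the following is a simplified, self-contained sketch of what the assertion at line 1222 appears to check: a validity test on the item pointer (non-NULL, nonzero ip_posid) nested inside a block-number comparison. The type and function names ending in "Sketch" or "_sketch" are hypothetical; the layout mirrors PostgreSQL's item pointer (16-bit high and low block halves plus a 16-bit offset) but this is not the real header code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the block id and item pointer structs. */
typedef struct { uint16_t bi_hi; uint16_t bi_lo; } BlockIdSketch;
typedef struct { BlockIdSketch ip_blkid; uint16_t ip_posid; } ItemPointerSketch;

/* Inner check: pointer is non-NULL and the offset number is nonzero. */
static bool
item_pointer_is_valid_sketch(const ItemPointerSketch *p)
{
    return p != NULL && p->ip_posid != 0;
}

/* Reassemble the 32-bit block number from its two 16-bit halves. */
static uint32_t
item_pointer_block_sketch(const ItemPointerSketch *p)
{
    return ((uint32_t) p->ip_blkid.bi_hi << 16) | p->ip_blkid.bi_lo;
}

/*
 * Returns 0 if both checks pass, 1 if the inner validity check fails,
 * 2 if the block number differs from the expected leaf block.
 */
static int
check_leaf_tid_sketch(const ItemPointerSketch *tid, uint32_t leaf_block)
{
    if (!item_pointer_is_valid_sketch(tid))
        return 1;
    if (item_pointer_block_sketch(tid) != leaf_block)
        return 2;
    return 0;
}
```

With corrupted locals, either check can fail: a garbage node pointer could yield a zero offset, or a block number that no longer matches the expected leaf block.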

(gdb) info locals
in = {nTuples = 227, datums = 0x6000000000205060, level = 1}
out = {hasPrefix = 0 '\0', prefixDatum = 0, nNodes = 8, nodeLabels = 0x0, mapTuplesToNodes = 0x6000000000209018,
leafTupleDatums = 0x6000000000209430}
includeNew = 1 '\001'
startOffsets = {4, 0}
rdata = {{data = 0x60000ffffff7a860 "\177\006", len = 56, buffer = 0, buffer_std = 1 '\001', next = 0x60000ffffff7a720}, {
data = 0x6000000000206f08 "D", len = 72, buffer = 0, buffer_std = 1 '\001', next = 0x60000ffffff7a740}, {data = 0x6000000000206090 "3",
len = 456, buffer = 0, buffer_std = 1 '\001', next = 0x60000ffffff7a760}, {data = 0x0, len = 0, buffer = 395, buffer_std = 1 '\001',
next = 0x0}, {data = 0x2dffe98 <Address 0x2dffe98 out of bounds>, len = 0, buffer = 0, buffer_std = 88 'X', next = 0xa020}, {
data = 0x60000000001a6f80 "A", len = 32, buffer = 16842752, buffer_std = 2 '\002', next = 0x60000ffffff7a7f0}, {
data = 0xffffffff00000100 <Address 0xffffffff00000100 out of bounds>, len = 65598, buffer = 0, buffer_std = -119 '\211', next = 0x164},
{data = 0x60000ffffff7a7f0 "p\200�\002", len = 0, buffer = 0, buffer_std = 112 'p', next = 0x30}, {
data = 0x1 <Address 0x1 out of bounds>, len = 4294420496, buffer = 1610616831, buffer_std = -128 '\200', next = 0x20}, {
data = 0x1 <Address 0x1 out of bounds>, len = 4294420576, buffer = 1610616831, buffer_std = 0 '\0', next = 0x68}}
xlrec = {node = {spcNode = 1663, dbNode = 12030, relNode = 40992}, blknoSrc = 32, blknoDest = 4294967295, nDelete = 226, nInsert = 0,
initSrc = 0 '\0', initDest = 1 '\001', blknoInner = 1610616831, offnumInner = 20520, initInner = 0 '\0', storesNulls = 0 '\0',
blknoParent = 1610612736, offnumParent = 32, nodeI = 0, stateSrc = {myXid = 1516, isBuild = 0 '\0'}}
saveCurrent = {blkno = 395, buffer = 0, page = 0x0, offnum = 43104, node = 1610616831}

(gdb) p parent
$5 = (SPPageDesc *) 0x4

#5 0x40000000001a12d0 in spgdoinsert (index=0x2000000009856028, state=0x60000ffffff7a9d0, heapPtr=0x60000000001e6e7c,
datum=6917546619826579712, isnull=0 '\0') at spgdoinsert.c:1996
1996 if (doPickSplit(index, state, &current, &parent,
(gdb) p parent
$4 = {blkno = 1, buffer = 356, page = 0x200000000148eea0 "", offnum = 1, node = 4}

(gdb) p &parent
$7 = (SPPageDesc *) 0x60000ffffff7a900

Looks like some out of bounds access?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
