Re: PosgreSQL is crashing with a signal 11 - Bug?

From: Kjetil Torgrim Homme <kjetilho(at)ifi(dot)uio(dot)no>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: PosgreSQL is crashing with a signal 11 - Bug?
Date: 2004-09-10 12:43:41
Message-ID: 1rbrge70qq.fsf@rovereto.ifi.uio.no
Lists: pgsql-bugs

we got a new core dump from 7.3.7 today. this instance was running on a
freshly installed computer, to eliminate(?) hardware issues. it's
still the same brand and model, though. the old system has been
running hard disk tests for 30+ hours with no errors yet.

the core dump happens at the same place in the code, and this time we
got a complete backtrace:

(gdb) bt
#0 0xb734d07c in memcpy () from /lib/tls/libc.so.6
#1 0x0806bba8 in DataFill (data=0xb7488fff "", tupleDesc=0x82899a0,
value=0x8289980, nulls=0xbfffd3c0 " n \"", infomask=0x8806b04c,
bit=0x8806b04f "ï\001") at heaptuple.c:139
#2 0x0806c3ee in heap_formtuple (tupleDescriptor=0x8279ec0, value=0x8289980,
nulls=0xbfffd3c0 " n \"") at heaptuple.c:623
#3 0x080d1af1 in ExecTargetList (targetlist=0x8278298, nodomains=9,
targettype=0x8279ec0, values=0x8289980, econtext=0x8279a60,
isDone=0xbfffd468) at execQual.c:2230
#4 0x080d1cdb in ExecScan (node=0x827a208, accessMtd=0xbfffd468)
at execScan.c:49
#5 0x080d1d7d in ExecScan (node=0x8278c70, accessMtd=0x80d7c58 <SeqNext+24>)
at execScan.c:146
#6 0x080d7cfb in InitScanRelation (node=0x82899a0, estate=0x8278c70,
scanstate=0xbfffd4c8) at nodeSeqscan.c:162
#7 0x080cfd86 in ExecProcNode (node=0x8289bf8, parent=0x0)
at execProcnode.c:315
#8 0x080cecf3 in ExecutePlan (estate=0x8279c90, plan=0x8278c70,
operation=CMD_SELECT, numberTuples=0, direction=136878496,
destfunc=0x82899c8) at execMain.c:964
#9 0x080ce392 in ExecutorEnd (queryDesc=0x82899a0, estate=0x0)
at execMain.c:223
#10 0x0811d069 in ProcessQuery (parsetree=0x82899c8, plan=0x8278c70,
dest=Remote, completionTag=0xbfffd610 "") at pquery.c:251
#11 0x0811b7ed in pg_exec_query_string (query_string=0xbfffd610, dest=Remote,
parse_context=0x823d610) at postgres.c:844
#12 0x0811c64d in PostgresMain (argc=4, argv=0xbfffd850,
username=0x8238c69 "cerebrum") at postgres.c:2018
#13 0x0810413d in DoBackend (port=0x8238b38) at postmaster.c:2304
#14 0x08103cb2 in BackendStartup (port=0x8238b38) at postmaster.c:1935
#15 0x08102dad in ServerLoop () at postmaster.c:1016
#16 0x081027ea in PostmasterMain (argc=1, argv=0x8220170) at postmaster.c:797
#17 0x080e1234 in main (argc=1, argv=0xbfffe204) at main.c:217

(gdb) print *att[i]
$20 = {attrelid = 0, attname = {
data = "pageunits_total", '\0' <repeats 48 times>,
alignmentDummy = 1701273968}, atttypid = 1700, attstattarget = -1,
attlen = -1, attnum = 9, attndims = 0, attcacheoff = -1, atttypmod = 393220,
attbyval = 0 '\0', attstorage = 109 'm', attisset = 0 '\0',
attalign = 105 'i', attnotnull = 0 '\0', atthasdef = 0 '\0',
attisdropped = 0 '\0', attislocal = 1 '\001', attinhcount = 0}
(gdb) print i
$21 = 8
(gdb) x/10 value[i]
0xb7190928: 0x2f00000b 0x00000000 0x00200000 0x00000207
0xb7190938: 0x00000314 0x01bf913d 0x10120000 0x00090020
0xb7190948: 0xef201553 0x00000001

the relevant code again is:

    if (att[i]->attbyval)
        [...]
    else if (att[i]->attlen == -1)
        [...]
    else if (att[i]->attlen == -2)
        [...]
    else
    {
        /* fixed-length pass-by-reference */
        Assert(att[i]->attlen > 0);
        data_length = att[i]->attlen;
===>    memcpy(data, DatumGetPointer(value[i]), data_length);
    }

(gdb) print data_length
$25 = 788529163
(gdb) print att[i]->attlen
$26 = -1

how can att[i]->attlen possibly change in the interim? but
data_length looks corrupted, too.
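
one thing the numbers above do show: 788529163 is 0x2f00000b in hex, which is exactly the first word of the x/10 dump at value[i]. that would fit data_length having been read from the datum itself, as a varlena length header, rather than from attlen. the sketch below only checks that arithmetic. the mask values and all the sketch_* names are assumptions for illustration (a 30-bit size field with two flag bits on top), not copied from the 7.3.7 source.

```c
#include <assert.h>
#include <stdint.h>

/* first 32-bit word of the x/10 dump at value[i] (address 0xb7190928) */
#define DUMP_FIRST_WORD   UINT32_C(0x2f00000b)

/* data_length as printed by gdb ($25) */
#define REPORTED_LENGTH   UINT32_C(788529163)

/*
 * ASSUMPTION: hypothetical stand-ins for a varlena header layout in which
 * the top two bits are flags and the low 30 bits are the total size.
 */
#define SKETCH_MASK_FLAGS UINT32_C(0xc0000000)
#define SKETCH_MASK_SIZE  UINT32_C(0x3fffffff)

/* size that a header word would claim once the flag bits are stripped */
uint32_t
sketch_varlena_size(uint32_t header)
{
    return header & SKETCH_MASK_SIZE;
}

/* flag bits of the same header word */
uint32_t
sketch_varlena_flags(uint32_t header)
{
    return header & SKETCH_MASK_FLAGS;
}

/* returns 1 if the dumped word accounts for the reported data_length */
int
sketch_length_matches_dump(void)
{
    return sketch_varlena_size(DUMP_FIRST_WORD) == REPORTED_LENGTH;
}

/* aborts if the dumped word does not account for data_length */
void
sketch_check_dump(void)
{
    assert(sketch_length_matches_dump());
}
```

if that reading is right, the line number gdb reports for the memcpy may simply be misleading (the branches could share one call in optimized code), and the real problem would be the length word stored in the datum, not attlen.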

(gdb) print *att[i-1]
$27 = {attrelid = 0, attname = {
data = "pageunits_paid", '\0' <repeats 49 times>,
alignmentDummy = 1701273968}, atttypid = 1700, attstattarget = -1,
attlen = -1, attnum = 8, attndims = 0, attcacheoff = -1, atttypmod = 393220,
attbyval = 0 '\0', attstorage = 109 'm', attisset = 0 '\0',
attalign = 105 'i', attnotnull = 0 '\0', atthasdef = 0 '\0',
attisdropped = 0 '\0', attislocal = 1 '\001', attinhcount = 0}

also:

(gdb) print data
$39 = 0xb7488fff ""

which doesn't seem very aligned for an integer.

(gdb) print data[1]
Cannot access memory at address 0xb7489000
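
the two gdb results above can be tied together with a little address arithmetic: data sits on the very last byte of a page, and the address gdb cannot read is the first byte of the next one. a minimal sketch follows, assuming 4096-byte pages (typical for 32-bit x86 linux, but not confirmed for this machine) and 4-byte int alignment for attalign 'i'. the sketch_* names are made up for illustration.

```c
#include <stdint.h>

/* ASSUMPTION: 4 KiB pages; not confirmed for the machine in this report */
#define SKETCH_PAGE_SIZE UINT32_C(4096)

/* ASSUMPTION: attalign 'i' means 4-byte alignment on this platform */
#define SKETCH_INT_ALIGN UINT32_C(4)

/* offset of an address within its page */
uint32_t
sketch_page_offset(uint32_t addr)
{
    return addr % SKETCH_PAGE_SIZE;
}

/* bytes from addr up to the end of its page, counting addr itself */
uint32_t
sketch_bytes_left_in_page(uint32_t addr)
{
    return SKETCH_PAGE_SIZE - sketch_page_offset(addr);
}

/* returns 1 if addr is a multiple of the int alignment */
int
sketch_is_int_aligned(uint32_t addr)
{
    return addr % SKETCH_INT_ALIGN == 0;
}
```

for data = 0xb7488fff this gives a page offset of 0xfff, one byte left in the page, and no int alignment. so under these assumptions a memcpy of more than one byte to that address runs into 0xb7489000, which gdb reports as inaccessible.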

thank you for any insights.
--
Kjetil T.
