more crashes

From: Alfred Perlstein <bright(at)wintelcom(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: more crashes
Date: 2000-10-02 22:17:12
Message-ID: 20001002151712.Y27736@fw.wintelcom.net
Lists: pgsql-hackers

This time I'm pretty sure I caught the initial crash during an update:

I disabled the vacuum analyze and still got table corruption with a crash:

Two crash dumps from 7.0.2 + some patches follow.

* $Header: /home/pgcvs/pgsql/src/backend/access/common/heaptuple.c,v 1.62 2000/04/12 17:14:36 momjian Exp $

Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libcrypt.so.2...done.
Reading symbols from /usr/lib/libm.so.2...done.
Reading symbols from /usr/lib/libutil.so.3...done.
Reading symbols from /usr/lib/libreadline.so.4...done.
Reading symbols from /usr/lib/libncurses.so.5...done.
Reading symbols from /usr/lib/libc.so.4...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
#0 0x8063aa7 in nocachegetattr (tuple=0x84ae9fc, attnum=4,
tupleDesc=0x84a6368, isnull=0x84afc20 "") at heaptuple.c:537
537 off = att_addlength(off, att[i]->attlen, tp + off);
(gdb) bt
#0 0x8063aa7 in nocachegetattr (tuple=0x84ae9fc, attnum=4,
tupleDesc=0x84a6368, isnull=0x84afc20 "") at heaptuple.c:537
#1 0x80a027f in ExecEvalVar (variable=0x84974b0, econtext=0x84aedd8,
isNull=0x84afc20 "") at execQual.c:314
#2 0x80a0d97 in ExecEvalExpr (expression=0x84974b0, econtext=0x84aedd8,
isNull=0x84afc20 "", isDone=0xbfbfe6db "\001ØíJ\b+ù\021\b\004èJ\b\001")
at execQual.c:1214
#3 0x80a090a in ExecEvalFuncArgs (fcache=0x84afc38, econtext=0x84aedd8,
argList=0x84974d8, argV=0xbfbfe6dc,
argIsDone=0xbfbfe6db "\001ØíJ\b+ù\021\b\004èJ\b\001") at execQual.c:635
#4 0x80a09c1 in ExecMakeFunctionResult (node=0x8496a40, arguments=0x84974d8,
econtext=0x84aedd8, isNull=0xbfbfe7db "",
isDone=0xbfbfe75b "\b\214ç¿¿\027\016\n\bHuI\bØíJ\bÛç¿¿X\017B`\004")
at execQual.c:711
#5 0x80a0b37 in ExecEvalOper (opClause=0x8497548, econtext=0x84aedd8,
isNull=0xbfbfe7db "") at execQual.c:902
#6 0x80a0e17 in ExecEvalExpr (expression=0x8497548, econtext=0x84aedd8,
isNull=0xbfbfe7db "", isDone=0xbfbfe7e0 "\001É\016\b") at execQual.c:1249
#7 0x80a1011 in ExecTargetList (targetlist=0x8497fd8, nodomains=6,
targettype=0x84aefb0, values=0x84aee48, econtext=0x84aedd8,
isDone=0xbfbfe90b "\001,é¿¿.K\n\bPÝJ\b\214H\n\b<é¿¿çA\023\b\030ÀH\b ")
at execQual.c:1511
#8 0x80a12af in ExecProject (projInfo=0x84aee20,
isDone=0xbfbfe90b "\001,é¿¿.K\n\bPÝJ\b\214H\n\b<é¿¿çA\023\b\030ÀH\b ")
at execQual.c:1721
#9 0x80a1365 in ExecScan (node=0x84add50, accessMtd=0x80a488c <IndexNext>)
at execScan.c:155
#10 0x80a4b2e in ExecIndexScan (node=0x84add50) at nodeIndexscan.c:288
#11 0x809fb6d in ExecProcNode (node=0x84add50, parent=0x84add50)
at execProcnode.c:272
#12 0x809ed59 in ExecutePlan (estate=0x84ae8a0, plan=0x84add50,
operation=CMD_UPDATE, offsetTuples=0, numberTuples=0,
direction=ForwardScanDirection, destfunc=0x84afaf0) at execMain.c:1052
#13 0x809e2ba in ExecutorRun (queryDesc=0x84ae888, estate=0x84ae8a0,
feature=3, limoffset=0x0, limcount=0x0) at execMain.c:327
#14 0x80f92ca in ProcessQueryDesc (queryDesc=0x84ae888, limoffset=0x0,
limcount=0x0) at pquery.c:310
#15 0x80f9347 in ProcessQuery (parsetree=0x84965d0, plan=0x84add50,
dest=Remote) at pquery.c:353
#16 0x80f7ef0 in pg_exec_query_dest (
query_string=0x81a9370 "\nUPDATE\n webhit_details_formatted\nSET\n attr_hits = attr_hits + '1' \nWHERE\n counter_id = '11909'\n AND attr_type = 'ATTR_OPERATINGSYS'\n AND attr_name = 'win95'\n AND attr_vers = '0'\n;",
dest=Remote, aclOverride=0) at postgres.c:663
#17 0x80f7db9 in pg_exec_query (
query_string=0x81a9370 "\nUPDATE\n webhit_details_formatted\nSET\n attr_hits = attr_hits + '1' \nWHERE\n counter_id = '11909'\n AND attr_type = 'ATTR_OPERATINGSYS'\n AND attr_name = 'win95'\n AND attr_vers = '0'\n;")
at postgres.c:562
#18 0x80f8d1a in PostgresMain (argc=9, argv=0xbfbff0dc, real_argc=10,
real_argv=0xbfbffb3c) at postgres.c:1590
#19 0x80e1d06 in DoBackend (port=0x843f400) at postmaster.c:2009
#20 0x80e1899 in BackendStartup (port=0x843f400) at postmaster.c:1776
#21 0x80e0abd in ServerLoop () at postmaster.c:1037
#22 0x80e04be in PostmasterMain (argc=10, argv=0xbfbffb3c) at postmaster.c:725
#23 0x80aee43 in main (argc=10, argv=0xbfbffb3c) at main.c:93
#24 0x80633c5 in _start ()
(gdb) list
532
533 if (usecache)
534 att[i]->attcacheoff = off;
535 }
536
537 off = att_addlength(off, att[i]->attlen, tp + off);
538
539 if (usecache &&
540 att[i]->attlen == -1 && !VARLENA_FIXED_SIZE(att[i]))
541 usecache = false;
(gdb) print off
$1 = 772814392
(gdb) print att[i]->attlen
$2 = -1
(gdb) print off
$3 = 772814392
(gdb) print tp
$4 = 0x5eab73d0 "\205."
(gdb) print tp+off
$7 = 0x8cbbaa08 <Address 0x8cbbaa08 out of bounds>
(gdb) print usecache
$8 = 0 '\000'
(gdb) print !VARLENA_FIXED_SIZE(att[i])
No symbol "VARLENA_FIXED_SIZE" in current context.
(gdb) print att[i]
$9 = 0x84a66c8
(gdb) print *(att[i])
$10 = {attrelid = 3518994475, attname = {
data = "attr_vers", '\000' <repeats 22 times>,
alignmentDummy = 1920234593}, atttypid = 1043,
attdisbursion = 0.125293151, attlen = -1, attnum = 4, attnelems = 0,
attcacheoff = -1, atttypmod = 36, attbyval = 0 '\000', attstorage = 112 'p',
attisset = 0 '\000', attalign = 105 'i', attnotnull = 0 '\000',
atthasdef = 0 '\000'}
(gdb) print i
$11 = 3
(gdb) print *(att[0])
$12 = {attrelid = 3518994475, attname = {
data = "counter_id", '\000' <repeats 21 times>,
alignmentDummy = 1853189987}, atttypid = 23,
attdisbursion = 0.000228356235, attlen = 4, attnum = 1, attnelems = 0,
attcacheoff = 0, atttypmod = -1, attbyval = 1 '\001', attstorage = 112 'p',
attisset = 0 '\000', attalign = 105 'i', attnotnull = 0 '\000',
atthasdef = 0 '\000'}
(gdb) print *(att[1])
$13 = {attrelid = 3518994475, attname = {
data = "attr_type", '\000' <repeats 22 times>,
alignmentDummy = 1920234593}, atttypid = 1043,
attdisbursion = 0.0928893909, attlen = -1, attnum = 2, attnelems = 0,
attcacheoff = 4, atttypmod = 36, attbyval = 0 '\000', attstorage = 112 'p',
attisset = 0 '\000', attalign = 105 'i', attnotnull = 0 '\000',
atthasdef = 0 '\000'}
(gdb) print *(att[2])
$14 = {attrelid = 3518994475, attname = {
data = "attr_name", '\000' <repeats 22 times>,
alignmentDummy = 1920234593}, atttypid = 1043,
attdisbursion = 0.370779663, attlen = -1, attnum = 3, attnelems = 0,
attcacheoff = -1, atttypmod = 36, attbyval = 0 '\000', attstorage = 112 'p',
attisset = 0 '\000', attalign = 105 'i', attnotnull = 0 '\000',
atthasdef = 0 '\000'}
(gdb) print attnum
$15 = 4
(gdb) print *(att[4])
$16 = {attrelid = 3518994475, attname = {
data = "attr_hits", '\000' <repeats 22 times>,
alignmentDummy = 1920234593}, atttypid = 20, attdisbursion = 0.0573871136,
attlen = 8, attnum = 5, attnelems = 0, attcacheoff = -1, atttypmod = -1,
attbyval = 0 '\000', attstorage = 112 'p', attisset = 0 '\000',
attalign = 100 'd', attnotnull = 0 '\000', atthasdef = 1 '\001'}
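
The numbers printed in the session above can be cross-checked with a small sketch. It is not the PostgreSQL source: `sketch_addlength` is a hypothetical, simplified stand-in for `att_addlength` that ignores alignment and assumes a variable-length value (`attlen == -1`) begins with a 4-byte length word, and `SKETCH_BLCKSZ` assumes the default 8192-byte block size. All `sketch_*` names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the default PostgreSQL block size of 8192 bytes. */
#define SKETCH_BLCKSZ 8192

/*
 * Simplified stand-in for att_addlength (hypothetical, alignment ignored).
 * Fixed-width attributes advance the offset by attlen; variable-length
 * ones (attlen == -1) advance it by the 4-byte length word assumed to be
 * stored at the start of the value.
 */
static long
sketch_addlength(long off, int attlen, const unsigned char *val)
{
    int32_t len;

    if (attlen != -1)
        return off + attlen;

    assert(val != NULL);            /* a varlena value must be readable */
    memcpy(&len, val, sizeof len);  /* read the stored length word */
    return off + len;
}

/* A data offset inside a heap tuple can never reach the block size. */
static int
sketch_offset_plausible(long off)
{
    return off >= 0 && off < SKETCH_BLCKSZ;
}

/* 32-bit pointer arithmetic, matching the 32-bit addresses in the dump. */
static uint32_t
sketch_ptr_add(uint32_t base, uint32_t off)
{
    return base + off;
}
```

With the values from the dump, `sketch_ptr_add(0x5eab73d0, 772814392)` yields `0x8cbbaa08`, the same out-of-bounds address gdb reports for `tp+off`, and `sketch_offset_plausible(772814392)` is false. That would be consistent with a garbage length having been added while stepping over one of the earlier variable-length columns (`attr_type` or `attr_name`), although the dump alone does not show which one.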

--------------------------------------------------

I'm pretty sure this is a pg_dump that died when the first crash
happened above:

* $Header: /home/pgcvs/pgsql/src/backend/commands/copy.c,v 1.106.2.2 2000/06/28 06:13:01 tgl Exp $

Program terminated with signal 10, Bus error.
#0 0x482a7d95 in ?? (?? )
#1 0x808c393 in CopyTo (rel=0x84e7890, binary=0 '\000', oids=0 '\000',
fp=0x0, delim=0x8159fa9 "\t", null_print=0x8159fab "\\N") at copy.c:508
#2 0x808bf99 in DoCopy (relname=0x84930e8 "", binary=0 '\000', oids=0 '\000',
from=0 '\000', pipe=1 '\001', filename=0x0, delim=0x8159fa9 "\t",
null_print=0x8159fab "\\N") at copy.c:374
#3 0x80f98a3 in ProcessUtility (parsetree=0x8493110, dest=Remote)
at utility.c:262
#4 0x80f7e5e in pg_exec_query_dest (query_string=0x81a9370 "", dest=Remote,
aclOverride=0) at postgres.c:617
#5 0x80f7db9 in pg_exec_query (query_string=0x81a9370 "") at postgres.c:562
#6 0x80f8d1a in PostgresMain (argc=9, argv=0xbfbff0bc, real_argc=10,
real_argv=0xbfbffb1c) at postgres.c:1590
#7 0x80e1d06 in DoBackend (port=0x843f000) at postmaster.c:2009
#8 0x80e1899 in BackendStartup (port=0x843f000) at postmaster.c:1776
#9 0x80e0abd in ServerLoop () at postmaster.c:1037
#10 0x80e04be in PostmasterMain (argc=10, argv=0xbfbffb1c) at postmaster.c:725
#11 0x80aee43 in main (argc=10, argv=0xbfbffb1c) at main.c:93
#12 0x80633c5 in _start ()
(gdb) up
#1 0x808c393 in CopyTo (rel=0x84e7890, binary=0 '\000', oids=0 '\000',
fp=0x0, delim=0x8159fa9 "\t", null_print=0x8159fab "\\N") at copy.c:508
508 string = (char *) (*fmgr_faddr(&out_functions[i]))
(gdb) print out_functions[i]
$1 = {fn_addr = 0, fn_plhandler = 0, fn_oid = 0, fn_nargs = 0}
(gdb) print i
$2 = 2
(gdb) print isnull
$3 = 0 '\000'
(gdb) print tupDesc
No symbol "tupDesc" in current context.
(gdb) print tuple
$4 = 0x8493268
(gdb) print *tuple
$5 = {t_len = 0, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 0},
t_datamcxt = 0x0, t_data = 0x0}
(gdb) print value
$6 = 1072927316
(gdb) print *value
Cannot access memory at address 0x3ff39254.
(gdb) print oids
$7 = 0 '\000'
(gdb) print binary
$8 = 0 '\000'
(gdb) print string
$9 = 0xfffffffc <Address 0xfffffffc out of bounds>

Now I think I have the initial spot where it all goes to pot (the
initial traceback). I really appreciate the continued help and
pointers that I've been given, and was wondering if someone could
help me out a bit more.

Sorry for being such a pain, and if any other info is needed please ask.

Thanks for your time,
--
-Alfred Perlstein - [bright(at)wintelcom(dot)net|alfred(at)freebsd(dot)org]
"I have the heart of a child; I keep it in a jar on my desk."
