Re: btree: BTP_CHAIN flag was expected (revisited)

From: The Hermit Hacker <scrappy(at)hub(dot)org>
To: David Schanen <mtv(at)ibm(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, prod_dev(at)natrindo(dot)co(dot)id, miker(at)scifair(dot)acadiau(dot)ca
Subject: Re: btree: BTP_CHAIN flag was expected (revisited)
Date: 1998-06-22 11:49:18
Message-ID: Pine.BSF.3.96.980622074732.16934L-100000@hub.org
Lists: pgsql-hackers

On Mon, 22 Jun 1998, David Schanen wrote:

> Hi Marc & Mike,
>
> I wanted to check with you to see if you had seen my latest post to
> pgsql-questions?

pgsql-questions has been a discontinued mailing list for over a
month now, actually...and, from the topic, this should be discussed on
pgsql-hackers anyway :)

> Looking at the backtrace from the debug.core, it seems to me that we are
> still getting the BTP_CHAIN errors we saw in previous versions.

You are using v6.3.2+patches currently?

>
> The cause seems to be corruption in a single record of a btree index
> in very large tables (indices). If we simply restart the postgres
> backends and try to query on the same record where it crashed before we
> cause the crash again. If we query on any other record there is no
> problem. If we reindex the problem goes away. Unfortunately this is a
> high volume realtime telephony application and taking the system out of
> service for twenty minutes to reindex could cause the loss of too much
> data for thousands of calls and prevention of service to thousands more.
>
> I think the bug must be in the writing of the index record or (more likely)
> an adjacent index record, but I don't know how to find it.
>
> Marc, I have been reluctant to include Vadim in these emails so far. Do
> you think it is okay to bring him in on this one? I haven't had any
> response from the post to the list.
>
> Mike, have you tried compiling with debug?
>
> Below is the backtrace output from my debug.core. I see the BTP_CHAIN
> error, am I missing something else that you can see?
>
> Thanks for your help!
>
> Best regards,
>
> -Dave
>
> PS. Marc, I haven't heard back from Mike since my earliest email. Have you heard
> anything from him?
>
> David Schanen wrote:
>
> > Maybe we are still having btree problems, but we no longer see the BTP_CHAIN
> > error. Now we get:
> >
> > > IpcMemoryCreate: memKey=5432101 , size=2361552 , permission=384
> > > IpcMemoryCreate: shmget(..., create, ...) failed: Cannot allocate memory
> >
> > Here is the backtrace output. Let me know if you need the core file.
> >
> > Thanks,
> >
> > -Dave
> > ----------------
> > Postgres 6.3.2
> > Pentium II / 200 - 128M
> > FreeBSD 3.0-981007-SNAP
> >
> > # gdb postgres postgres.core.save
> > GDB is free software and you are welcome to distribute copies of it
> > under certain conditions; type "show copying" to see the conditions.
> > There is absolutely no warranty for GDB; type "show warranty" for details.
> > GDB 4.16 (i386-unknown-freebsd),
> > Copyright 1996 Free Software Foundation, Inc...
> > Core was generated by `postgres'.
> > Program terminated with signal 11, Segmentation fault.
> > Cannot access memory at address 0x40ff080.
> > #0 0x41a256a in ?? ()
> > (gdb) bt
> > #0 0x41a256a in ?? ()
> > #1 0x41b7060 in ?? ()
> > #2 0x415b5e5 in ?? ()
> > #3 0xc7578 in elog (lev=1,
> > fmt=0x1381e "btree: BTP_CHAIN flag was expected in %s (access = %s)")
> > at elog.c:121
> > #4 0x1397f in _bt_moveright (rel=0x225290, buf=153, keysz=1,
> > scankey=0x21b3d0, access=0) at nbtsearch.c:222
> > #5 0x137e9 in _bt_searchr (rel=0x225290, keysz=1, scankey=0x21b3d0,
> > bufP=0xefbfb664, stack_in=0x2405f0) at nbtsearch.c:127
> > #6 0x136e7 in _bt_search (rel=0x225290, keysz=1, scankey=0x21b3d0,
> > bufP=0xefbfb664) at nbtsearch.c:55
> > #7 0x1014e in _bt_doinsert (rel=0x225290, btitem=0x21b390, index_is_unique=0,
> > heapRel=0x21dd90) at nbtinsert.c:63
> > #8 0x12f84 in btinsert (rel=0x225290, datum=0x2405b0, nulls=0x2405d0 " \002",
> > ht_ctid=0x1dd228, heapRel=0x21dd90) at nbtree.c:377
> > #9 0xc8445 in fmgr_c (finfo=0xefbfb6f4, values=0xefbfb704,
> > isNull=0xefbfb6f3 "") at fmgr.c:119
> > #10 0xc8834 in fmgr (procedureId=331) at fmgr.c:290
> > #11 0xc6d5 in index_insert (relation=0x225290, datum=0x2405b0,
> > nulls=0x2405d0 " \002", heap_t_ctid=0x1dd228, heapRel=0x21dd90)
> > at indexam.c:180
> > #12 0x3a178 in ExecInsertIndexTuples (slot=0x1cbc10, tupleid=0x1dd228,
> > estate=0x1d8310, is_update=0) at execUtils.c:1156
> > #13 0x36fa9 in ExecAppend (slot=0x1cbc10, tupleid=0x0, estate=0x1d8310)
> > at execMain.c:1010
> > #14 0x36dfe in ExecutePlan (estate=0x1d8310, plan=0x1d8210,
> > parseTree=0x225910, operation=CMD_INSERT, numberTuples=0,
> > direction=ForwardScanDirection, printfunc=0x3520 <printtup>)
> > at execMain.c:814
> > #15 0x36751 in ExecutorRun (queryDesc=0x230f50, estate=0x1d8310, feature=3,
> > count=0) at execMain.c:236
> > #16 0xa01db in ProcessQueryDesc (queryDesc=0x230f50) at pquery.c:332
> > #17 0xa0246 in ProcessQuery (parsetree=0x225910, plan=0x1d8210, argv=0x0,
> > typev=0x0, nargs=0, dest=Remote) at pquery.c:378
> > #18 0x9e3dd in pg_exec_query_dest (
> > query_string=0xefbfb934 "insert into acct_history (acct_no, activity_date,
> > origination, destination, duration, amount, balance, changed_by, changed_on)
> > VALUES ( '126587291393', 'Wed Jun 17 18:38:06 1998', '0213906996', '79028"...,
> > argv=0x0, typev=0x0, nargs=0, dest=Remote) at postgres.c:699
> > #19 0x9e290 in pg_exec_query (
> > query_string=0xefbfb934 "insert into acct_history (acct_no, activity_date,
> > origination, destination, duration, amount, balance, changed_by, changed_on)
> > VALUES ( '126587291393', 'Wed Jun 17 18:38:06 1998', '0213906996', '79028"...,
> > argv=0x0, typev=0x0, nargs=0) at postgres.c:601
> > #20 0x9fa31 in PostgresMain (argc=9, argv=0xefbfd978) at postgres.c:1382
> > #21 0x49bfa in main (argc=9, argv=0xefbfd978) at main.c:106
> > (gdb)
> >
> > The Hermit Hacker wrote:
> >
> > > On Mon, 8 Jun 1998, David Schanen wrote:
> > >
> > > > a) I compiled 6.3.2 with CASSERT as recommended by Vadim in one of
> > > > his posts. What does this do for me exactly? Could this be the reason
> > > > we aren't seeing the error report any longer?
> > >
> > > CASSERT shouldn't be used in production, only in development...can
> > > you send in a trace of what the core shows?
> > >
> > > > b) Can someone explain what causes the BTP_CHAIN error above?
> > >
> > > all I know is that it's an index corruption only fixed by dropping
> > > and recreating the index. v6.3.2 tells you which table is generating the
> > > BTP_CHAIN error as part of its error message...
> > >
> > > > c) How dangerous do you think it is to continue to run the database
> > > > in this condition?
> > >
> > > My experience: the index is useless when the condition is
> > > triggered...
> > >
> > > Marc G. Fournier
> > > Systems Administrator @ hub.org
> > > primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org
>
>
>
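
The workaround mentioned in the quoted exchange above (dropping and
recreating the affected btree index) could be scripted roughly as below.
This is only a sketch under stated assumptions: the thread never names the
index or the indexed column, so the index name and the choice of acct_no
are hypothetical. The backtrace shows only the table (acct_history), a
single-key scan (keysz=1), and a non-unique index (index_is_unique=0).

```python
def rebuild_index_sql(index_name, table, columns, unique=False):
    """Return the SQL statements that drop and recreate a btree index.

    The statements are only generated, not executed; feed them to psql
    (or any client) during a maintenance window.
    """
    if not columns:
        raise ValueError("at least one column is required")
    kind = "UNIQUE INDEX" if unique else "INDEX"
    column_list = ", ".join(columns)
    return [
        f"DROP INDEX {index_name};",
        f"CREATE {kind} {index_name} ON {table} USING btree ({column_list});",
    ]


if __name__ == "__main__":
    # Hypothetical index name and column; only the table name is from the thread.
    for stmt in rebuild_index_sql(
        "acct_history_acct_no_idx", "acct_history", ["acct_no"]
    ):
        print(stmt)
```

Generating the statements instead of running them keeps the sketch
independent of any particular client library. As noted in the thread, the
rebuild still takes the index out of service while it runs.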
