BOUNCE pgsql-bugs@postgreSQL.org: Non-member submission from [Brent Ewing <bge@hoh.genome.washington.edu>]

From: owner-bugs(at)postgreSQL(dot)org
To: owner-bugs(at)postgreSQL(dot)org, bge(at)hoh(dot)genome(dot)washington(dot)edu
Subject: BOUNCE pgsql-bugs@postgreSQL.org: Non-member submission from [Brent Ewing <bge@hoh.genome.washington.edu>]
Date: 1999-10-02 23:21:38
Message-ID: 199910022321.TAA28559@hub.org
Lists: pgsql-bugs

>From scrappy(at)postgresql(dot)org Sat Oct 2 19:19:16 1999
Received: from hoh.genome.washington.edu (hoh.genome.washington.edu [128.95.73.94])
by hub.org (8.9.3/8.9.3) with ESMTP id TAA28166
for <pgsql-bugs(at)postgresql(dot)org>; Sat, 2 Oct 1999 19:18:45 -0400 (EDT)
(envelope-from bge(at)hoh(dot)genome(dot)washington(dot)edu)
Received: (from bge(at)localhost)
by hoh.genome.washington.edu (8.9.0/8.9.0) id QAA08813
for pgsql-bugs(at)postgresql(dot)org; Sat, 2 Oct 1999 16:18:27 -0700 (PDT)
Date: Sat, 2 Oct 1999 16:18:27 -0700 (PDT)
From: Brent Ewing <bge(at)hoh(dot)genome(dot)washington(dot)edu>
Message-Id: <199910022318(dot)QAA08813(at)hoh(dot)genome(dot)washington(dot)edu>
To: pgsql-bugs(at)postgresql(dot)org
Subject: possible bug

If PostgreSQL failed to compile on your computer or you found a bug that
is likely to be specific to one platform then please fill out this form
and e-mail it to pgsql-ports(at)postgresql(dot)org(dot)

To report any other bug, fill out the form below and e-mail it to
pgsql-bugs(at)postgresql(dot)org(dot)

If you not only found the problem but solved it and generated a patch
then e-mail it to pgsql-patches(at)postgresql(dot)org instead. Please use the
command "diff -c" to generate the patch.

You may also enter a bug report at http://www.postgresql.org/ instead of
e-mail-ing this form.

============================================================================
POSTGRESQL BUG REPORT TEMPLATE
============================================================================

Your name : Brent Ewing
Your email address : bge(at)u(dot)washington(dot)edu

System Configuration
---------------------
Architecture (example: Intel Pentium) : DEC Alpha

Operating System (example: Linux 2.0.26 ELF) : Digital UNIX 4.0D

PostgreSQL version (example: PostgreSQL-6.5.2): PostgreSQL-6.5.2

Compiler used (example: gcc 2.8.0) : mostly cc

Please enter a FULL description of your problem:
------------------------------------------------

In short, the backend crashes while trying to create certain indexes on a
table. I added some diagnostics in the modules nbtsort.c and bufpage.c
where it's dying. The backend output is

------

hoh> postmaster -d
FindExec: found "/usr/local/pgsql/bin/postgres" using argv[0]

/usr/local/pgsql/bin/postmaster: BackendStartup: pid 26070 user bge db est_db socket 6
FindExec: found "/usr/local/pgsql/bin/postgres" using argv[0]
started: host=localhost user=bge database=est_db
InitPostgres
StartTransactionCommand
ProcessQuery
CommitTransactionCommand
StartTransactionCommand
ProcessUtility

PageAddItem: lower > upper: lower: 920 upper: 912: alignedSize: 32 pageManagerShuffle: 1 shuffled: 1 sizeof_itemiddata: 4 pd_lower: 916 pd_upper: 944 offsetNumber: 228 limit: 228
_bt_buildadd: alloc flag: 1 pgspc_old: 0 btisz_old: 32 PageGetFreeSpace: 24
FATAL 1: btree: failed to add item to the page in _bt_sort (2)
proc_exit(0) [#0]
shmem_exit(0) [#0]
exit(0)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 26070 exited with status 0

------

The problem occurs (is detected) in the function PageAddItem() (in bufpage.c)
in the block that now looks like

------

if (offsetNumber > limit)
lower = (Offset) (((char *) (&((PageHeader) page)->pd_linp[offsetNumber])) - ((char *) page));
else if (offsetNumber == limit || shuffled == true)
lower = ((PageHeader) page)->pd_lower + sizeof(ItemIdData);
else
lower = ((PageHeader) page)->pd_lower;

alignedSize = DOUBLEALIGN(size);

upper = ((PageHeader) page)->pd_upper - alignedSize;

if (lower > upper)
{
fprintf( stderr, "PageAddItem: lower > upper: lower: %d upper: %d: alignedSize: %d pageManagerShuffle: %d shuffled: %d sizeof_itemiddata: %d pd_lower: %d pd_upper: %d offsetNumber: %d limit: %d\n",
(int)lower, (int)upper, (int)alignedSize, (int)PageManagerShuffle, shuffled, sizeof( ItemIdData ),
(int)((PageHeader) page)->pd_lower, (int)((PageHeader) page)->pd_upper, offsetNumber, limit );
return InvalidOffsetNumber;
}

------

The problem is that lower > upper!

The bit of output from the calling function, _bt_buildadd in nbtsort.c, shows
the values of pgspc and btisz near the start of the function. The code and my
additions at this point are

------

nbuf = state->btps_buf;
npage = state->btps_page;
first_off = state->btps_firstoff;
last_off = state->btps_lastoff;
last_bti = state->btps_lastbti;

pgspc = PageGetFreeSpace(npage);
btisz = BTITEMSZ(bti);
btisz = MAXALIGN(btisz);
if (pgspc < btisz)
{
Buffer obuf = nbuf;
Page opage = npage;
OffsetNumber o,
n;
ItemId ii;
ItemId hii;

pgspc_sav = pgspc;
btisz_sav = btisz;

_bt_blnewpage(index, &nbuf, &npage, flags);

alloc_spc_flag = 1;

------

And the code and my additions at the point where PageAddItem is called
and returns failure look like

------

/*
* if this item is different from the last item added, we start a new
* chain of duplicates.
*/
off = OffsetNumberNext(last_off);
if (PageAddItem(npage, (Item) bti, btisz, off, LP_USED) == InvalidOffsetNumber)
{
fprintf( stderr, "_bt_buildadd: alloc flag: %d pgspc_old: %d btisz_old: %d PageGetFreeSpace: %d\n", alloc_spc_flag, pgspc_sav, btisz_sav, (int)PageGetFreeSpace(npage) );
elog(FATAL, "btree: failed to add item to the page in _bt_sort (2)");
}
#ifdef NOT_USED
#if defined(FASTBUILD_DEBUG) && defined(FASTBUILD_MERGE)
{
bool isnull;
Datum d = index_getattr(&(bti->bti_itup), 1, index->rd_att, &isnull);

printf("_bt_buildadd: inserted <%x> at offset %d at level %d\n",
d, off, state->btps_level);
}
#endif /* FASTBUILD_DEBUG && FASTBUILD_MERGE */
#endif

------

Incidentally, I vacuumed several times, without affecting the outcome.
Also, the problem first surfaced while I was running PG v6.5. I subsequently
installed v6.5.2 without modifying the database, and tried again with the
same result.

Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------

I can repeat this on my data set, perfectly consistently. If this is
really a bug, I can add diagnostic code wherever you would like it
added. (The data set is over a Gbyte, so it is not easily sent.)

If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
