RE: [HACKERS] Open 6.5 items

From: "Hiroshi Inoue" <Inoue(at)tpf(dot)co(dot)jp>
To: "Vadim Mikheev" <vadim(at)krs(dot)ru>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <t-ishii(at)sra(dot)co(dot)jp>, "PostgreSQL-development" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: RE: [HACKERS] Open 6.5 items
Date: 1999-05-31 00:33:25
Message-ID: 000e01beaafd$31ee4820$2801007e@cadzone.tpf.co.jp
Lists: pgsql-hackers

Hello all,

> -----Original Message-----
> From: owner-pgsql-hackers(at)postgreSQL(dot)org
> [mailto:owner-pgsql-hackers(at)postgreSQL(dot)org]On Behalf Of Vadim Mikheev
> Sent: Saturday, May 29, 1999 2:51 PM
> To: Tom Lane
> Cc: t-ishii(at)sra(dot)co(dot)jp; PostgreSQL-development
> Subject: Re: [HACKERS] Open 6.5 items
>
>
> Tom Lane wrote:
> >
> > Vadim Mikheev <vadim(at)krs(dot)ru> writes:
> > >> If I recall the dynahash.c code correctly, a null return value
> > >> indicates either damage to the structure of the table (ie someone
> > >> stomped on memory that didn't belong to them) or running out of memory
> > >> to add entries to the table. The latter should be impossible if we
> >
> > > Quite different cases and should result in different reactions.
> >
> > I agree; will see about cleaning up hash_search's call convention after
> > 6.5 is done. Actually, maybe I should do it now? I'm not convinced yet
> > whether the reports we're seeing are due to memory clobber or running
> > out of space... fixing this may be the easiest way to find out.
>
> Imho, we have to fix it in some way before 6.5
> Either by changing dynahash.c (to return 0x1 if table is
> corrupted and 0x0 if out of space) or by changing
> elog(NOTICE) to elog(ERROR).
>
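
To illustrate the first option quoted above, here is a minimal sketch of a lookup routine that reports "corrupted" and "out of space" as two different failure values. It is not the dynahash.c code: ToyTable, toy_init, toy_enter, TOY_MAGIC and TOY_HASH_CORRUPTED are all hypothetical names, and the magic-number check is only an assumed stand-in for whatever structural check the real table would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CAPACITY 4
#define TOY_MAGIC    0x5a5a

/* Hypothetical sentinel: 0x1 means "table is corrupted",
 * while NULL (0x0) keeps meaning "out of space". */
#define TOY_HASH_CORRUPTED ((void *) (uintptr_t) 0x1)

/* Hypothetical fixed-size table; not the dynahash.c layout. */
typedef struct
{
    int magic;                  /* equals TOY_MAGIC in a healthy table */
    int nentries;               /* number of keys currently stored */
    int keys[TOY_CAPACITY];
} ToyTable;

static void toy_init(ToyTable *t)
{
    assert(t != NULL);
    t->magic = TOY_MAGIC;
    t->nentries = 0;
}

/*
 * Find key, or enter it if absent.
 * Returns a pointer to the stored key on success,
 * NULL if the table is full, TOY_HASH_CORRUPTED if it looks damaged.
 */
static void *toy_enter(ToyTable *t, int key)
{
    int i;

    assert(t != NULL);

    if (t->magic != TOY_MAGIC || t->nentries < 0 || t->nentries > TOY_CAPACITY)
        return TOY_HASH_CORRUPTED;

    for (i = 0; i < t->nentries; i++)
        if (t->keys[i] == key)
            return &t->keys[i];

    if (t->nentries == TOY_CAPACITY)
        return NULL;            /* out of space, table itself is fine */

    t->keys[t->nentries] = key;
    return &t->keys[t->nentries++];
}
```

A caller can then react differently to each case, for example treating TOY_HASH_CORRUPTED as fatal and NULL as an ordinary error.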

There is another case that causes a stuck spinlock abort.

    status = WaitOnLock(lockmethod, lock, lockmode);

    /*
     * Check the xid entry status, in case something in the ipc
     * communication doesn't work correctly.
     */
    if (!((result->nHolding > 0) && (result->holders[lockmode] > 0)))
    {
        XID_PRINT_AUX("LockAcquire: INCONSISTENT ", result);
        LOCK_PRINT_AUX("LockAcquire: INCONSISTENT ", lock, lockmode);
        /* Should we retry ? */
        return FALSE;
    }

This path returns without releasing LockMgrLock and does not even call
elog().
As far as I can see, different entries in xidHash have the same key when
the above case occurs. Moreover, xidHash has been in an abnormal state ever
since the number of xidHash entries exceeded 256.
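
As a rough illustration of the missing release, the sketch below routes both the normal and the INCONSISTENT exit through a single release call. It is not the lock.c code: toy_lock_held, toy_acquire, toy_release, ToyXidEntry and toy_lock_acquire are hypothetical stand-ins, the "spinlock" is only a flag, and error reporting via elog() is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for LockMgrLock: 1 while held, 0 while free. */
static int toy_lock_held = 0;

static void toy_acquire(void)
{
    assert(!toy_lock_held);     /* a real spinlock would spin here instead */
    toy_lock_held = 1;
}

static void toy_release(void)
{
    assert(toy_lock_held);
    toy_lock_held = 0;
}

/* Hypothetical stand-in for the xid entry checked after WaitOnLock(). */
typedef struct
{
    int nHolding;
    int holders[8];
} ToyXidEntry;

/*
 * Returns true if the entry is consistent for lockmode, false otherwise.
 * Every exit path passes through the single release at "done".
 */
static bool toy_lock_acquire(const ToyXidEntry *result, int lockmode)
{
    bool ok = true;

    toy_acquire();

    if (!((result->nHolding > 0) && (result->holders[lockmode] > 0)))
    {
        /* Inconsistent entry: report failure, but still release below. */
        ok = false;
        goto done;
    }

    /* ... normal bookkeeping would go here ... */

done:
    toy_release();
    return ok;
}
```

With this shape, a failed call leaves the lock free, so a later call can acquire it again instead of spinning until the stuck spinlock abort.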

Is this bug fixed by Vadim's change of maxBackends to NLOCKENTS(maxBackends),
or by Tom's change to the hash code?

In my test case, xidHash is filled with XactLockTable entries that were
acquired by XactLockTableWait().
Could those entries be released immediately after they are acquired?

Thanks.

Hiroshi Inoue
Inoue(at)tpf(dot)co(dot)jp
