Re: [HACKERS] Problem after removal of exec(), help

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: maillist(at)candle(dot)pha(dot)pa(dot)us (Bruce Momjian)
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Problem after removal of exec(), help
Date: 1998-06-27 05:14:38
Message-ID: 199806270514.BAA27684@candle.pha.pa.us
Lists: pgsql-hackers

>
> Since the removal of exec(), Thomas has seen, and I have confirmed that
> if a backend crashes, and the postmaster must reset the shared memory,
> no backends can connect anymore. One way to reproduce it is to run the
> regression tests, which on their last test will crash for an unrelated
> reason. However, it will not allow you to restart any more backends.
>
> The error it gets is:
>
> Failed Assertion("!((((unsigned long)nextElem) > ShmemBase)):", File: "shmqueue.c", Line: 83)
> !((((unsigned long)nextElem) > ShmemBase)) (0) [No such file or directory]
>
> In this case nextElem = ShmemBase, so it is not greater. Removing the
> Assert() still does not make things work, so there must be something
> else.
>
> Now, the problem is probably not at that exact spot, but somewhere
> deeper. There are two differences between the old non-exec() behavior
> and new behavior. In the old setup, the backend had all its global
> variables initialized, while in the new no-exec case, they take the
> global variable values from the postmaster. Second, the old setup had
> each backend attaching to the shared memory, while the new setup has
> them inheriting the shared memory from the fork().
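The first difference described above (a forked child that never calls exec() starts with the parent's global values instead of fresh ones) can be shown with a small sketch. The names parent_state and state_seen_by_forked_child() are made up for illustration and are not from the PostgreSQL source:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical global, standing in for any backend global variable. */
static int parent_state = 0;

/*
 * Set a global in the parent, fork() without exec(), and report the
 * value the child observed (passed back through its exit status).
 * Returns -1 if fork() or waitpid() fails.
 */
int state_seen_by_forked_child(void)
{
    pid_t pid;
    int   status;

    parent_state = 42;          /* value set before the fork */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(parent_state);    /* child: global was inherited, not reset */

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With an exec() in the child, the new program image would start with parent_state at its static initializer (0); without it, the child reports the value the parent had set.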

I have fixed the problem. The problem was that InitMultiLevelLocks()
was not re-initializing the LockTable, which was still pointing to the
old shared memory lock structures, not the new ones in the new shared
memory segment.

I had to change InitMultiLevelLocks() so it always resets the memory, and
force LockTableInit() to set Numtables in lock.c to 1 on startup, so that
it re-creates the LOCKTAB entries and they no longer point to the old
shared memory.
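The pattern behind this kind of bug can be sketched roughly as follows. This is not the lock.c code; LockArea, lock_table, num_tables and both init functions are hypothetical stand-ins, and plain malloc() buffers play the role of the old and new shared memory segments:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a lock structure living in shared memory. */
typedef struct { int nlocks; } LockArea;

/* Process-global state that survives a shared memory reset. */
LockArea *lock_table = NULL;
int       num_tables = 0;

/* Init-once style: a second call is a no-op, so the pointer goes stale. */
void init_locks_once(void *segment)
{
    if (lock_table != NULL)
        return;                 /* still points into the OLD segment */
    lock_table = (LockArea *) segment;
    lock_table->nlocks = 0;
    num_tables = 1;
}

/* Always-reset style: rebind to whatever segment is current. */
void init_locks_always(void *segment)
{
    assert(segment != NULL);
    lock_table = (LockArea *) segment;
    lock_table->nlocks = 0;
    num_tables = 1;             /* start the table count over */
}
```

Calling init_locks_once() with a first buffer and then with a second one leaves lock_table pointing at the first; init_locks_always() with the second buffer rebinds it.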

I also replaced on_exitpg() with the new on_proc_exit() and on_shmem_exit()
to clarify when these are being run, and removed quasi_exit().
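One way to picture the split between the two kinds of exit callback is the sketch below. It assumes that shared memory callbacks run on every shared memory reset and also before the process-exit callbacks at process exit; that ordering is an assumption of the sketch, and every name in it (register_proc_exit, register_shmem_exit, run_shmem_exit, run_proc_exit and the demo callbacks) is hypothetical, not the PostgreSQL API:

```c
#include <stddef.h>

#define MAX_CALLBACKS 8

typedef void (*exit_callback)(void);

static exit_callback proc_exit_list[MAX_CALLBACKS];
static int           proc_exit_count = 0;
static exit_callback shmem_exit_list[MAX_CALLBACKS];
static int           shmem_exit_count = 0;

/* Register a callback to run only when the process exits. */
int register_proc_exit(exit_callback fn)
{
    if (proc_exit_count >= MAX_CALLBACKS)
        return -1;
    proc_exit_list[proc_exit_count++] = fn;
    return 0;
}

/* Register a callback to run whenever shared memory is released. */
int register_shmem_exit(exit_callback fn)
{
    if (shmem_exit_count >= MAX_CALLBACKS)
        return -1;
    shmem_exit_list[shmem_exit_count++] = fn;
    return 0;
}

/* Run and clear the shared memory callbacks, newest first. */
void run_shmem_exit(void)
{
    while (shmem_exit_count > 0)
        shmem_exit_list[--shmem_exit_count]();
}

/* Process exit: release shared memory first, then the rest. */
void run_proc_exit(void)
{
    run_shmem_exit();
    while (proc_exit_count > 0)
        proc_exit_list[--proc_exit_count]();
}

/* Demo callbacks that just count how often each kind has run. */
int shmem_runs = 0;
int proc_runs  = 0;
void demo_shmem_cleanup(void) { shmem_runs++; }
void demo_proc_cleanup(void)  { proc_runs++; }
```

After registering one callback of each kind, run_shmem_exit() runs only the shared memory one, and a later run_proc_exit() runs only the process-exit one, since the shared memory list has already been drained.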

--
Bruce Momjian | 830 Blythe Avenue
maillist(at)candle(dot)pha(dot)pa(dot)us | Drexel Hill, Pennsylvania 19026
+ If your life is a hard drive, | (610) 353-9879(w)
+ Christ can be your backup. | (610) 853-3000(h)
