Re: [HACKERS] Problem after removal of exec(), help

From: dg(at)illustra(dot)com (David Gould)
To: maillist(at)candle(dot)pha(dot)pa(dot)us (Bruce Momjian)
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Problem after removal of exec(), help
Date: 1998-06-23 01:51:39
Message-ID: 9806230151.AA07582@hawk.illustra.com
Lists: pgsql-hackers

>
> Since the removal of exec(), Thomas has seen, and I have confirmed that
> if a backend crashes, and the postmaster must reset the shared memory,
> no backends can connect anymore. One way to reproduce it is to run the
> regression tests, which on their last test will crash for an unrelated
> reason. However, it will not allow you to restart any more backends.
>
> The error it gets is:
>
> Failed Assertion("!((((unsigned long)nextElem) > ShmemBase)):", File: "shmqueue.c", Line: 83)
> !((((unsigned long)nextElem) > ShmemBase)) (0) [No such file or directory]
>
> In this case nextElem = ShmemBase, so it is not greater. Removing the
> Assert() still does not make things work, so there must be something
> else.
>
> Now, the problem is probably not at that exact spot, but somewhere
> deeper. There are two differences between the old non-exec() behavior
> and new behavior. In the old setup, the backend had all its global
> variables initialized, while in the new no-exec case, they take the
> global variable values from the postmaster. Second, the old setup had
> each backend attaching to the shared memory, while the new setup has
> them inheriting the shared memory from the fork().
>
> My guess is that there is something buggy about the reset code in
> postmaster.c that was not resetting completely, but the initialization
> of the global variables in the backend was masking the bug, or the
> attach() operation did some extra work that we now need to do when
> resetting the shared memory:
>
> static void
> reset_shared(short port)
> {
>     ipc_key = port * 1000 + shmem_seq * 100;
>     CreateSharedMemoryAndSemaphores(ipc_key);
>     ActiveBackends = FALSE;
>     shmem_seq += 1;
>     if (shmem_seq >= 10)
>         shmem_seq -= 10;
> }
>
>
> I am stumped on this.
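
For reference, the key arithmetic in the quoted reset_shared() can be tried on its own. This is only a sketch: next_ipc_key() is a hypothetical helper, not a function in the tree, and it leaves out the CreateSharedMemoryAndSemaphores() call and the ActiveBackends reset entirely.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the arithmetic in reset_shared():
 * compute the key for the current sequence number, then advance the
 * sequence, wrapping after ten resets. Nothing here touches real
 * shared memory or semaphores.
 */
static long
next_ipc_key(short port, int *shmem_seq)
{
    long key;

    assert(*shmem_seq >= 0 && *shmem_seq < 10);

    key = (long) port * 1000 + *shmem_seq * 100;

    *shmem_seq += 1;
    if (*shmem_seq >= 10)
        *shmem_seq -= 10;

    return key;
}
```

With port 5432 and the sequence starting at 0, this gives 5432000, 5432100, and so on up to 5432900, after which the eleventh reset reuses 5432000.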

No help here, but a request:

Could we have an option to do the fork()/exec() the old way as well as the
new sleek fork()-only way? I want to do some performance testing under gprof
and want to be able to replace my postgres binary with a shell script that
saves the gmon.out file, e.g.:

#!/bin/sh
# Run the real binary, then keep its profile under a per-run name.
postgres.bin "$@"
mv gmon.out gmon.$$

This won't work unless an exec() is done.
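
To make the request concrete, here is a rough sketch of the kind of switch I mean. It is not the postmaster's code: start_backend(), wait_backend(), backend_main() and the use_exec flag are all made-up names for illustration. With use_exec set, the child execs whatever sits at the given path, so a wrapper script gets run; without it, the child just continues in the parent's image and the script is never looked at.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the backend's in-process entry point. */
static int
backend_main(void)
{
    return 0;
}

/*
 * Start one child process.
 *
 * use_exec != 0: the child replaces itself with the program at path,
 *                which may be a shell script with a #! line.
 * use_exec == 0: the child keeps the parent's image (globals, mapped
 *                shared memory) and calls backend_main() directly.
 *
 * Returns the child's pid in the parent, or -1 if fork() fails.
 */
static pid_t
start_backend(int use_exec, const char *path, char *const argv[])
{
    pid_t pid = fork();

    if (pid != 0)
        return pid;             /* parent, or fork() failure (-1) */

    if (use_exec)
    {
        assert(path != NULL && argv != NULL);
        execv(path, argv);
        _exit(127);             /* reached only if execv() failed */
    }

    _exit(backend_main());
}

/* Wait for a child; return its exit status, or -1 on error. */
static int
wait_backend(pid_t pid)
{
    int status;

    if (pid < 0 || waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

Only the use_exec path ever consults the file on disk, which is why the wrapper trick depends on it.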

-dg

David Gould dg(at)illustra(dot)com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
"Don't worry about people stealing your ideas. If your ideas are any
good, you'll have to ram them down people's throats." -- Howard Aiken
