| From: | Amir Rohan <amir(dot)rohan(at)mail(dot)com> |
|---|---|
| To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
| Cc: | pgsql-bugs(at)postgresql(dot)org |
| Subject: | Re: BUG #13643: Should a process dying bring postgresql down, or not? |
| Date: | 2015-09-28 21:42:00 |
| Message-ID: | 5609B428.6020006@mail.com |
| Lists: | pgsql-bugs |
On 09/28/2015 12:06 AM, Alvaro Herrera wrote:
> Amir Rohan wrote:
>> On 09/27/2015 09:59 PM, Alvaro Herrera wrote:
>>> amir(dot)rohan(at)mail(dot)com wrote:
>>>
>>>> postgres 2181 0.0 0.1 134468 9504 pts/0 T 03:34 0:00 /usr/local/pgsql/bin/postgres -D /home/local/pg/s1
>>>> postgres 2183 0.0 0.0 134576 4168 ? Ss 03:34 0:00 postgres: checkpointer process
>>>> postgres 2184 0.0 0.0 134604 2844 ? Ss 03:34 0:00 postgres: writer process
>>>> postgres 2185 0.0 0.0 134468 2780 ? Ss 03:34 0:00 postgres: wal writer process
>>>> postgres 2186 0.0 0.0 0 0 ? Zs 03:34 0:00 [postgres] <defunct> <<<<<<<<<<<<<<< dead process
>>>> postgres 2187 0.0 0.0 127300 2204 ? Ss 03:34 0:00 postgres: stats collector process
>>>> postgres 2193 0.0 0.0 118164 2696 pts/0 T 03:34 0:00 pg_basebackup -D /home/local/pg/backup -p 57833 --format=t -x
>>>> postgres 2194 0.0 0.0 134916 6016 ? Ss 03:34 0:00 postgres: wal sender process user1 [local] sending backup "pg_basebackup base backup"
>>>
>>> That postmaster is in STOPped mode is the issue here. That doesn't
>>> happen unless you take specific action to do that.
>>
>> I hadn't noticed that. That looks like I suspended pg_ctl during start,
>> but with the backup in progress already, it's not clear how I managed
>> that state. There was no kill -SIGSTOP involved...
>
> Suspending a process *is* sending SIGSTOP. You may not have sent
> SIGSTOP explicitly, but the shell would have done it if you suspended
> the process.
>
> Since pg_ctl is not normally long-lived, I'm not sure how you ended up
> suspending it.
>
>> After killing some subprocesses at random I do see postgres
>> restarting the whole group once one goes down, if/once it's
>> running/unsuspended.
>
> Well, doing things randomly is unlikely to teach you much ...
>
Pardon my earlier HTML response, I had to use the webmail interface at
the time. Sending again as text.
>
>
> Sent: Monday, September 28, 2015 at 12:06 AM
> From: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>
> To: "Amir Rohan" <amir(dot)rohan(at)mail(dot)com>
> Cc: pgsql-bugs(at)postgresql(dot)org
> Subject: Re: BUG #13643: Should a process dying bring postgresql down, or not?
> Amir Rohan wrote:
>> On 09/27/2015 09:59 PM, Alvaro Herrera wrote:
>> > amir(dot)rohan(at)mail(dot)com wrote:
>> >
>> >> postgres 2181 0.0 0.1 134468 9504 pts/0 T 03:34 0:00 /usr/local/pgsql/bin/postgres -D /home/local/pg/s1
>> >> postgres 2183 0.0 0.0 134576 4168 ? Ss 03:34 0:00 postgres: checkpointer process
>> >> postgres 2184 0.0 0.0 134604 2844 ? Ss 03:34 0:00 postgres: writer process
>> >> postgres 2185 0.0 0.0 134468 2780 ? Ss 03:34 0:00 postgres: wal writer process
>> >> postgres 2186 0.0 0.0 0 0 ? Zs 03:34 0:00 [postgres] <defunct> <<<<<<<<<<<<<<< dead process
>> >> postgres 2187 0.0 0.0 127300 2204 ? Ss 03:34 0:00 postgres: stats collector process
>> >> postgres 2193 0.0 0.0 118164 2696 pts/0 T 03:34 0:00 pg_basebackup -D /home/local/pg/backup -p 57833 --format=t -x
>> >> postgres 2194 0.0 0.0 134916 6016 ? Ss 03:34 0:00 postgres: wal sender process user1 [local] sending backup "pg_basebackup base backup"
>> >
>> > That postmaster is in STOPped mode is the issue here. That doesn't
>> > happen unless you take specific action to do that.
>>
>> I hadn't noticed that. That looks like I suspended pg_ctl during start,
>> but with the backup in progress already, it's not clear how I managed
>> that state. There was no kill -SIGSTOP involved...
>
> Suspending a process *is* sending SIGSTOP. You may not have sent
> SIGSTOP explicitly, but the shell would have done it if you suspended
> the process.
>
I *know*. But as you can see, the backup process is already underway.
That means pg_ctl had returned by then, and I had issued the
pg_basebackup command. Since I didn't manually send a SIGSTOP,
and postgres was already detached by then, I don't know how it
could have gotten suspended.
> Since pg_ctl is not normally long-lived, I'm not sure how you ended up
> suspending it.
>
Exactly.
>> After killing some subprocesses at random I do see postgres
>> restarting the whole group once one goes down, if/once it's
>> running/unsuspended.
>
> Well, doing things randomly is unlikely to teach you much ...
>
Well, it can teach you which electric socket will
electrocute you when poked with a fork. That's useful data.
Amir
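
For anyone who wants to reproduce the state in the process listing quoted
above (parent in state T, dead child stuck in state Z), here is a minimal
sketch. It does not involve postgres at all: the "supervisor" and "worker"
are hypothetical stand-ins for the postmaster and one of its children, the
helper names are made up for illustration, and reading /proc assumes Linux.
It only illustrates that a stopped parent cannot reap a dead child until it
is continued.

```python
import os
import signal
import subprocess
import sys
import time

# Hypothetical stand-in for a postmaster: start one worker, then wait to reap it.
SUPERVISOR = r"""
import subprocess, sys
worker = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
print(worker.pid, flush=True)
worker.wait()  # the worker is reaped here, once this process is running
"""


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the pid is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            # The state is the first field after the closing paren of the command name.
            return f.read().rsplit(")", 1)[1].split()[0]
    except (FileNotFoundError, ProcessLookupError):
        return None


def wait_for_state(pid, wanted, timeout=5.0):
    """Poll until the process reaches the wanted state; return the last state seen."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc_state(pid) == wanted:
            return wanted
        time.sleep(0.05)
    return proc_state(pid)


def demo():
    sup = subprocess.Popen(
        [sys.executable, "-c", SUPERVISOR], stdout=subprocess.PIPE, text=True
    )
    worker_pid = int(sup.stdout.readline())

    os.kill(sup.pid, signal.SIGSTOP)              # suspend the supervisor
    sup_state = wait_for_state(sup.pid, "T")

    os.kill(worker_pid, signal.SIGKILL)           # the worker dies while its parent is stopped
    worker_state = wait_for_state(worker_pid, "Z")

    os.kill(sup.pid, signal.SIGCONT)              # resume; the supervisor can now reap
    after = wait_for_state(worker_pid, None)

    sup.wait()
    sup.stdout.close()
    return sup_state, worker_state, after


if __name__ == "__main__":
    print(demo())
```

On a Linux box this should print ('T', 'Z', None): the supervisor is
stopped, the killed worker lingers as a zombie, and the zombie goes away
only after the supervisor receives SIGCONT.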