Re: Idea for improving buildfarm robustness

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Idea for improving buildfarm robustness
Date: 2015-09-29 22:41:33
Message-ID: 560B139D.4090208@agliodbs.com
Lists: pgsql-hackers

On 09/29/2015 12:47 PM, Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> In general, having the postmaster survive deletion of PGDATA is
>> suboptimal. In rare cases of having it survive installation of a new
>> PGDATA (via PITR restore, for example), I've even seen the zombie
>> postmaster corrupt the data files.
>
> However ... if you'd simply deleted everything *under* $PGDATA but not
> that directory itself, then this type of failure mode is 100% plausible.
> And that's not an unreasonable thing to do, especially if you've set
> things up so that $PGDATA's parent is not a writable directory.

I don't remember the exact setup, but that is likely what happened: the
contents of $PGDATA were deleted, not the directory itself. Roughly a
third of the systems I monitor have a root-owned mount point as PGDATA's
parent directory.

> Testing accessibility of "global/pg_control" would be enough to catch this
> case, but only if we do it before you create a new one. So that seems
> like an argument for making the test relatively often. The once-a-minute
> option is sounding better and better.
>
> We could possibly add additional checks, like trying to verify that
> pg_control has the same inode number it used to. But I'm afraid that
> would add portability issues and false-positive hazards that would
> outweigh the value.

It's not worth doing extra stuff for this.
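
For anyone following along, here is a rough sketch of the accessibility
test discussed above, written in Python purely for illustration. It is
not the postmaster's code; the names pg_control_accessible and demo are
made up for this example, and the temporary directory stands in for a
real $PGDATA. It shows the check passing, failing after everything under
the directory is removed, and passing again once a new pg_control has
been put in place, which is why the check only helps if it runs before
the restore.

```python
import os
import shutil
import tempfile


def pg_control_accessible(pgdata):
    """Return True if global/pg_control under pgdata can be opened for reading.

    Hypothetical helper: a stand-in for the periodic (e.g. once-a-minute)
    check proposed in the thread, not an actual PostgreSQL function.
    """
    path = os.path.join(pgdata, "global", "pg_control")
    try:
        with open(path, "rb"):
            return True
    except OSError:
        return False


def demo():
    """Simulate wiping the contents of a data directory, then restoring it."""
    with tempfile.TemporaryDirectory() as pgdata:
        control = os.path.join(pgdata, "global", "pg_control")

        # Fake a minimal data directory containing only global/pg_control.
        os.makedirs(os.path.dirname(control))
        with open(control, "wb") as f:
            f.write(b"\0" * 16)
        before = pg_control_accessible(pgdata)

        # Delete everything *under* the directory, but not the directory itself.
        for entry in os.listdir(pgdata):
            shutil.rmtree(os.path.join(pgdata, entry))
        after_wipe = pg_control_accessible(pgdata)

        # A restore creates a fresh pg_control; a plain accessibility
        # check cannot tell it apart from the original file.
        os.makedirs(os.path.dirname(control))
        with open(control, "wb") as f:
            f.write(b"\0" * 16)
        after_restore = pg_control_accessible(pgdata)

        return before, after_wipe, after_restore
```

Calling demo() returns the three results in order (before the wipe,
after the wipe, after the simulated restore). The last one being true
again is the window the once-a-minute interval is meant to keep small.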

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
