Re: Idea for improving buildfarm robustness

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Idea for improving buildfarm robustness
Date: 2015-10-01 22:36:06
Message-ID: 9063.1443738966@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> It strikes me that a different approach that might be of value would
> be to re-read postmaster.pid and make sure that (a) it's still there
> and (b) it still contains the current postmaster's PID. This would
> be morally equivalent to what Jim suggests above, and it would dodge
> the checkpointer-destroys-the-evidence problem, and it would have the
> additional advantage that we'd notice when a brain-dead DBA decides
> to manually remove postmaster.pid so he can start a new postmaster.
> (It's probably far too late to avoid data corruption at that point,
> but better late than never.)

> This is still not bulletproof against all overwrite-with-a-backup
> scenarios, but it seems like a noticeable improvement over what we
> discussed yesterday.

Here's a rewritten patch that looks at postmaster.pid instead of
pg_control. It should be effectively the same as the prior patch in terms
of response to directory-removal cases, and it should also catch many
overwrite cases.

regards, tom lane

Attachment Content-Type Size
die-on-no-pgdata-2.patch text/x-diff 6.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2015-10-01 23:13:12 Re: ON CONFLICT issues around whole row vars,
Previous Message Josh Berkus 2015-10-01 22:30:25 Re: Freeze avoidance of very large table.