Re: Idea for improving buildfarm robustness

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Idea for improving buildfarm robustness
Date: 2015-09-30 06:50:22
Message-ID: 560B862E.3030001@BlueTreble.com
Lists: pgsql-hackers

On 9/29/15 4:13 PM, Alvaro Herrera wrote:
> Joe Conway wrote:
>> On 09/29/2015 01:48 PM, Alvaro Herrera wrote:
>
>>> I remember it, but I'm not sure it would have helped you. As I recall,
>>> your trouble was that after a reboot the init script decided to initdb
>>> the mount point -- postmaster wouldn't have been running at all ...
>>
>> Right, which the init script no longer does as far as I'm aware, so
>> hopefully it will never happen again to anyone.
>
> Yeah.
>
>> But it was still a case where the postmaster started on one copy of
>> PGDATA (the newly init'd copy), and then the contents of the real PGDATA
>> was swapped in (when the filesystem was finally mounted), causing
>> corruption to the production data.
>
> Ah, I didn't remember that part of it, but it makes sense.

Ouch. So it sounds like there's value to seeing if pg_control isn't what
we expect it to be.

Instead of looking at the inode (portability problem), what if
pg_control contained a random number that was created at initdb time? On
startup the postmaster would read that value, and if it ever changed
afterwards you'd know something had gone wrong.

Perhaps even stronger would be to write a new random value on startup;
that way you'd know if an old copy accidentally got put in place.
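
A minimal sketch of the idea, in Python rather than postmaster C code, and
assuming a standalone token file instead of a real pg_control field. The
names write_token, read_token, data_dir_unchanged and TOKEN_BYTES are
hypothetical and exist only for illustration; nothing here is an existing
PostgreSQL API.

```python
import os
import secrets

TOKEN_BYTES = 16  # assumed token size; the thread does not specify one


def write_token(control_path: str) -> bytes:
    """Write a fresh random token to the file and return it.

    Called at "initdb" time, or on every startup for the stronger variant.
    """
    token = secrets.token_bytes(TOKEN_BYTES)
    tmp_path = control_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(token)
        f.flush()
        os.fsync(f.fileno())  # make sure the token reaches disk
    os.replace(tmp_path, control_path)  # swap the new file into place
    return token


def read_token(control_path: str) -> bytes:
    """Read the token currently stored on disk."""
    with open(control_path, "rb") as f:
        return f.read(TOKEN_BYTES)


def data_dir_unchanged(control_path: str, expected: bytes) -> bool:
    """Return True if the on-disk token still matches the remembered one.

    A missing file counts as a change, since the directory underneath the
    running process is evidently not the one it started with.
    """
    try:
        return read_token(control_path) == expected
    except FileNotFoundError:
        return False
```

The running process would keep the token it read (or wrote) at startup in
memory and call data_dir_unchanged() periodically. If a different data
directory gets mounted or copied over the original, the on-disk token no
longer matches. Writing a new token at each startup also catches an older
copy of the same cluster being put back in place.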
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
