Re: Hard limit on WAL space used (because PANIC sucks)

From: Daniel Farina <daniel(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hard limit on WAL space used (because PANIC sucks)
Date: 2013-06-08 02:57:17
Message-ID: CAAZKuFZVtP6n9HjeQbVvBJQUmpp8ZNVkHxqza-sOmk0h4SUgRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 7, 2013 at 12:14 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Right now, what we're telling users is "You can have continuous backup
> with Postgres, but you'd better hire and expensive consultant to set it
> up for you, or use this external tool of dubious provenance which
> there's no packages for, or you might accidentally cause your database
> to shut down in the middle of the night."

Inverted and just as well supported: "if you want to not accidentally
lose data, you better hire an expensive consultant to check your
systems for all sorts of default 'safety = off' features." This
being but the hypothetical first one.

Furthermore, I see no reason why high quality external archiving
software cannot exist. Maybe some even exists already, and no doubt
they can be improved and the contract with Postgres enriched to that
purpose.

Contrast: JSON, where the stable OID in the core distribution helps
pragmatically punt on a particularly sticky problem (extension
dependencies and non-system OIDs), I can't think of a reason an
external archiver couldn't do its job well right now.

> At which point most sensible users say "no thanks, I'll use something else".

Oh, I lost some disks, well, no big deal, I'll use the archives. Surprise!

<forensic analysis ensues>

So, as it turns out, it has been dropping segments at times because of
systemic backlog for months/years.

Alternative ending:

Hey, I restored the database.

<later> Why is the state so old? Why are customers getting warnings
that their (thought paid) invoices are overdue? Oh crap, the restore
was cut short by this stupid option and this database lives in the
past!

Fin.

I have a clear bias in experience here, but I can't relate to someone
who sets up archives but is totally okay losing a segment unceremoniously,
because it only takes one of those once in a while to make a really,
really bad day. Who is this person that lackadaisically archives, and
are they just fooling themselves? And where are these archivers that
enjoy even a modicum of long-term success that are not reliable? If
one wants to casually drop archives, how is someone going to find out
and freak out a bit? Per experience, logs are pretty clearly
hazardous to the purpose.

Basically, I think the default that opts one into danger is not good,
especially since the system is starting from a position of "do too
much stuff and you'll crash."

Finally, it's not that hard to teach any archiver how to no-op at
user-peril, or perhaps Postgres can learn a way to do this expressly
to standardize the procedure a bit to ease publicly shared recipes, perhaps.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2013-06-08 08:06:46 Re: Possible bug in cascaded standby
Previous Message Tom Lane 2013-06-08 01:52:00 Re: Last log line for log_temp_files is disassociated with query