Re: Out of space situation and WAL log pre-allocation (was Tablespaces)

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'Joe Conway'" <mail(at)joeconway(dot)com>
Cc: "'Gavin Sherry'" <swm(at)linuxworld(dot)com(dot)au>, <tswan(at)idigx(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Out of space situation and WAL log pre-allocation (was Tablespaces)
Date: 2004-03-03 21:40:09
Message-ID: 006601c40168$1a838530$5baa87d9@LaptopDellXP
Lists: pgsql-hackers

>Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Joe Conway <mail(at)joeconway(dot)com> writes:
> > Tom Lane wrote:
> >> Joe Conway <mail(at)joeconway(dot)com> writes:
> >>> Maybe specify an archive location (that of course could be on a
> >>> separate partition) that the external archiver should check in
> >>> addition to the normal WAL location. At some predetermined interval,
> >>> push WAL log segments no longer needed to the archive location.
> >>
> >> Does that really help? The panic happens when you fill the "normal"
> >> and "archive" partitions, how's that different from one partition?
>
> > I see your point. But it would allow you to use a relatively modest
> > local partition for WAL segments, while you might be using a 1TB
> > netapp tray over NFS for the archive segments.
>
> Fair enough, but it seems to me that that sort of setup really falls in
> the category of a user-defined archiving process --- that is, the hook
> that Postgres calls will push WAL segments from the local partition to
> the NFS server, and then pushing them off NFS to tape is the
> responsibility of some other user-defined subprocess. Database panic
> happens if and only if the local partition overflows. I don't see that
> making Postgres explicitly aware of the secondary NFS arrangement will
> buy anything.

Tom's last sentence there summarises the design I was working with. I
had considered Joe's suggested approach (which was Oracle's also).

However, the PITR design will come with a usable low-function program
which can easily copy logs from pg_xlog to another archive directory.
That's needed as a test harness anyway, so it may as well be part of the
package. You'd be able to use that in production to copy xlogs to
another larger directory as a staging area to tape/failover on another
system: effectively Joe's idea is catered for in the basic package.
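
As a rough illustration of what such a low-function copy program might
do (this is only a sketch, not the program from the PITR design; the
function name copy_new_segments and the ".partial" suffix are made up
for the example, and it assumes the caller only points it at segments
that are already complete):

```python
import os
import shutil


def copy_new_segments(xlog_dir, archive_dir):
    """Copy files from xlog_dir that are not yet present in archive_dir.

    Hypothetical helper: returns the sorted list of file names copied
    in this pass. It does not decide whether a segment is finished.
    """
    os.makedirs(archive_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(xlog_dir)):
        src = os.path.join(xlog_dir, name)
        dst = os.path.join(archive_dir, name)
        # Skip subdirectories and anything already archived.
        if not os.path.isfile(src) or os.path.exists(dst):
            continue
        # Copy under a temporary name, then rename, so that anything
        # reading the archive directory never sees a half-written file.
        tmp = dst + ".partial"
        shutil.copyfile(src, tmp)
        os.replace(tmp, dst)
        copied.append(name)
    return copied
```

Run repeatedly (from cron, say), each pass picks up only the files that
have appeared since the last one; a second process could then move files
from the archive directory on to tape or a failover system.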

Anyway, I'm answering questions before publishing the design as it
stands... though people do keep spurring me to refine it as I'm writing
it down! That's why it's good to document it, I guess.

> > I guess if the archive partition fills up, I would err on the side of
> > dropping archive segments on the floor.
>
> That should be user-scriptable policy, in my worldview.

Hmmm. Very difficult that one.

My experience is in commercial systems. Dropping archive segments on the
floor is just absolutely NOT GOOD, if that is the only behaviour. The
whole purpose of having a dbms is so that you can protect your business
data, while using it. Such behaviour would most likely be a barrier to
wider commercial adoption. [Oracle and other dbms will freeze when this
situation is hit, rather than continue and drop archive logs.]

User-selectable behaviour? OK. That's how we deal with fsync; I can
relate to that. That hadn't been part of my thinking because of the
importance I'd attached to the log files themselves, but I can go with
that, if that's what was meant.

So, suppose we had a parameter called Wal_archive_policy with 3 settings:
None = no archiving
Optimistic = archive, but if for some reason log space runs out then
  make space by dropping the oldest archive logs
Strict = if log space runs out, stop further write transactions from
  committing, by whatever means, even if this takes down the dbms.

That way, we've got something akin to transaction isolation level with
various levels of protection.
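
To make the three settings concrete, here is a small sketch of the
decision taken when log space runs out. Everything in it is assumed for
illustration: the names WalArchivePolicy, LogSpaceExhausted and
handle_out_of_log_space, and the model of archive logs as a list of
segment names ordered oldest first. What None should do in this
situation is not specified above, so the sketch simply drops nothing.

```python
from enum import Enum


class WalArchivePolicy(Enum):
    NONE = "none"              # no archiving
    OPTIMISTIC = "optimistic"  # drop oldest archive logs to make space
    STRICT = "strict"          # stop write transactions instead


class LogSpaceExhausted(Exception):
    """Raised under STRICT to signal that writes must stop."""


def handle_out_of_log_space(policy, archive_logs, needed):
    """Apply the policy when log space has run out.

    archive_logs: list of segment names, oldest first (mutated in place).
    needed: number of segments' worth of space that must be freed.
    Returns the list of segment names that were dropped.
    """
    if policy is WalArchivePolicy.OPTIMISTIC:
        # Sacrifice the oldest archive logs so the dbms keeps running.
        dropped = archive_logs[:needed]
        del archive_logs[:needed]
        return dropped
    if policy is WalArchivePolicy.STRICT:
        # Protect the logs at the cost of availability.
        raise LogSpaceExhausted("log space exhausted; refusing writes")
    # NONE: nothing is being archived, so there is nothing to drop
    # (an assumption of this sketch).
    return []
```

For example, with Optimistic and archive logs ["a", "b", "c"], freeing
one segment drops "a" and leaves ["b", "c"]; with Strict the same call
raises instead and no log is lost.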

Best Regards, Simon Riggs
