Re: NOT LOGGED options (was Point in Time Recovery )

From: "Simon(at)2ndquadrant(dot)com" <simon(at)2ndquadrant(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Klaus Naumann" <kn(at)mgnet(dot)de>, <markw(at)osdl(dot)org>, "Pgsql-Hackers(at)Postgresql(dot) Org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: NOT LOGGED options (was Point in Time Recovery )
Date: 2004-08-14 09:39:23
Message-ID: NOEFLCFHBPDAFHEIPGBOAEHPCCAA.simon@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-hackers pgsql-patches

> Tom Lane wrote
> "Simon(at)2ndquadrant(dot)com" <simon(at)2ndquadrant(dot)com> writes:
> > It would be my intention (in 8.1) to make those available via
> switches e.g.
> > NOT LOGGED options on CREATE INDEX and COPY, to allow users to take
> > advantage of the no logging optimization without turning off PITR system
> > wide. (Just as this is possible in Oracle and Teradata).
>
> Isn't this in direct conflict with your opinion above? And I cannot say
> that I think this one is a good idea. We do not have support for
> selective catalog xlogging; if you do something like this then you
> *will* have a broken database after recovery, because it will contain
> those indexes but with invalid contents.

No, its not in direct conflict. Turning OFF archive_mode would have a system
wide effect. The options described allow individual applications to make a
choice about whether certain very large operations are recoverable, or not.
I don't ever personally want to turn off system wide PITR, but there will be
times when I choose to avoid overhead on individual ops when the situation
dictates. This goes with your oft-mentioned dislike of systems that think
they know better than you do...

The first two optimizations have been included in 8.0 when archive_mode is
off. If there is a problem, then it will effect crash recovery of those
systems also. I suggest using exactly this optimisation, though under user
(application) control, rather than sysadmin control.

The challenges you mention have a solution. I wanted to add these to TODO,
not yet to discuss detailed implementation.

> > I would also aim to make the first Insert Select into an empty table not
> > logged (optionally). This is an important optimization for
> Oracle, teradata
> > and DB2 (which uses NOT LOGGED INITIALLY).
>
> This is even worse: not only do you have a broken database, but you have
> no way to recover. (At least with an unlogged index you could fix it by
> REINDEX.) If you don't care about longevity of the table, then make it
> a temp table.
>

It is frequently possible to use that route, though the option remains in
frequent use in other situations.

> The fact that Oracle does it does not automatically make it a good idea.
>

Amen to that. You will note that unless compatability has been a
requirement, there have been times I have not followed the Oracle path, e.g.
PITR design.

I admit it must seem strange that I tried so hard to put PITR in place, only
to suggest removing it, optionally...

Overall, the options I describe here have been in production use in major
enterprise Data Warehouse systems for almost 15 years now. Oracle and DB2
copied the original Teradata implementation; slowly because, they too,
didn't quickly or easily accept the wisdom. There is abosultely no doubt of
the true value of these optimisations - the TPC-H tests for all vendors make
use of those (hidden in the details of which load utility options are used,
or simply the default behaviour).

Logging only has value when the mean time to recover is low enough to make
recovery worthwhile. This can catch you in a bind because you have to decide
whether to reduce MTTR at the expense of 100% data recovery. For some big
systems, recovery is only an option if you exclude the biggest table(s). In
a Data Warehouse, where data is loaded in large volumes, it may only be
feasible to load it when you have this optimisation. In a recovery
situation, re-loading the largest fact tables from their original source
data files is more likely to be the best option, or in some cases, skipped
entirely in favour of loading new data.

I don't claim that everybody would want this, only that it is an extremely
beneficial optimisation for many very large databases - which is much of my
focus.

You've pointed out that I'm new "round here", which is certainly true - but
I have been many places... There are and will be many differences in
thinking that emerge from this; I regard all of this as synergy, not
argument.

Best Regards, Simon Riggs

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Goulet, Dick 2004-08-14 14:19:23 Re: Pseudo-Off-topic-survey: Opinions about future of Postgresql(MySQL)?
Previous Message Gregory S. Williamson 2004-08-14 06:55:57 Re: Pseudo-Off-topic-survey: Opinions about future ofPostgresql(MySQL)?

Browse pgsql-hackers by date

  From Date Subject
Next Message Gaetano Mendola 2004-08-14 10:55:16 8.0beta1 hot spare how to
Previous Message Peter Eisentraut 2004-08-14 09:00:00 Re: Why is WIN 1250 client only?

Browse pgsql-patches by date

  From Date Subject
Next Message Christopher Kings-Lynne 2004-08-14 09:50:49 Re: to_timestamp overloaded to convert from Unix epoch
Previous Message Serguei Mokhov 2004-08-14 06:20:05 Translation updates for 8.0: initdb-ru, psql-ru