Re: Performance and WAL on big inserts/updates

From: Marty Scholes <marty(at)outputservices(dot)com>
To: sailesh(at)cs(dot)berkeley(dot)edu
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Performance and WAL on big inserts/updates
Date: 2004-03-12 03:38:12
Message-ID: 405130A4.5020106@outputservices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> The point is that with redo logging, you can just blindly apply the
> log to the data pages in question, without even really restarting the
> database.

I also am not a recovery expert, but I have watched it happen more than
once.

You bring up a good point. My (perhaps false) understanding with
recovery/redo was that the last checkpointed state was recreated, then
log changes that finished with a commit were applied and the rest discarded.

Actually, this approach would not be ideal, since the last checkpointed
state would be unavailable if any data file writes took place between
the checkpoint and the crash.

Further, post-checkpoint log entries would surely contain multiple
copies of the same data block, and there is no point in applying any but
the last log entry for a particular block.

Ok. To recap:

* Logging parsed statements don't apply to light updates and most
installations live mostly on light updates
* Logging parsed statements would not work if the checkpoint state could
not be recreated, which is likely the case.

So, this soluition won't work, and even if it did, would not apply to
the vast majority of users.

Hmmm... No wonder it hasn't been implemented yet. ;-)

Thanks again,
Marty

Sailesh Krishnamurthy wrote:
> (Just a note: my comments are not pg-specific .. indeed I don't know
> much about pg recovery).
>
>
>>>>>>"Marty" == Marty Scholes <marty(at)outputservices(dot)com> writes:
>>>>>
>
> Marty> If the DB state cannot be put back to a consistent state
> Marty> prior to a SQL statement in the log, then NO amount of
> Marty> logging will help. The idea is that the state can be put
> Marty> back to what it was prior to a particular log entry, be it
> Marty> raw datafile blocks or a SQL statement.
>
> The point is that with redo logging, you can just blindly apply the
> log to the data pages in question, without even really restarting the
> database.
>
> Note that in ARIES, recovery follows: (1) analysis, (2) redo
> _everything_ since last checkpoint, (3) undo losers.
>
> So logging carefully will indeed help get the system to a consistent
> state - actually after phase (2) above the system will be in precisely
> the state during the crash .. and all that's left to do is undo all
> the live transactions (losers).
>
> BTW, logging raw datafile blocks would be pretty gross (physical
> logging) and so ARIES logs the changes to each tuple in "logical"
> fashion .. so if only one column changes only that value (before and
> after images) are logged. This is what's called "physiological
> logging".
>
> Marty> See above. If this cannot be resolved prior to
> Marty> re-executing a statement in the log, then the problem is
> Marty> beyond ANY subsequent logging.
>
> Not true ! By applying the log entries carefully you should be able to
> restore the system to a consistent state.
>
> >> Having said that, page-oriented undo logging can be a pain when
> >> B-tree pages split. For higher concurrency, ARIES uses logical
> >> undo logging. In this case, the logs are akin to your "parsed
> >> statement" idea.
> >>
>
> Marty> Yes, my experience exactly. Maybe we are the only company
> Marty> on the planet that experiences this sort of thing. Maybe
>
> Well, logical undo is still at a much lower level than parsed
> statements. Each logical undo log is something like "delete key 5 from
> index xyz".
>
> Marty> Maybe this is not a "traditional" RDBMS app, but I am not
> Marty> in the mood to write my own storage infrastructure for it.
>
> I agree that your app has a lot of updates .. it's just that I'm not
> convinced that logical logging is a clean solution.
>
> I also don't have a solution for your problem :-)
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2004-03-12 04:09:11 Re: Default Stats Revisited
Previous Message Sailesh Krishnamurthy 2004-03-12 03:20:28 Re: Performance and WAL on big inserts/updates