Re: Plans for solving the VACUUM problem

From:	Barry Lind <barry(at)xythos(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Plans for solving the VACUUM problem
Date:	2001-05-22 00:18:18
Message-ID:	3B09B04A.5060806@xythos.com
Lists:	pgsql-hackers

Mikheev, Vadim wrote:

>
> Ok, last reminder -:))
>
> On transaction abort, read WAL records and undo (rollback)
> changes made in storage. Would allow:
>
> 1. Reclaim space allocated by aborted transactions.
> 2. Implement SAVEPOINTs.
> Just to remind -:) - in the event of error discovered by server
> - duplicate key, deadlock, command mistyping, etc, - transaction
> will be rolled back to the nearest implicit savepoint setted
> just before query execution; - or transaction can be aborted by
> ROLLBACK TO <savepoint_name> command to some explicit savepoint
> setted by user. Transaction rolled back to savepoint may be continued.
> 3. Reuse transaction IDs on postmaster restart.
> 4. Split pg_log into small files with ability to remove old ones (which
> do not hold statuses for any running transactions).
>
> Vadim

This is probably not a good thread to add my two cents worth, but here
goes anyway.

The biggest issue I see with the proposed UNDO using WAL is that of
large or long-running transactions. It might be possible to solve this
problem with some extra work, but keep in mind that different types of
transactions (e.g. normal vs. bulk loads) require different amounts of
time and/or UNDO. To solve this problem, you really need
per-transaction limits, which seem difficult to implement.

I have no doubt that UNDO with WAL can be done. But is there some other
way of doing UNDO that might be just as good or better?

Part of what I see in this thread, reading between the lines, is that
some believe the long-term solution to many problems is to implement
an overwriting storage manager. Implementing UNDO via WAL is a
necessary step in that direction. Others seem to believe that the
non-overwriting storage manager has some life in it yet, and may even be
the storage manager for many releases to come. I don't know enough
about the internals to have any say in that discussion; however, the
grass isn't always greener on the other side of the fence (i.e. an
overwriting storage manager will come with its own set of problems/issues).

So let me throw out an idea for UNDO using the current storage manager.
First let me state that UNDO is a bit of a misnomer, since undo for
transactions is already implemented. That is what pg_log is all about.
The part of UNDO that is missing is savepoints (either explicit or
implicit), because pg_log doesn't capture the information for each
command in a transaction. So the question really becomes how to
implement savepoints with the current storage manager.

I am going to lay out one assumption that I am making:
1) Most transactions are either completely successful or completely
rolled back.
(If this weren't true, i.e. you really needed savepoints to partially
roll back changes, you couldn't be using the existing version of
postgresql at all.)

My proposal is:
1) create a new relation to store 'failed commands' for transactions.
This is similar to pg_log for transactions, but takes it to the
command level. Since it records only failed commands (or ranges of
failed commands), most transactions will not have any information
in this relation, per the assumption above.
2) Use the unused pg_log status (3 = unused, 2 = commit, 1 = abort, 0
= in progress) to mean that the transaction was committed but some
commands were rolled back (i.e. partial commit).
Again for the majority of transactions nothing will need to change,
since they will still be marked as committed or aborted.
3) Code that determines whether or not a tuple is committed
needs to be aware of this new pg_log status, and look in the new
relation to see if the particular command was rolled back, in order to
determine the committed status of the tuple. This subtly changes the
meaning of HEAP_XMIN_COMMITTED and related flags to reflect the
transaction and command status instead of just the transaction status.
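The three steps above can be sketched roughly as follows. This is only a toy model in Python, not PostgreSQL code: the names XidStatus, TransactionLog, failed_commands and insert_committed are all hypothetical, and the status numbering is the one given in step 2.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List, Tuple


class XidStatus(IntEnum):
    # Numbering follows step 2 of the proposal.
    IN_PROGRESS = 0
    ABORTED = 1
    COMMITTED = 2
    PARTIAL_COMMIT = 3  # the currently unused value, repurposed


@dataclass
class TransactionLog:
    """Toy stand-in for pg_log plus the proposed 'failed commands' relation."""
    status: Dict[int, XidStatus] = field(default_factory=dict)
    # xid -> inclusive (first_cid, last_cid) ranges of rolled-back commands
    failed_commands: Dict[int, List[Tuple[int, int]]] = field(default_factory=dict)

    def command_failed(self, xid: int, cid: int) -> bool:
        ranges = self.failed_commands.get(xid, [])
        return any(first <= cid <= last for first, last in ranges)

    def insert_committed(self, xmin: int, cmin: int) -> bool:
        """Did the command that created this tuple version commit?"""
        st = self.status.get(xmin, XidStatus.IN_PROGRESS)
        if st is XidStatus.COMMITTED:
            return True  # common case: no extra lookup needed
        if st is XidStatus.PARTIAL_COMMIT:
            # Only now consult the per-command relation (step 3).
            return not self.command_failed(xmin, cmin)
        return False  # aborted or still in progress


log = TransactionLog()
log.status[100] = XidStatus.COMMITTED
log.status[101] = XidStatus.ABORTED
log.status[102] = XidStatus.PARTIAL_COMMIT
log.failed_commands[102] = [(3, 4)]  # commands 3 and 4 were rolled back

print(log.insert_committed(102, 2))  # command 2 survived
print(log.insert_committed(102, 3))  # command 3 was rolled back
```

Only transactions marked PARTIAL_COMMIT ever touch failed_commands, which is the point of the assumption that most transactions either fully commit or fully abort.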

The runtime cost of this shouldn't be too high, since the committed
state is cached in HEAP_XMIN_COMMITTED et al. The added cost applies
only to the pass that needs to set these flags, and even then only when
the transaction wasn't completely successful (again, per my assumption
above).
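To illustrate the cost argument, here is a small Python sketch of how caching the result in the tuple's flag bits limits the expensive lookup to the first pass. It is a simplification under stated assumptions: HeapTuple, Checker and lookup are hypothetical names, and the bit values are chosen for illustration rather than taken from the source tree.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Illustrative bit values only.
HEAP_XMIN_COMMITTED = 0x1
HEAP_XMIN_INVALID = 0x2


@dataclass
class HeapTuple:
    xmin: int
    cmin: int
    infomask: int = 0  # cached status bits


class Checker:
    def __init__(self, lookup: Callable[[int, int], Optional[bool]]):
        # lookup(xmin, cmin) -> True (committed), False (rolled back),
        # or None (transaction still in progress)
        self.lookup = lookup
        self.slow_lookups = 0

    def xmin_committed(self, tup: HeapTuple) -> bool:
        if tup.infomask & HEAP_XMIN_COMMITTED:
            return True   # cached: no log access at all
        if tup.infomask & HEAP_XMIN_INVALID:
            return False  # cached: no log access at all
        self.slow_lookups += 1
        result = self.lookup(tup.xmin, tup.cmin)
        if result is None:
            return False  # outcome not final yet, so do not cache it
        tup.infomask |= HEAP_XMIN_COMMITTED if result else HEAP_XMIN_INVALID
        return result


def lookup(xmin: int, cmin: int) -> Optional[bool]:
    # Hypothetical log: xid 7 partially committed, its command 2 rolled back.
    if xmin == 7:
        return cmin != 2
    return None


checker = Checker(lookup)
good = HeapTuple(xmin=7, cmin=1)
bad = HeapTuple(xmin=7, cmin=2)
for _ in range(3):
    checker.xmin_committed(good)
    checker.xmin_committed(bad)
print(checker.slow_lookups)  # one slow lookup per tuple, not per check
```

Under this model the flags end up describing the combined transaction and command status, which matches the subtle change in meaning noted in step 3.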

Now I have no idea if what I am proposing is really doable or not. I
am just throwing this out as an alternative to WAL based
UNDO/savepoints. The reason I am doing this is that to me it seems to
leverage much of the existing infrastructure already in place that
performs undo for rolled-back transactions (all the tmin, tmax, cmin,
cmax stuff as well as vacuum). Also, it doesn't come with the problem of
large WAL files for large transactions.

Now, having said all of this, I realize that this doesn't solve the 4
billion transaction ID limit problem, or the large size of the pg_log
file with large numbers of transactions.

thanks,
--Barry

In response to

RE: Plans for solving the VACUUM problem at 2001-05-19 01:10:10 from Mikheev, Vadim