Re: [Bizgres-general] WAL bypass for INSERT, UPDATE and

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Greg Stark <gsstark(at)mit(dot)edu>, Rod Taylor <pg(at)rbt(dot)ca>, Qingqing Zhou <zhouqq(at)cs(dot)toronto(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [Bizgres-general] WAL bypass for INSERT, UPDATE and
Date: 2006-01-05 17:22:28
Message-ID: 200601051722.k05HMSM02052@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Simon Riggs wrote:
> > So, we need a name for EXCLUSIVE mode that suggests how it is different
> > from TRUNCATE, and in this case, the difference is that EXCLUSIVE
> > preserves the previous contents of the table on recovery, while TRUNCATE
> > does not. Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
> > Anyway, the keywords are easy to modify, even after the patch is
> > submitted. FYI, I usually go through keywords.c looking for a keyword
> > we already use.
>
> I'm very happy for suggestions on what these new modes are called.
>
> > > > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
> > > > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
> > > > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
> > >
> > > Would you mind stating again what you mean, just so I can understand
> > > this? Your summary isn't enough.
> >
> > New ALTER TABLE mode, perhaps call it PERSISTENCE:
> >
> > ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
> > ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
> >
> > These would drop or truncate all tables with this flag on a non-clean
> > start of the postmaster, and write something in the server logs.
> > However, I don't know that we have the code in place to DROP/TRUNCATE in
> > recovery mode, and it would affect all databases, so it could be quite
> > complex to implement. In this mode, no WAL logs would be written for
> > table modifications, though DDL commands would have to be logged.
>
> Right now, this will be a TODO item... it looks like it will take some
> thought to implement correctly.

OK, I know my suggestions have made it more complicated.

TODO added:

* Allow control over which tables are WAL-logged

Allow tables to bypass WAL writes and just fsync() dirty pages on
commit. To do this, only a single writer can modify the table, and
writes must happen only on new pages. Readers can continue accessing
the table. This would affect COPY, and perhaps INSERT/UPDATE too.
Another option is to avoid transaction logging entirely and truncate
or drop the table on crash recovery. These should be implemented
using ALTER TABLE, e.g. ALTER TABLE PERSISTENCE [ DROP | TRUNCATE |
STABLE | DEFAULT ]. Tables using non-default logging should not use
referential integrity with default-logging tables, and tables using
stable logging probably can not have indexes. [walcontrol]

> > ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
> >
> > Table contents are preserved across recoveries, but data modifications
> > can happen only one at a time. I don't think we have a lock mode that
> > does this, so I am worried a new lock mode will have to be created. A
> > simplified solution at this stage would be to take an exclusive lock on
> > the table, but really we just need a single-writer table lock, which I
> > don't think we have. initially this can implemented to only affect COPY
> > but later can be done for other commands.
>
> ExclusiveLock locks out everything apart from readers, no new lock mode
> AFAICS. Implementing that is little additional work for COPY.

Nice.

> Tom had a concern about setting this for I, U, D commands via the
> executor. Not sure what the details of that are, as yet.

That is much more complicated than the COPY-only idea, for sure. I am
thinking we could add the ALTER syntax and just do COPY at this stage,
meaning that I/U/D still do full logging until we get to improving them.
The big benefit is that the user API doesn't need to change when we
improve the code. In fact I think we could do the TRUNCATE/DROP easily
for I/U/D, but the STABLE option would require work and we don't need to
implement it in the first patch.

> We can use either of the unlogged modes for pg_dump, so I'd suggest its
> this one. Everybody happy with this being the new default in pg_dump, or
> should it be an option?
>
> > ALTER TABLE tab PERSISTENCE DEFAULT
> >
> > This would be our current default mode, which is full concurrency and
> > persistence.
>
> I'm thinking whether the ALTER TABLE statement might be better with two
> bool flags rather than a 3-state char.
>
> flag 1: ENABLE LOGGING | DISABLE LOGGING
>
> flag 2: FULL RECOVERY | TRUNCATE ON RECOVERY
>
> Giving 3 possible sets of options:
>
> -- the default
> ALTER TABLE mytable ENABLE LOGGING FULL RECOVERY; (default)
>
> -- EXCLUSIVE mode
> ALTER TABLE mytable DISABLE LOGGING FULL RECOVERY;
> ...which would be used like this
> ALTER TABLE mytable DISABLE LOGGING;
> COPY or other bulk data manipulation SQL
> ALTER TABLE mytable ENABLE LOGGING;
> ...since FULL RECOVERY is the default.
>
> -- multiuser temp table mode
> ALTER TABLE mytable DISABLE LOGGING TRUNCATE ON RECOVERY;
> ...which would usually be left on all the time
>
> which only uses one new keyword LOGGING and yet all the modes are fairly
> explicit as to what they do.
>
> An alternative might be the slightly more verbose:
> ALTER TABLE mytable DISABLE LOGGING FORCE EXCLUSIVE TABLE LOCK;
> which would be turned off by
> ALTER TABLE mytable ENABLE LOGGING;
>
> Comments?

I had the same idea originally, but avoided it because the logging
really does affect what other options you can use. For example, if you
want truncate on recovery, you certainly do not want logging, so it
seems the options are not really independent. In fact if someone asks
for truncate on recovery, do we automatically turn off logging for them,
or throw an error, or a warning. It just seemed too error-prone and
confusing, though perhaps more logical. Of course, if others like the
above, we can do it.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2006-01-05 17:27:05 Re: [Bizgres-general] WAL bypass for INSERT, UPDATE and
Previous Message Bruce Momjian 2006-01-05 16:04:14 Re: postmaster/postgres options assimilation plan