Re: Keeping separate WAL segments for each database

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Devrim GÜNDÜZ <devrim(at)gunduz(dot)org>, PostgreSQL Hackers ML <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Keeping separate WAL segments for each database
Date: 2010-07-01 00:52:18
Message-ID: AANLkTilmJhISCGI3VJdH4X7_DiZ0Rxqtojyan0OIgltr@mail.gmail.com
Lists: pgsql-hackers

2010/6/30 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> (thinks some more...)  Maybe you don't even need the fencepoint record
> per se.  I think all it's doing for you is making sure you don't process
> commit records on different streams out-of-order.  There might be some
> other, more direct way to do that.
>
> (thinks yet more...)  Actually the weak point in this scheme is that it
> wouldn't serialize transactions that occur in different databases and
> don't touch any shared catalogs.  It'd be entirely possible for T1 in
> DB1 to be reported committed, then T2 in DB2 to be reported committed,
> then a crash occurs after which T2 is seen committed and T1 not.  While
> this would be all right if the clients for T1 and T2 can't communicate,
> that isn't the real world.

Eh? If T1 and T2 are both reported committed, then they'll still be
committed after crash recovery, assuming synchronous_commit is turned
on. If not, our ACID has no D. Still, I suspect you're right that
there are serialization anomalies buried in here somewhere that can't
happen today.
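
To make the durability point concrete, here is a toy model, not
PostgreSQL code. WalStream, commit(), and after_crash() are names made
up for illustration, and the model assumes that a synchronous commit
means "flush the commit record to its stream before reporting success".
Under that assumption every acknowledged commit survives a crash, no
matter which stream it landed on.

```python
class WalStream:
    """Hypothetical in-memory stand-in for one WAL stream."""

    def __init__(self):
        self.records = []   # every record written, in insertion order
        self.flushed = 0    # how many records are known to be on disk

    def append(self, rec):
        self.records.append(rec)

    def flush(self):
        self.flushed = len(self.records)

    def after_crash(self):
        # Only the flushed prefix survives a crash.
        return self.records[:self.flushed]


def commit(stream, xid, acked):
    """Assumed synchronous commit: flush first, then report success."""
    stream.append(("COMMIT", xid))
    stream.flush()
    acked.append(xid)


acked = []
s1, s2 = WalStream(), WalStream()
commit(s1, "T1", acked)            # T1 in DB1, reported committed
commit(s2, "T2", acked)            # T2 in DB2, reported committed
s1.append(("COMMIT", "T3"))        # written, never flushed, never reported

# Transactions whose commit records are still present after the crash.
survivors = {xid for s in (s1, s2) for _, xid in s.after_crash()}
```

This says nothing about the ordering anomalies, only that "reported
committed" implies "present after recovery" when the flush comes first.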

At any rate, segregating WAL by database isn't really the design goal.
It would be much nicer if we could find a way to support N>1 WAL
streams without requiring that they be segregated by database. We'd
like to be able to write WAL faster and commit faster during normal
operation, and to replay WAL more quickly during recovery, especially
archive recovery.

You need to make sure not only that you replay commit records in
order, but also that, for example, you don't replay an
XLOG_HEAP2_CLEAN record too early.
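
One possible way to picture the ordering requirement, purely as a
sketch: assume every record carries a global sequence number assigned
at insert time (the "seq" field below is hypothetical, nothing like it
is proposed in this thread), and that each stream is internally sorted
by it. Replay then becomes a k-way merge across the streams, so a
cleanup record cannot be applied ahead of records from other streams
that were inserted before it.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    seq: int      # hypothetical global sequence number, assigned at insert
    stream: int   # which WAL stream the record was written to
    kind: str     # record type, e.g. "COMMIT" or "XLOG_HEAP2_CLEAN"


def replay_order(streams: Iterable[Iterable[Record]]) -> Iterator[Record]:
    """Merge per-stream records into one global replay order.

    Assumes each stream is already sorted by seq.
    """
    return heapq.merge(*streams, key=lambda r: r.seq)


# Illustrative record kinds; only XLOG_HEAP2_CLEAN comes from the thread.
stream0 = [Record(1, 0, "HEAP_DELETE"), Record(4, 0, "XLOG_HEAP2_CLEAN")]
stream1 = [Record(2, 1, "HEAP_INSERT"), Record(3, 1, "COMMIT")]

ordered = [r.kind for r in replay_order([stream0, stream1])]
```

A strict merge like this serializes replay completely, which gives up
the parallel-recovery benefit; a real design would presumably need a
weaker ordering that only constrains records that actually depend on
each other.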

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
