Re: GSoC proposal - "make an unlogged table logged"

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Fabrízio Mello <fabriziomello(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GSoC proposal - "make an unlogged table logged"
Date: 2014-03-04 14:47:08
Message-ID: CA+Tgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0+aLg=x2Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 12:08 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> On Mon, Mar 3, 2014 at 11:28 AM, Fabrízio de Royes Mello
>> <fabriziomello(at)gmail(dot)com> wrote:
>> > Is the TODO item "make an unlogged table logged" [1] a good GSoC project?
>>
>> I'm pretty sure we found some problems in that design that we couldn't
>> figure out how to solve. I don't have a pointer to the relevant
>> -hackers discussion off-hand, but I think there was one.
>
> ISTR the discussion going something along the lines of "we'd have to WAL
> log the entire table to do that, and if we have to do that, what's the
> point?".

No, not really. The issue is more around what happens if we crash
part way through. At crash recovery time, the system catalogs are not
available, because the database isn't consistent yet and, anyway, the
startup process can't be bound to a database, let alone every database
that might contain unlogged tables. So the sentinel that's used to
decide whether to flush the contents of a table or index is the
presence or absence of an _init fork, which the startup process
obviously can see just fine. The _init fork also tells us what to
stick in the relation when we reset it; for a table, we can just reset
to an empty file, but that's not legal for indexes, so the _init fork
contains a pre-initialized empty index that we can just copy over.
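
To make the reset described above concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not the actual startup-process code: the function name reset_unlogged and the flat "<id>" / "<id>_<fork>" file naming are hypothetical simplifications, and segment files are ignored. The point it illustrates is that the decision uses nothing but file names on disk, with no catalog lookup.

```python
import os
import shutil

INIT_SUFFIX = "_init"


def reset_unlogged(dirpath):
    """Toy model of the crash-recovery reset: only file names are consulted."""
    names = os.listdir(dirpath)

    # The sentinel: any relation that has an _init fork is treated as unlogged.
    unlogged = {n[:-len(INIT_SUFFIX)] for n in names if n.endswith(INIT_SUFFIX)}

    # Discard every other fork of those relations.
    for name in names:
        if name.endswith(INIT_SUFFIX):
            continue
        # Assumption: fork files are named "<id>" or "<id>_<fork>".
        rel_id = name.split("_", 1)[0]
        if rel_id in unlogged:
            os.remove(os.path.join(dirpath, name))

    # Copy the _init fork over the main fork: an empty file for a table,
    # a pre-initialized empty index for an index.
    for rel_id in unlogged:
        shutil.copyfile(
            os.path.join(dirpath, rel_id + INIT_SUFFIX),
            os.path.join(dirpath, rel_id),
        )
    return sorted(unlogged)
```

In this model, a relation whose _init file is missing is never reset, so its contents are kept as if it were a logged table.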

Now, to make an unlogged table logged, you've got to at some stage
remove those _init forks. But this is not a transactional operation.
If you remove the _init forks and then the transaction rolls back,
you've left the system in an inconsistent state. If you postpone the
removal until commit time, then you have a problem if it fails,
particularly if it works for the first file but fails for the second.
And if you crash at any point before you've fsync'd the containing
directory, you have no idea which files will still be on disk after a
hard reboot.
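
The failure modes above can be sketched in Python as well. This is a hedged illustration rather than anything from the PostgreSQL source: durable_unlink and remove_init_forks are hypothetical names, and the code assumes a POSIX system where a directory can be opened read-only and passed to fsync. It shows why the removal is not atomic: each file goes away separately, and each removal is only durable once the containing directory has been fsync'd.

```python
import os


def durable_unlink(path):
    """Remove a file, then fsync its directory so the removal survives a crash."""
    dirpath = os.path.dirname(os.path.abspath(path))
    os.unlink(path)
    # Until the directory itself is fsync'd, a hard reboot may bring the
    # directory entry back.
    dirfd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)


def remove_init_forks(paths):
    """Remove several files; return (files removed so far, overall success)."""
    removed = []
    for p in paths:
        try:
            durable_unlink(p)
        except OSError:
            # Partial failure: the earlier removals cannot be rolled back.
            return removed, False
        removed.append(p)
    return removed, True
```

If the second removal fails, the first file is already gone, which is exactly the half-done state that neither a commit nor a rollback describes.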

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
