Re: Logical to physical page mapping

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical to physical page mapping
Date: 2012-10-27 21:12:59
Message-ID: 508C4E5B.6090606@archidevsys.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 28/10/12 07:41, Heikki Linnakangas wrote:
> On 27.10.2012 16:43, Tom Lane wrote:
>> Jan Wieck<JanWieck(at)Yahoo(dot)com> writes:
>>> The reason why we need full_page_writes is that we need to guard
>>> against
>>> torn pages or partial writes. So what if smgr would manage a mapping
>>> between logical page numbers and their physical location in the
>>> relation?
>>
>>> At the moment where we today require a full page write into WAL, we
>>> would mark the buffer as "needs relocation". The smgr would then write
>>> this page into another physical location whenever it is time to
>>> write it
>>> (via the background writer, hopefully). After that page is flushed, it
>>> would update the page location pointer, or whatever we want to call it.
>>> A thus free'd physical page location can be reused, once the location
>>> pointer has been flushed to disk. This is a critical ordering of
>>> writes.
>>> First the page at the new location, second the pointer to the current
>>> location. Doing so would make write(2) appear atomic to us, which is
>>> exactly what we need for crash recovery.
>
> Hmm, aka copy-on-write.
>
>> I think you're just moving the atomic-write problem from the data pages
>> to wherever you keep these pointers.
>
> If the pointers are stored as simple 4-byte integers, you probably
> could assume that they're atomic, and won't be torn.
>
> There's a lot of practical problems in adding another level of
> indirection to every page access, though. It'll surely add some
> overhead to every access, even if the data never changes. And it's not
> at all clear to me that it would perform better than full_page_writes.
> You're writing and flushing out roughly the same amount of data AFAICS.
>
> What exactly is the problem with full_page_writes that we're trying to
> solve?
>
> - Heikki
>
>
Would a 4 byte pointer be adequate for a 64 bit machine with well over
4GB used by Postgres?

Cheers,
Gavin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2012-10-27 22:16:39 Re: proposal - assign result of query to psql variable
Previous Message Tom Lane 2012-10-27 20:57:45 Re: Logical to physical page mapping