From: | Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Logical to physical page mapping |
Date: | 2012-10-27 21:12:59 |
Message-ID: | 508C4E5B.6090606@archidevsys.co.nz |
Lists: | pgsql-hackers |
On 28/10/12 07:41, Heikki Linnakangas wrote:
> On 27.10.2012 16:43, Tom Lane wrote:
>> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>>> The reason why we need full_page_writes is that we need to guard against
>>> torn pages or partial writes. So what if smgr would manage a mapping
>>> between logical page numbers and their physical location in the
>>> relation?
>>
>>> At the moment where we today require a full page write into WAL, we
>>> would mark the buffer as "needs relocation". The smgr would then write
>>> this page into another physical location whenever it is time to write it
>>> (via the background writer, hopefully). After that page is flushed, it
>>> would update the page location pointer, or whatever we want to call it.
>>> A thus free'd physical page location can be reused, once the location
>>> pointer has been flushed to disk. This is a critical ordering of writes.
>>> First the page at the new location, second the pointer to the current
>>> location. Doing so would make write(2) appear atomic to us, which is
>>> exactly what we need for crash recovery.
>
> Hmm, aka copy-on-write.
>
>> I think you're just moving the atomic-write problem from the data pages
>> to wherever you keep these pointers.
>
> If the pointers are stored as simple 4-byte integers, you probably
> could assume that they're atomic, and won't be torn.
>
> There's a lot of practical problems in adding another level of
> indirection to every page access, though. It'll surely add some
> overhead to every access, even if the data never changes. And it's not
> at all clear to me that it would perform better than full_page_writes.
> You're writing and flushing out roughly the same amount of data AFAICS.
>
> What exactly is the problem with full_page_writes that we're trying to
> solve?
>
> - Heikki
>
>
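For reference, the write ordering Jan describes above can be modelled roughly as follows. This is only a toy sketch in Python, not PostgreSQL code: the class `RelocatingStore`, its method names, and the in-memory lists standing in for disk are all hypothetical.

```python
class RelocatingStore:
    """Toy model of a logical-to-physical page map (all names hypothetical)."""

    def __init__(self, n_physical):
        self.physical = [None] * n_physical   # physical page slots ("disk")
        self.page_map = {}                    # logical page number -> physical slot
        self.free = list(range(n_physical))   # physical slots available for reuse

    def read(self, logical):
        return self.physical[self.page_map[logical]]

    def write_relocated(self, logical, data, crash_before_pointer=False):
        new_slot = self.free.pop(0)
        self.physical[new_slot] = data        # step 1: write and flush the page at its new location
        if crash_before_pointer:
            # Simulated crash: the pointer still names the old, intact copy.
            # (This toy model leaks new_slot; real recovery would reclaim it.)
            return
        old_slot = self.page_map.get(logical)
        self.page_map[logical] = new_slot     # step 2: update and flush the location pointer
        if old_slot is not None:
            self.free.append(old_slot)        # step 3: old slot is reusable only after step 2


store = RelocatingStore(4)
store.write_relocated(0, "v1")
store.write_relocated(0, "v2", crash_before_pointer=True)
print(store.read(0))  # still the old copy, because the pointer was never switched
```

The point of the ordering is that a reader following the pointer always lands on a fully written page, whichever step a crash interrupts.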
Would a 4-byte pointer be adequate on a 64-bit machine where Postgres is
using well over 4 GB?
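
To make the question concrete: the answer seems to depend on whether those 4 bytes hold a byte address or a page number. A rough sketch of the arithmetic, assuming the pointer is a page number within one relation and the default 8 kB page size (both are my assumptions; the proposal does not say):

```python
BLCKSZ = 8192  # assumed: PostgreSQL's default page size in bytes


def addressable_bytes(pointer_bytes, page_size=BLCKSZ):
    """Bytes reachable when the pointer holds a page number, not a byte address."""
    return (2 ** (8 * pointer_bytes)) * page_size


as_byte_address = 2 ** 32               # 4 bytes used as a raw byte address
as_page_number = addressable_bytes(4)   # 4 bytes used as a page number

print(as_byte_address // 2**30, "GiB if the 4 bytes are a byte address")
print(as_page_number // 2**40, "TiB if the 4 bytes are a page number")
```

Under those assumptions the limit would apply per relation rather than to the machine's memory, so the 4 GB concern would only bite if the pointer were a byte offset.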
Cheers,
Gavin