Re: Dead Space Map version 2

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dead Space Map version 2
Date: 2007-02-27 05:11:44
Message-ID: 20070227051144.GK29041@nasby.net
Lists: pgsql-hackers pgsql-patches pgsql-performance

On Tue, Feb 27, 2007 at 12:05:57PM +0900, ITAGAKI Takahiro wrote:
> Each heap page has one of 4 states in the dead space map: HIGH, LOW, UNFROZEN
> and FROZEN. VACUUM uses the states to reduce the number of target pages.
>
> - HIGH : High priority to vacuum. Maybe many dead tuples in the page.
> - LOW : Low priority to vacuum. Maybe few dead tuples in the page.
> - UNFROZEN : No dead tuples, but some unfrozen tuples in the page.
> - FROZEN : No dead nor unfrozen tuples in the page.
>
> If we do UPDATE a tuple, the original page containing the tuple is marked
> as HIGH and the new page where the updated tuple is placed is marked as LOW.

Don't you mean UNFROZEN?

> When we commit the transaction, the updated tuples need only FREEZE.
> That's why the after-page is marked as LOW. However, if we roll back, the
> after-page should be vacuumed, so we should mark the page LOW, not UNFROZEN.
> We don't know at the UPDATE whether the transaction will commit or roll back.

What makes it more important to mark the original page as HIGH instead
of LOW, like the page with the new tuple? The description of the states
indicates that there would likely be a lot more dead tuples in a HIGH
page than in a LOW page.
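
To make the marking rule under discussion concrete, here is a minimal
Python sketch of the four states and of what an UPDATE does to the DSM as
the quoted text describes it. The numeric ordering of the states, the
`raise_state` helper, and the rule that a mark never lowers a page's
existing state are assumptions for illustration, not taken from the patch.

```python
from enum import IntEnum


class PageState(IntEnum):
    # Ordering is an assumption: a higher value means more vacuum work.
    FROZEN = 0    # no dead and no unfrozen tuples
    UNFROZEN = 1  # no dead tuples, some unfrozen tuples
    LOW = 2       # maybe a few dead tuples
    HIGH = 3      # maybe many dead tuples


def raise_state(dsm, page, state):
    """Hypothetical helper: a mark only ever raises a page's state."""
    dsm[page] = max(dsm.get(page, PageState.FROZEN), state)


def mark_update(dsm, old_page, new_page):
    """Marking on UPDATE as described in the quoted text."""
    raise_state(dsm, old_page, PageState.HIGH)  # old tuple version dies on commit
    raise_state(dsm, new_page, PageState.LOW)   # new version is dead only on rollback


def vacuum_targets(dsm):
    """Pages VACUUM would have to visit to remove dead tuples."""
    return sorted(p for p, s in dsm.items() if s >= PageState.LOW)
```

Under this rule both pages end up as vacuum targets after every UPDATE,
which is why the difference between HIGH and LOW matters mostly for
ordering the work.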

Perhaps it would be better to have the bgwriter take a look at how many
dead tuples (or how much space the dead tuples account for) when it
writes a page out and adjust the DSM at that time.
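
As a rough sketch of that idea, the state could be derived from the page
contents at write-out time instead of being guessed at UPDATE time. The
function name, the byte-based measure, and the 20% cutoff between LOW and
HIGH are all hypothetical; nothing in this thread specifies them.

```python
PAGE_SIZE = 8192      # PostgreSQL's default block size in bytes
HIGH_FRACTION = 0.20  # hypothetical cutoff between LOW and HIGH


def classify_page(dead_bytes, has_unfrozen,
                  page_size=PAGE_SIZE, high_fraction=HIGH_FRACTION):
    """Pick a DSM state from what is actually on the page at write-out."""
    if dead_bytes > 0:
        # Some dead space: priority depends on how much of the page it is.
        if dead_bytes / page_size >= high_fraction:
            return "HIGH"
        return "LOW"
    # No dead space: only freezing work, if any, remains.
    return "UNFROZEN" if has_unfrozen else "FROZEN"
```

For example, a page with 100 dead bytes would be classified LOW and one
with 4096 dead bytes HIGH under the assumed cutoff.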

> * Aggressive freezing
> We will freeze tuples in dirty pages using OldestXmin but FreezeLimit.
> This is for making FROZEN pages but not UNFROZEN pages as far as possible
> in order to reduce works in XID wraparound vacuums.

Do you mean using OldestXmin instead of FreezeLimit?

Perhaps it might be better to save that optimization for later...

> In the current implementation, DSM allocates a fixed chunk of memory at
> startup and we cannot resize it while running. That may be enough because DSM
> consumes very little memory -- 32MB of memory per 1TB of database.
>
> There are 3 parameters for FSM and DSM.
>
> - max_fsm_pages = 204800
> - max_fsm_relations = 1000 (= max_dsm_relations)
> - max_dsm_pages = 4096000
>
> I'm thinking of changing them into 2 new parameters. We will allocate memory
> for DSM that can hold all of estimated_database_size, and for FSM 50% or
> something of the size. Is this reasonable?

I don't think so, at least not until we get data from the field about
what's typical. If the DSM is tracking every page in the cluster then
I'd expect the FSM to be closer to 10% or 20% of that, anyway.
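
For reference, the sizing arithmetic works out as follows. This sketch
assumes 8KB pages and 2 bits per page in the DSM (enough to encode the four
states, and consistent with the quoted 32MB per 1TB); both function names
are hypothetical.

```python
BLOCK_SIZE = 8192   # default PostgreSQL page size in bytes
BITS_PER_PAGE = 2   # assumption: 4 states fit in 2 bits


def dsm_bytes(database_bytes, block_size=BLOCK_SIZE, bits=BITS_PER_PAGE):
    """Memory needed for a DSM that covers every page of the database."""
    pages = -(-database_bytes // block_size)  # ceiling division
    return -(-pages * bits // 8)


def fsm_pages(database_bytes, percent, block_size=BLOCK_SIZE):
    """Number of FSM page slots if the FSM tracks `percent` of all pages."""
    pages = -(-database_bytes // block_size)
    return pages * percent // 100


TIB = 1024 ** 4
MIB = 1024 ** 2
print(dsm_bytes(TIB) // MIB)  # prints 32
print(fsm_pages(TIB, 10))     # prints 13421772
```

For comparison, the defaults quoted above already put max_fsm_pages
(204800) at 5% of max_dsm_pages (4096000).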

> I already have a recovery extension. However, it can recover DSM
> but not FSM. Do we also need to restore FSM? If we don't, unreusable
> pages might be left in heaps. Of course they could be reused if another
> tuple in the page is updated, but VACUUM will not find those pages.

Yes, DSM would make FSM recovery more important, but I thought it was
recoverable now? Or is that only on a clean shutdown?

I suspect we don't need perfect recoverability... theoretically we could
just commit the FSM after vacuum frees pages and leave it at that; if we
revert to that after a crash, backends will grab pages from the FSM only
to find there's no more free space, at which point they could pull the
page from the FSM and find another one. This would lead to degraded
performance for a while after a crash, but that might be a good
trade-off.
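
A minimal sketch of that fallback, assuming the FSM can be modeled as a
map from page number to advertised free bytes and that a backend can check
the real free space once it has the page. The function and both dict
arguments are hypothetical stand-ins, not existing FSM interfaces.

```python
def find_page_with_space(fsm, actual_free, needed):
    """Return a page with at least `needed` free bytes, pruning stale entries.

    fsm:         page -> free bytes as advertised (possibly stale after a crash)
    actual_free: page -> free bytes really available on the page
    """
    for page in list(fsm):  # copy the keys because we delete while scanning
        if fsm[page] < needed:
            continue  # the FSM itself says this page is too full
        if actual_free.get(page, 0) >= needed:
            return page  # the advertised space is really there
        # Stale entry: pull the page from the FSM and try another one.
        del fsm[page]
    return None  # nothing usable; the caller would extend the relation
```

Each stale entry costs one wasted page visit and is then gone, so the
extra work after a crash is bounded by the number of stale entries.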
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
