Re: Dead Space Map version 2

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Dead Space Map version 2
Date: 2007-02-27 08:11:37
Message-ID: 1172563897.3760.540.camel@silverbirch.site
Lists: pgsql-hackers pgsql-patches pgsql-performance

On Tue, 2007-02-27 at 12:05 +0900, ITAGAKI Takahiro wrote:

> If we combine this with the HOT patch, pages with HOT tuples are probably
> marked as UNFROZEN because we don't bother vacuuming HOT tuples. They can
> be removed incrementally and don't require explicit vacuums.

Perhaps avoid DSM entries for HOT updates completely?

> VACUUM commands
> ---------------
>
> VACUUM now only scans the pages that possibly have dead tuples.
> VACUUM ALL, a new syntax, behaves the same as VACUUM did before.
>
> - VACUUM FULL : Not changed. Scans all pages and compacts them.
> - VACUUM ALL : Scans all pages; same behavior as the previous VACUUM.
> - VACUUM : Usually scans only HIGH pages, but also scans LOW and UNFROZEN
> pages when vacuuming to prevent XID wraparound.

Sounds good.
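
To make the three variants concrete, here is a rough sketch of the page selection described above. It is only an illustration in Python, not code from the patch: the function name, the `wraparound` flag, and the `OTHER` state (a placeholder for any state the quoted text does not name) are assumptions.

```python
from enum import Enum


class PageState(Enum):
    HIGH = "high"
    LOW = "low"
    UNFROZEN = "unfrozen"
    OTHER = "other"  # hypothetical placeholder for states not named in the thread


def pages_to_scan(states, command="VACUUM", wraparound=False):
    """Return the indexes of the pages the given command would visit."""
    if command in ("VACUUM FULL", "VACUUM ALL"):
        # Both variants scan every page, as the old VACUUM did.
        return list(range(len(states)))
    if command != "VACUUM":
        raise ValueError(f"unknown command: {command}")
    # Plain VACUUM: HIGH pages only, unless we are preventing XID wraparound.
    wanted = {PageState.HIGH}
    if wraparound:
        wanted |= {PageState.LOW, PageState.UNFROZEN}
    return [i for i, state in enumerate(states) if state in wanted]
```

With this sketch, a plain VACUUM over a table whose pages are mostly not HIGH touches only a small subset, which is the point of the proposal.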

> Performance issues
> ------------------
>
> * Enable/Disable DSM tracking per tables
> DSM tracking adds some overhead. If we know of specific tables
> where DSM does not work well, e.g. heavily updated small tables, we can
> disable DSM for them. The syntax is:
> ALTER TABLE name SET (dsm=true/false);

How about a dsm_tracking_limit GUC? (Better name please)
It would set the number of pages a table must have before we start
tracking DSM entries for it. DSM only gives worthwhile benefits for larger
tables anyway, so let the user define what large means for them.
dsm_tracking_limit = 1000 by default.
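
A minimal sketch of how such a limit might be checked, written in Python for brevity. The function name is hypothetical, and whether the comparison should be ">=" or ">" is an assumption; only the GUC name and the default of 1000 come from the suggestion above.

```python
DSM_TRACKING_LIMIT = 1000  # pages; the default suggested above


def should_track_dsm(rel_pages, limit=DSM_TRACKING_LIMIT):
    """Track DSM entries only once the table has reached `limit` pages.

    Assumption: a table with exactly `limit` pages is already tracked.
    """
    return rel_pages >= limit
```

Small, heavily updated tables would then stay below the limit and never pay the tracking cost, without needing a per-table setting.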

> * Dead Space State Cache
> The DSM management module is guarded by one LWLock, DeadSpaceLock.
> Almost all accesses to DSM require only a shared lock, but the frequency
> of shared locking was very high (on par with BufMappingLock) in my research.
> To avoid the lock contention, I added a cache of dead space state in
> BufferDesc flags. Backends check the flags first, and avoid locking if
> there is no need to.
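
The flag-first check can be pictured roughly as follows. This is a toy Python model, not the patch: the class and attribute names are made up, and `threading.Lock` is an exclusive lock standing in for the shared/exclusive DeadSpaceLock.

```python
import threading


class DeadSpaceMap:
    """Toy stand-in for the shared DSM, guarded by a single lock."""

    def __init__(self):
        self.lock = threading.Lock()  # stands in for DeadSpaceLock
        self.lock_acquisitions = 0    # counted only to show the saving
        self.pages_with_dead_space = set()

    def mark(self, page):
        with self.lock:
            self.lock_acquisitions += 1
            self.pages_with_dead_space.add(page)


class Buffer:
    """Toy stand-in for a buffer descriptor with a cached DSM flag."""

    def __init__(self, page):
        self.page = page
        self.dsm_cached = False  # hypothetical flag bit


def record_dead_tuple(dsm, buf):
    # Check the cached flag first; skip the lock if the page is already recorded.
    if buf.dsm_cached:
        return
    dsm.mark(buf.page)
    buf.dsm_cached = True
```

Repeated updates to the same buffered page take the lock once rather than once per dead tuple, which is where the contention relief would come from.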

ISTM there should be a point at which DSM is so full we don't bother to
keep track any longer, so we can drop that information. For example, if
a user runs UPDATE without a WHERE clause, there's no point in tracking
the whole relation.
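
Something along these lines, sketched in Python. The class, the method names, and the 0.5 cut-off are all hypothetical; the only idea taken from the paragraph above is that tracking stops once most of the relation is dirty.

```python
class RelationDsm:
    """Toy per-relation tracker that gives up once it is mostly full."""

    def __init__(self, total_pages, give_up_fraction=0.5):
        self.total_pages = total_pages
        self.give_up_fraction = give_up_fraction  # hypothetical threshold
        self.dirty_pages = set()
        self.tracking = True

    def mark_dead(self, page):
        if not 0 <= page < self.total_pages:
            raise IndexError(page)
        if not self.tracking:
            return  # already gave up; nothing to record
        self.dirty_pages.add(page)
        if len(self.dirty_pages) > self.give_up_fraction * self.total_pages:
            # Too full to be useful: drop the details and stop tracking.
            self.dirty_pages.clear()
            self.tracking = False

    def pages_for_vacuum(self):
        # Without tracking information, fall back to scanning every page.
        if self.tracking:
            return sorted(self.dirty_pages)
        return list(range(self.total_pages))
```

After an UPDATE without a WHERE clause the tracker would tip over the threshold early and the next vacuum would simply scan the whole relation.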

> Memory management
> -----------------
>
> In the current implementation, DSM allocates a fixed chunk of memory at
> startup and we cannot resize it while running. That is probably enough
> because DSM consumes very little memory -- 32MB of memory per 1TB of database.

That sounds fine.
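
As a sanity check on the 32MB per 1TB figure, a quick calculation in Python. The 8KB block size is an assumption (the usual default); the function name is just for illustration.

```python
BLOCK_SIZE = 8192  # bytes; assumed default block size
MB = 1024 ** 2
TB = 1024 ** 4


def dsm_bits_per_page(dsm_bytes, db_bytes, block_size=BLOCK_SIZE):
    """How many bits of DSM are available per heap page."""
    pages = db_bytes // block_size
    return dsm_bytes * 8 / pages


print(dsm_bits_per_page(32 * MB, 1 * TB))  # prints 2.0
```

Under that assumption the quoted figure works out to 2 bits per page, which is enough to encode up to four page states.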

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com
