Re: Reducing tuple overhead

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Bruce Momjian <bruce(at)momjian(dot)us>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Reducing tuple overhead
Date: 2015-05-01 22:55:48
Message-ID: 55440474.4080507@BlueTreble.com
Lists: pgsql-hackers

On 4/30/15 7:37 AM, Robert Haas wrote:
> On Thu, Apr 30, 2015 at 8:05 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> A much better idea is to work out how to avoid index bloat at cause. If we
>> are running an UPDATE and we cannot get a cleanup lock, we give up and do a
>> non-HOT update, causing the index to bloat. It seems better to wait for a
>> short period to see if we can get the cleanup lock. The short period is
>> currently 0, so lets start there and vary the duration of wait upwards
>> proportionally as the index gets more bloated.

That only happens if there wasn't already enough space on the page, so
that we need to defragment it, yes? If there is enough space, we can do a
HOT update without the cleanup lock.

What would be useful to know is how often we abort a HOT update because
of lack of free space; that would indicate to a DBA that a lower fill
factor may be in order. What would be useful to -hackers would be stats
on how often an update would have been HOT if only the page had been pruned.
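
To make the two counters concrete, here is a rough Python sketch of how
update attempts could be bucketed. It is only a toy model: the function
name, the bucket names and the "reclaimable" input (bytes that pruning
would free) are all made up for illustration and do not exist in
PostgreSQL today.

```python
from collections import Counter


def classify_update(indexed_cols_changed, tuple_size, free_space, reclaimable):
    """Bucket one update attempt (hypothetical categories, toy model).

    tuple_size:  bytes needed for the new tuple version
    free_space:  bytes free on the page right now
    reclaimable: bytes that pruning the page would free (assumed known)
    """
    if indexed_cols_changed:
        # HOT is impossible regardless of space.
        return "non_hot_index_change"
    if tuple_size <= free_space:
        return "hot"
    if tuple_size <= free_space + reclaimable:
        # The stat -hackers would care about.
        return "non_hot_would_fit_after_prune"
    # The stat that hints at lowering fillfactor.
    return "non_hot_no_space"


# Tally a few made-up attempts.
stats = Counter(
    classify_update(*attempt)
    for attempt in [
        (False, 100, 400, 0),
        (False, 100, 50, 200),
        (False, 100, 50, 10),
        (True, 100, 400, 0),
    ]
)
```

A high "non_hot_no_space" count would be the DBA-facing signal; a high
"non_hot_would_fit_after_prune" count would be the one telling us how
much the failed cleanup lock is really costing.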

> What I'd be worried about there is that it would be very hard to tune
> the wait time, and that the operating system scheduling granularity
> (10ms?) would be way too long.

[1] indicates a time slice of between 0.75 and 6ms by default on Linux.
I think FreeBSD still uses a 1000Hz scheduler (1ms), but it's not as clear.

What might be more promising is ways to avoid holding a pin for a long
time (like the outer side of a nested loop), or being more aggressive
about attempting the lock (i.e., lower the threshold to trigger cleaning).

There's also a (in hindsight) questionable bit of logic in
heap_page_prune_opt(); once we get the cleanup lock we check the page
free space a second time. If we managed to actually get the lock, we
should probably just clean it anyway.
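
For anyone who doesn't have the code in front of them, here is a toy
Python model of that control flow as I read it. This is not the actual
source: Page, prune_opt, try_cleanup_lock and the recheck flag are
hypothetical stand-ins, and the real function does more (it bails out
during recovery, for one). The recheck=False path models the "just clean
it anyway" idea.

```python
from dataclasses import dataclass

BLCKSZ = 8192  # default PostgreSQL block size


@dataclass
class Page:
    """Toy stand-in for a heap page; not a PostgreSQL structure."""
    free_space: int         # bytes believed to be free
    is_full: bool = False   # models the "page is full" hint
    prunable: bool = True   # models the "page is prunable" test
    pruned: bool = False


def prune_opt(page, minfree, try_cleanup_lock, recheck=True):
    """Return True if the page was pruned.

    try_cleanup_lock: callable returning True when the conditional
        cleanup lock was obtained (stand-in for the real lock call).
    recheck: True models the second free-space test done under the
        lock; False models pruning whenever the lock was obtained.
    """
    if not page.prunable:
        return False
    minfree = max(minfree, BLCKSZ // 10)
    # First test, made without the lock, so it is only a hint.
    if not (page.is_full or page.free_space < minfree):
        return False
    if not try_cleanup_lock():
        return False  # give up immediately, no waiting
    # Second test, made with the lock held.
    if recheck and not (page.is_full or page.free_space < minfree):
        return False
    page.pruned = True
    return True
```

The two variants only differ when the free-space figure seen under the
lock is above the threshold while the unlocked hint was below it; in
that case the current logic gives the lock back without doing any work.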

> But I'm in vigorous agreement with you on one point: the solution to
> index bloat (and probably heap bloat, too) is not to clean it up
> faster but to create less of it in the first place. Making more
> updates HOT is one way to do that.

+1.

[1] http://stackoverflow.com/questions/16401294/how-to-know-linux-scheduler-time-slice
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
