Re: Simple improvements to freespace allocation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Simple improvements to freespace allocation
Date: 2014-01-08 16:13:20
Message-ID: 6934.1389197600@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> On 01/08/2014 08:56 AM, Simon Riggs wrote:
>> * IN-MEMORY
>> A large table may only have some of its blocks in memory. It would be
>> useful to force a NBT to be a block already in shared_buffers IFF a
>> table is above a certain size (use same threshold as seq scans, i.e.
>> 25% of shared_buffers). That may be difficult to achieve in practice,
>> so not sure about this one. Like it? Any ideas?

> Yeah, that seems nice, although I have feeling that it's not worth the
> complexity.

Not only would that be rather expensive to do, but I think it would be
self-defeating. Pages that are in memory would be particularly likely
to have been modified by someone else recently, so that the FSM's info
about their available space is stale, and thus once you actually got
to the page it'd be more likely to not have the space you need.

The existing FSM algorithm is intentionally designed to hand out pages
that nobody else has tried to insert into lately, with one goal being to
minimize the number of retries needed because of stale info. (Or at
least, it worked that way originally, and I don't think Heikki's rewrite
changed that aspect.) I'm concerned that the alternatives Simon proposes
would lead to more processes ganging up on the same pages, with not only a
direct cost in concurrency but an indirect cost in repeated FSM searches
due to believing stale available-space data. Indeed, a naive
implementation could easily get into infinite loops of handing back the
same page.

I realize that the point is exactly to sacrifice some insertion
performance in hopes of getting better table packing, but it's not clear
to me that there's an easy fix that makes packing better without a very
large hit on the other side.

Anyway, these fears could be proven or disproven with some benchmarking of
a trial patch, and I have no objection if Simon wants to do that
experimentation. But I'd be hesitant to see such a feature committed
in advance of experimental proof that it's actually useful.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2014-01-08 17:36:22 Re: Specifying both recovery_target_xid and recovery_target_time
Previous Message Andres Freund 2014-01-08 16:04:36 Re: Changeset Extraction Interfaces