Re: FSM versus GIN pending list bloat

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FSM versus GIN pending list bloat
Date: 2015-08-04 19:38:13
Message-ID: CAMkU=1yG+E1uN6=ouXX4OQjQQaCdfbFUz9Ypyp0yyYvzRgg3Gw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 4, 2015 at 1:39 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> On 4 August 2015 at 06:03, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
>
>> The attached proof of concept patch greatly improves the bloat for both
>> the insert and the update cases. You need to turn on both features: adding
>> the pages to fsm, and vacuuming the fsm, to get the benefit (so JJ_GIN=3).
>> The first of those two things could probably be adopted for real, but the
>> second probably is not acceptable. What is the right way to do this?
>> Could a variant of RecordFreeIndexPage bubble the free space up the map
>> immediately rather than waiting for a vacuum? It would only have to move
>> up until it found a page with freespace already recorded in it, which the
>> vast majority of the time would mean observing up one level and then not
>> writing to it, assuming the pending list pages remain well clustered.
>>
>
> You make a good case for action here since insert only tables with GIN
> indexes on text are a common use case for GIN.
>
> Why would vacuuming the FSM be unacceptable? With a
> large gin_pending_list_limit it makes sense.
>

But with a smallish gin_pending_list_limit (like the default 4MB), the FSM
vacuum could be called a lot (multiple times a second during some spurts),
and it would read the entire FSM each time.

>
> If it is unacceptable, perhaps we can avoid calling it every time, or
> simply have FreeSpaceMapVacuum() terminate more quickly on some kind of
> 80/20 heuristic for this case.
>

Or maybe it could be passed the range of blocks that need vacuuming, so
that it concentrates on just that range.
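
To make the range idea concrete, here is a rough toy model, not
PostgreSQL code. The class ToyFSM, its two-level layout, and the
vacuum(start, end) signature are all made up for illustration; the real
FSM is a tree of pages and FreeSpaceMapVacuum() takes no such range here.
It only shows that refreshing the upper level for a small block range
touches far fewer leaf pages than a full pass.

```python
class ToyFSM:
    """Hypothetical two-level free space map: leaf slots plus one summary per leaf page."""

    def __init__(self, n_blocks, slots_per_page=4):
        self.slots_per_page = slots_per_page
        self.leaves = [0] * n_blocks                 # free space category per block
        n_pages = -(-n_blocks // slots_per_page)     # ceiling division
        self.upper = [0] * n_pages                   # max free space per leaf page
        self.pages_read = 0                          # leaf pages visited by vacuum

    def record_free(self, block, avail):
        # Only the leaf is updated; the upper level stays stale until vacuum.
        self.leaves[block] = avail

    def vacuum(self, start=0, end=None):
        # Refresh upper-level summaries for leaf pages covering blocks [start, end).
        if end is None:
            end = len(self.leaves)
        if end <= start:
            return
        spp = self.slots_per_page
        for page in range(start // spp, (end - 1) // spp + 1):
            lo = page * spp
            self.upper[page] = max(self.leaves[lo:lo + spp])
            self.pages_read += 1

    def search(self, needed):
        # Trust the upper level first, as a searcher descending from the root would.
        spp = self.slots_per_page
        for page, best in enumerate(self.upper):
            if best >= needed:
                lo = page * spp
                for block in range(lo, min(lo + spp, len(self.leaves))):
                    if self.leaves[block] >= needed:
                        return block
        return None
```

In this model a freed block is invisible to search() until the covering
leaf page is vacuumed, and vacuum(8, 12) visits one leaf page where a
full vacuum() visits all of them.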

But from the README file, it sounds like it is already supposed to be
bubbling up. I'll have to see just what's going on there when I get a
chance.
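
For reference, the kind of bubbling up I have in mind looks roughly like
the sketch below. This is a toy binary max-tree in a flat array, not the
actual FSM page layout, and ToyFSMTree and record_and_bubble are
hypothetical names, not anything in the source tree. The point is only
the early stop: the walk toward the root ends as soon as a parent already
records the right value.

```python
class ToyFSMTree:
    """Hypothetical binary max-tree over leaf slots, stored heap-style in one list."""

    def __init__(self, n_leaves):
        if n_leaves < 1 or n_leaves & (n_leaves - 1):
            raise ValueError("n_leaves must be a power of two")
        self.n_leaves = n_leaves
        self.nodes = [0] * (2 * n_leaves - 1)   # internal nodes first, then leaves

    def record_and_bubble(self, block, avail):
        """Set a leaf, then fix ancestors; return how many ancestors were rewritten."""
        node = self.n_leaves - 1 + block
        self.nodes[node] = avail
        writes = 0
        while node > 0:
            parent = (node - 1) // 2
            best = max(self.nodes[2 * parent + 1], self.nodes[2 * parent + 2])
            if self.nodes[parent] == best:
                break                            # parent already correct: stop early
            self.nodes[parent] = best
            writes += 1
            node = parent
        return writes

    def root(self):
        return self.nodes[0]
```

On an empty 8-leaf tree, the first record_and_bubble(4, 1) rewrites all
three ancestors, but a following record_and_bubble(5, 1) for the
neighboring block rewrites none, which is the well-clustered case
described upthread.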

Cheers,

Jeff
