Re: Physical append-only tables

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Physical append-only tables
Date: 2016-11-28 10:37:18
Message-ID: CAD21AoA9mN47k1mqkJP5Au=3oza3VX_oRH85b1x7qepZDrDrAw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 28, 2016 at 9:05 AM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> On 11/24/16 8:18 AM, Bruce Momjian wrote:
>>>
>>> What if we used BRIN to find heap pages where the new row was in the
>>> page BRIN min/max range, and the heap page had free space. Only if that
>>> fails do we put it somewhere else in the heap.
>>>
>>>
>>> That would certainly be useful. You'd have to figure out what to do in
>>> the case of multiple conflicting BRIN indexes (which you shouldn't have
>>> in the first place, but that won't keep people from having them), but
>>> other than that it would be quite good I think.
>>
>> This idea is only possible because the BRIN index is so small and easy
>> to scan, i.e. this wouldn't work for a btree index.
>
>
> ISTM a prerequisite for any of this is a way to override the default FSM
> behavior. A simple strategy that forces append-only would presumably be very
> cheap and easy to do after that. It could also be used to allow better
> clustering. It would also make it far easier to recover from a heavily
> bloated table that's too large to simply VACUUM FULL or CLUSTER, without
> resorting to the contortions that pg_repack/pg_reorg have to.

Since building a BRIN index doesn't take long, I think it would be good
enough in most cases to support online clustering (or clustering with a
minimal lock, as pg_repack/pg_reorg do). Also, I'm not sure there are
many users who build only a BRIN index on a table. I think many users
want to build other indexes on other columns of the same table.
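
For reference, the BRIN-guided placement quoted above can be sketched
as a toy model. This is only an illustration under stated assumptions,
not PostgreSQL code: Page, ToyHeap, pages_per_range and choose_page are
hypothetical names, the "BRIN summary" is recomputed from the rows on
each call instead of being read from an index, and the fallback simply
appends at the end of the heap.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Page:
    """Hypothetical heap page that holds at most `capacity` integer keys."""
    capacity: int
    rows: List[int] = field(default_factory=list)

    def has_free_space(self) -> bool:
        return len(self.rows) < self.capacity


@dataclass
class ToyHeap:
    """Toy heap whose block ranges carry a BRIN-like min/max summary."""
    pages_per_range: int = 2
    page_capacity: int = 4
    pages: List[Page] = field(default_factory=list)

    def range_summary(self, range_no: int) -> Optional[Tuple[int, int]]:
        # Assumption: the summary is derived on the fly; a real BRIN index
        # stores it per block range.
        start = range_no * self.pages_per_range
        block = self.pages[start:start + self.pages_per_range]
        values = [v for page in block for v in page.rows]
        return (min(values), max(values)) if values else None

    def choose_page(self, key: int) -> int:
        n_ranges = -(-len(self.pages) // self.pages_per_range)  # ceiling
        for range_no in range(n_ranges):
            summary = self.range_summary(range_no)
            if summary is None or not (summary[0] <= key <= summary[1]):
                continue
            # The key fits this range's min/max: use a page with free space.
            start = range_no * self.pages_per_range
            end = min(start + self.pages_per_range, len(self.pages))
            for page_no in range(start, end):
                if self.pages[page_no].has_free_space():
                    return page_no
        # Fallback: no matching range had room, so append at the end.
        if self.pages and self.pages[-1].has_free_space():
            return len(self.pages) - 1
        self.pages.append(Page(self.page_capacity))
        return len(self.pages) - 1

    def insert(self, key: int) -> int:
        page_no = self.choose_page(key)
        self.pages[page_no].rows.append(key)
        return page_no
```

With one page per range and two partly filled pages holding {10, 20}
and {100, 200}, inserting 15 goes to the first page and 150 to the
second, while a key outside every range goes to the end of the heap.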

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
