Quick Links

Re: Fast insertion indexes: why no developments

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Peter Geoghegan <pg(at)heroku(dot)com>, Leonardo Francalanci <m_lists(at)yahoo(dot)it>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Fast insertion indexes: why no developments
Date:	2013-11-04 16:09:35
Message-ID:	CA+TgmoaSxhFjGVYJgWBQGVwH=EzEk+VJp5QSK9h-uZg+H+KkRA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sat, Nov 2, 2013 at 6:07 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 29 October 2013 16:10, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci <m_lists(at)yahoo(dot)it> wrote:
>>> I don't see much interest in insert-efficient indexes.
>>
>> Presumably someone will get around to implementing a btree index
>> insertion buffer one day. I think that would be a particularly
>> compelling optimization for us, because we could avoid ever inserting
>> index tuples that are already dead when the deferred insertion
>> actually occurs.
>
> That's pretty much what the LSM-tree is.

What is pretty cool about this sort of thing is that there's no
intrinsic reason the insertion buffer needs to be block-structured or
disk-backed. In theory, you can structure the in-memory portion of
the tree any way you like, using pointers and arbitrary-size memory
allocations and all that fun stuff. You need to log that there's a
deferred insert (or commit to flushing the insertion buffer before
every commit, which would seem to miss the point) so that recovery can
reconstruct the in-memory data structure and flush it, but that's it:
the WAL format need not know any other details of the in-memory
portion of the tree. I think that, plus the ability to use pointers
and so forth, might lead to significant performance gains.

In practice, the topology of our shared memory segment makes this a
bit tricky. The problem isn't so much that it's fixed size as that it
lacks a real allocator, and that all the space used for shared_buffers
is nailed down and can't be borrowed for other purposes. I'm very
interested in solving the problem of getting a real allocator for
shared memory because I think I need it for parallel sort, and even if
there's a way to avoid needing it there I have a strong feeling that
we'll want it for other applications of dynamic shared memory. But it
should be possible to write the allocator in such a way that you just
give it a chunk of memory to manage, and it does, remaining agnostic
about whether the memory is from the main shared memory segment or a
dynamic one.

Of course, it's possible that even we do get a shared memory
allocator, a hypothetical person working on this project might prefer
to make the data block-structured anyway and steal storage from
shared_buffers. So my aspirations in this area may not even be
relevant. But I wanted to mention them, just in case anyone else is
thinking about similar things, so that we can potentially coordinate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Fast insertion indexes: why no developments at 2013-11-02 10:07:44 from Simon Riggs

Responses

Re: Fast insertion indexes: why no developments at 2013-11-04 16:24:14 from Claudio Freire
Re: Fast insertion indexes: why no developments at 2013-11-04 20:01:59 from Simon Riggs
Re: Fast insertion indexes: why no developments at 2013-11-04 20:39:07 from Jeff Janes

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Robert Haas	2013-11-04 16:23:11	Re: dsm use of uint64
Previous Message	Andres Freund	2013-11-04 15:55:30	Re: dsm use of uint64