Re: LSM tree for Postgres

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: LSM tree for Postgres
Date: 2020-08-04 15:18:37
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


* Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
> On Tue, Aug 04, 2020 at 11:22:13AM +0300, Konstantin Knizhnik wrote:
> >two top indexes used in cyclic way and one main index. When top index
> >reaches some threshold value
> >we initiate merge with main index, done by bgworker and switch to another
> >top index.
> >As far as merging indexes is done in background, it doesn't  affect insert
> >speed.
> >Unfortunately Postgres Index AM has not bulk insert operation, so we have
> >to perform normal inserts.
> >But inserted data is already sorted by key which should improve access
> >locality and partly solve random reads problem for base index.
> >
> >Certainly to perform search in Lsm3 we have to make lookups in all three
> >indexes and merge search results.
> >(in case of unique index we can avoid extra searches if searched value is
> >found in top index).
> >It can happen that during merge of top and base indexes the same TID can
> >be found in both of them.
> >But such duplicates can be easily eliminated during merge of search results.
> >
> >As far as we are using standard nbtree indexes there is no need to worry
> >about logging information in WAL.
> >There is no need to use inefficient "generic WAL records" or patch kernel
> >by adding own WAL records.
> Makes sense, I guess. I always imagined we could do something like this
> by adding "buffers" into the btree directly, and instead of pushing them
> all the way down to the leaf pages we'd only insert them into the first
> buffer (and full buffers would get "propagated" in the background).

I get that it's not quite the same, but this all is reminding me of the
GIN pending list and making me wonder if there's some way to generalize
that (or come up with something new that would work for GIN too).

Independently while considering this, I don't think the issues around
how to deal with unique btrees properly has really been considered- you
certainly can't stop your search on the first tuple you find even if the
index is unique, since the "unique" btree could certainly have multiple
entries for a given key and you might need to find a different one.



In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2020-08-04 15:24:39 Re: LSM tree for Postgres
Previous Message Tomas Vondra 2020-08-04 15:17:43 Re: WIP: BRIN multi-range indexes