Re: Review: GIN non-intrusive vacuum of posting tree

From: Andrew Borodin <borodin(at)octonica(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Vladimir Borodin <root(at)simply(dot)name>
Subject: Re: Review: GIN non-intrusive vacuum of posting tree
Date: 2017-01-23 09:55:55
Message-ID: CAJEAwVF+GRpOhRaRNYp4c_4Gf4dsh2g76-6fEX8D7UeCBJvq3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi!

2017-01-23 11:32 GMT+05:00 Jeff Davis <pgsql(at)j-davis(dot)com>:
> I have some reservations about the complexity and novelty of the
> approach. I don't think I've seen an approach quite like this one
> before. It seems like the main reason you are using this approach is
> because the btree structure was too simple (no left links); is that
> right? My gut is telling me we should either leave it simple, or use a
> real concurrent btree implementation.

GIN B-tree is departed from backend\access\nbtree almost like nbtree
is departed from L&Y algorithm. As far as I understand there is no
"high key" concept as in nbtree. I'm not sure that it's possible to
implement same level of cuncurrency as in nbtree without that.
Also GIN B-tree is actually two different B-trees, large portions of
code is shared between Entries tree and Postings tree. I'm not sure
going fully concurrent vacuum worth it, there will be a lot of
changes. And then there is compression...installing high keys into GIN
may be a mess, with pg_upgrade.

Proposed patch, on a contrary, simplifies code. There is more code
deleted than added. It does not affect WAL, changes nothing in page
layout.

> One idea I had that might be simpler is to use a two-stage page
> delete. The first stage would remove the link from the parent and mark
> the page deleted, but leave the right link intact and prevent
> recycling. The second stage would follow the chain of right links
> along each level, removing the right links to deleted pages and
> freeing the page to be recycled.

As far as I recall, this is resemplant to Lenin and Shasha algorithm.
I'll look into it, but I think it relies on concept of "high key" on a
page somehow (high key is the key from a sibling page, not local
rightmost key as GIN B-tree uses).

> Another idea is to partition the posting tree so that the root page
> actually points to K posting trees. Scans would need to merge them,
> but it would allow inserts to progress in one tree while the other is
> being vacuumed. I think this would give good concurrency even for K=2.
> I just had this idea now, so I didn't think it through very well.

This is efficient from a point of view of frequent vacuums.
concurrency of same key inserts can benefit from this approach. But
all aother operations are more complicated and less efficient.
GIN already has Fast Update list to tackle problems in this manner.
But I do not feel I have resources to try to implement it...

Thank you for your considerations. I'll think about concurrency and
"high keys" more, but I'm realy afraid of amount of changes which may
be prerequisites.

Best regards, Andrey Borodin.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2017-01-23 09:56:07 Re: postgres_fdw bug in 9.6
Previous Message Kuntal Ghosh 2017-01-23 09:51:40 Re: Gather Merge