Re: [PATCHES] GIN improvements

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] GIN improvements
Date: 2008-07-23 20:08:21
Message-ID: 8993.1216843701@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> Updated: http://www.sigaev.ru/misc/fast_insert_gin-0.9.gz

Here is the GIN fast-insert patch back again. Changes:

* Sync with CVS HEAD
* Clean up documentation and some of the code comments
* Fix up custom reloptions code
* Suppress some compiler warnings

I didn't get much further than that because I got discouraged after
looking at the locking issues around the pending-insertions list.
It's a mess:

* shiftList() holds an exclusive lock on metapage throughout its run,
which means that it's impossible for two of them to run concurrently.
So why bother with "concurrent deletion" detection?

* shiftList does LockBufferForCleanup, which means that it can be blocked
for an indefinitely long time by a concurrent scan, and since it's holding
exclusive lock on metapage no new scans or insertions can start meanwhile.
This is not only horrid from a performance standpoint but it very probably
can result in deadlocks --- which will be deadlocks on LWLocks and thus
not even detected by the system.

* GIN index scans release lock and pin on one pending-list page before
acquiring pin and lock on the next, which means there's a race condition:
shiftList could visit and delete the next page before we get to it,
because there's a window where we're holding no buffer lock at all.
I think this isn't fatal in itself, since presumably the data in the next
page has been moved into the main index and we can scan it later, but the
scan code isn't checking whether the page has been deleted out from under
it.

* It seems also possible that once a list page has been marked
GIN_DELETED, it could be re-used for some other purpose before a
scan-in-flight reaches it -- reused either as a regular index page or as a
new list page. Neither case is being defended against. It might be that
the new-list-page case isn't a problem, or it might not.

* There is a bigger race condition, which is that after a scan has
returned a tuple from a pending page, vacuum could move the index entry
into the main index structure, and then that same scan could return that
same index entry a second time. This is a no-no, and I don't see any easy
fix.

I haven't really finished reviewing this code, but I'm going to bounce it
back to you to see if you can solve the locking problems. Unless that can
be made safe there is no point doing any more work on this patch.

regards, tom lane

Attachment Content-Type Size
fast_insert_gin-0.10.patch.gz application/octet-stream 17.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Manoel Henrique 2008-07-23 20:16:58 Research/Implementation of Nested Loop Join optimization
Previous Message Kevin Grittner 2008-07-23 19:11:06 Re: [PATCHES] odd output in restore mode

Browse pgsql-patches by date

  From Date Subject
Next Message Alvaro Herrera 2008-07-23 20:44:30 Re: [PATCHES] GIN improvements
Previous Message Kevin Grittner 2008-07-23 19:11:06 Re: [PATCHES] odd output in restore mode