Re: [PATCHES] GIN improvements

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] GIN improvements
Date: 2008-10-31 13:27:22
Message-ID: 490B07BA.6030109@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Reworked version of fast insertion patch for GIN.

>> * shiftList does LockBufferForCleanup, which means that it can be blocked
>> for an indefinitely long time by a concurrent scan, and since it's
>> holding
>> exclusive lock on metapage no new scans or insertions can start
>> meanwhile.
>> This is not only horrid from a performance standpoint but it very
>> probably
>> can result in deadlocks --- which will be deadlocks on LWLocks and thus
>> not even detected by the system.
> Ops, I see possible scenario: UPDATE tbl SET gin_indexed_field = ...
> where gin_indexed_field .... with concurrent shiftList. Will fix. Thank
> you.
Fixed, see below

>> * GIN index scans release lock and pin on one pending-list page before
>> acquiring pin and lock on the next, which means there's a race condition:
>> shiftList could visit and delete the next page before we get to it,
>> because there's a window where we're holding no buffer lock at all.
> Agree, will fix.
Fixed

>> * There is a bigger race condition, which is that after a scan has
>> returned a tuple from a pending page, vacuum could move the index entry
>> into the main index structure, and then that same scan could return that
>> same index entry a second time. This is a no-no, and I don't see any
>> easy
>> fix.
Fixed. TIDBitmap is used for that and for preventing deadlock mentioned above.
TIDBitmap is used for collectiong matched tuples from pending pages and after
that it used as filter for results from regular GIN's scan.

Patch extends TIDBitmap interface by 2 functions:
bool tbm_check_tuple(TIDBitmap *tbm, const ItemPointer tid);
returns true if tid already exists in bitmap
bool tbm_has_lossy(TIDBitmap *tbm);
returns true if bitmap becomes lossy

Also, sequential scan on pending page is replaced to binary search for
performance. It's not a big win but it might improve performance.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
fast_insert_gin-0.15.gz application/x-tar 17.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-10-31 13:38:02 Re: WIP patch: convert SQL-language functions to return tuplestores
Previous Message Heikki Linnakangas 2008-10-31 13:15:35 Re: Synchronous replication patch v1

Browse pgsql-patches by date

  From Date Subject
Next Message Nikhil Sontakke 2008-10-31 14:50:46 Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1
Previous Message Nikhil Sontakke 2008-10-23 12:41:41 Re: Fwd: [PATCHES] Auto Partitioning Patch - WIP version 1