GSoC 2017 Proposal for "Explicitly support predicate locks in index access methods besides btree"

From: anant khandelwal <anantbietec(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: GSoC 2017 Proposal for "Explicitly support predicate locks in index access methods besides btree"
Date: 2017-04-01 14:03:57
Message-ID: CAD=a8SBZWGD4U-TASJP0OwTPiLEffHoz5O5eV0Um6digqBVvfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi guys,

My name is Anant Khandelwal currently i am pursuing masters from IIT -
Delhi and previously i am a software engineer.

I am particularly interested in working on the project "Explicitly support
predicate locks in index access methods besides b tree".I have gone through

http://hdl.handle.net/2123/5353
http://drkp.net/papers/ssi-vldb12.pdf

to understand how serializable implemented through the use of Snapshot
isolation in PostgreSQL also gone through the codebase get some idea of
different access methods for gin, gist, and hash indexes.
I want to discuss my idea.Then i move to make the proposal

Sorry for late i am busy with the academics.

From my findings the dependencies which create the anomalies in Snapshot
isolation are:

*wr-dependencies*: if T1 writes a version of an object, and T2
reads that version, then T1 appears to have executed before T2
* ww-dependencies*: if T1 writes a version of some object, and
T2 replaces that version with the next version, then T1 appears
to have executed before T2
*ww-dependencies*: if T1 writes a version of some object, and
T2 replaces that version with the next version, then T1 appears
to have executed before T2

where T1 and T2 are the transaction

but due to the rw dependency caused by any insert into the index indexes
other than the btree acquires relation level lock which causes
serialization failure.So the main task is to implement page-level predicate
locking in the remaining core index AMs

* Gist Index

B+tree index acquires predicate locks only on leaf pages as index tuples
with record pointers are stored on leaf pages. But for Gist index, we have
to consider locking at each index level as index tuples can be stored in
buffers associated with internal pages or in leaf pages.
So, the functions where we need to insert a call for

1. PredicateLockPage()

->gistScanPage()
after acquiring a shared lock on a buffer

2.CheckForSerializableConflictIn()

-> gistdoinsert()
after acquiring an exclusive lock on a target buffer and before inserting a
tuple

3. PredicateLockPageSplit()

->gistplacetopage()

If there is not enough space for insertion, we need to copy predicate lock
from an old page to all new pages generated after a successful split
operation.

PredicateLockPageSplit(Relation relation, BlockNumber oldblkno, BlockNumber
a lot) is used by b+-tree where two pages are involved in a split
operation. For Gist index, we can define a similar function where more
than one page can be generated after split operation.

* Gin Index

Gin index consists of a B-tree index over key values, where each key is an
element of some indexed items and each tuple in a leaf page contains either
a posting list if the list is small enough or a pointer to posting tree.

1. PredicateLockPage()

->startscanentry()
before calling collectMatchBitmap()

->scanPostingTree()
after acquiring a shared lock on a leaf page

2.CheckForSerializableConflictIn()

-> ginentryinsert()

->gininsertitempointers()
in case of insertion in a posting tree

3. PredicateLockPageSplit()

-> dataBeginPlacetoPageLeaf()

after calling dataPlacetoPageLeafSplit()

* Hash Index

1. PredicateLockPage()

->hashgettuple()
->_hash_first()
->_hash_readnext()
->_hash_readprev()

2.CheckForSerializableConflictIn()

-> _hash_doinsert()

3. PredicateLockPageSplit()

This is just an idea also i understand that Page level predicate locking
exists in the btree AM, where it required the addition of 17 function calls
to implement, but remains unimplemented in the gin, gist, spgist, brin, and
hash index AMs. So we nned to insert function calls at other places also.

Also tell me can we evaluate the performance on PostgreSQL on the
following workloads

- SIBENCH microbenchMark
- TPC-c++

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-04-01 14:19:05 Re: Multiple false-positive warnings from Valgrind
Previous Message Ashutosh Sharma 2017-04-01 12:47:02 Re: [sqlsmith] Failed assertion in _hash_kill_items/MarkBufferDirtyHint