Re: [GSOC][weekly report 3] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions

From: "Mengxing Liu" <liu-mx15(at)mails(dot)tsinghua(dot)edu(dot)cn>
To: "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi>
Cc: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>, kgrittn <kgrittn(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [GSOC][weekly report 3] Eliminate O(N^2) scaling from rw-conflict tracking in serializable transactions
Date: 2017-06-21 08:30:21
Message-ID: 48acaab0.ff0b.15cc9c6158b.Coremail.liu-mx15@mails.tsinghua.edu.cn
Lists: pgsql-hackers


> From: "Heikki Linnakangas" <hlinnaka(at)iki(dot)fi>
>
> Hmm. The hash table ought to speed up the RWConflictExists() function
> right? Where in the flame graph is RWConflictExists()? If it only
> accounts for a small amount of the overall runtime, even drastic speedup
> there won't make much difference to the total.
>

Yes. It is much smaller than we expected. I ran a small benchmark comparing HTAB and a linked list:
when the set size is smaller than 10, the linked list is as fast as HTAB. Here is the result
(time in microseconds for 10 million lookups):

setSize: 5, LIST: 423569, HTAB: 523681
setSize: 10, LIST: 540579, HTAB: 430111
setSize: 15, LIST: 752879, HTAB: 429519
setSize: 20, LIST: 940792, HTAB: 431388
setSize: 25, LIST: 1163760, HTAB: 446043
setSize: 30, LIST: 1352990, HTAB: 429057
setSize: 35, LIST: 1524097, HTAB: 431547
setSize: 40, LIST: 1714976, HTAB: 429023

As we can see, the hash table only pays off when the set is fairly large.
However, even with 100 concurrent connections, the average length of the conflict
list is only about 10, so the linked list is not the bottleneck.
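
For reference, the test was essentially of the following shape. This is a standalone
sketch, not the exact harness; the real test probed dynahash's HTAB rather than the toy
chained hash shown here:

/*
 * Standalone sketch of the benchmark above: look up random members of a
 * set of "setSize" keys, either by scanning a linked list or by probing a
 * chained hash table.  The real test probed dynahash's HTAB; this toy
 * hash only illustrates the shape of the comparison.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NLOOKUPS 10000000
#define NBUCKETS 64

typedef struct Node { int key; struct Node *next; } Node;

static Node *list_head;
static Node *buckets[NBUCKETS];

static int
list_exists(int key)
{
	Node	   *n;

	for (n = list_head; n != NULL; n = n->next)
		if (n->key == key)
			return 1;
	return 0;
}

static int
hash_exists(int key)
{
	Node	   *n;

	for (n = buckets[key % NBUCKETS]; n != NULL; n = n->next)
		if (n->key == key)
			return 1;
	return 0;
}

int
main(int argc, char **argv)
{
	int			set_size = (argc > 1) ? atoi(argv[1]) : 10;
	int			i;
	clock_t		t0, t1, t2;

	if (set_size <= 0)
		set_size = 10;

	/* put keys 0 .. set_size-1 into both structures */
	for (i = 0; i < set_size; i++)
	{
		Node	   *a = malloc(sizeof(Node));
		Node	   *b = malloc(sizeof(Node));

		a->key = i; a->next = list_head; list_head = a;
		b->key = i; b->next = buckets[i % NBUCKETS]; buckets[i % NBUCKETS] = b;
	}

	t0 = clock();
	for (i = 0; i < NLOOKUPS; i++)
		(void) list_exists(rand() % set_size);
	t1 = clock();
	for (i = 0; i < NLOOKUPS; i++)
		(void) hash_exists(rand() % set_size);
	t2 = clock();

	printf("setSize: %d, LIST: %.0f us, HASH: %.0f us\n", set_size,
		   (double) (t1 - t0) * 1e6 / CLOCKS_PER_SEC,
		   (double) (t2 - t1) * 1e6 / CLOCKS_PER_SEC);
	return 0;
}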

>
> Nope, there is no such function, I'm afraid.
>

Oh, that's really bad news.

> > In a previous email, I reported that many backends wait for the lock “SerializableFinishedListLock”;
> > If we don't implement functions like “ReleaseOneSerializableXact” quickly, they will be the bottleneck.
>
> Yeah, that's probably what's causing the decrease in throughput you're
> seeing.
>

New evidence: I ran "SELECT wait_event_type, wait_event FROM pg_stat_activity;" repeatedly and summed the samples by wait event to see what backends are waiting for.

The result of original version:
SerializableXactHashLock 27
predicate_lock_manager 512
WALWriteLock 3
SerializableFinishedListLock 402

The result of new version:
SerializableXactHashLock 38
predicate_lock_manager 473
WALWriteLock 3
SerializableFinishedListLock 1068

Obviously, more backends are blocked by SerializableFinishedListLock.

>
> You might need to also keep the linked lists, and use the hash table to
> only look up particular items in the linked list faster.
>
> I was surprised to see that you're creating a lot of smallish hash
> tables, three for each PredXact entry. I would've expected all the
> PredXact entries to share the same hash tables, i.e. have only three
> hash tables in total. That would be more flexible in allocating
> resources among entries. (It won't help with the quick-release, though)
>

Yes, that would look more elegant and use less memory
(because each hash table object also consumes some memory).
But purely in terms of performance, it would be less efficient than my patch,
because it has to maintain the linked lists in addition to the hash tables; in other
words, it does more work than my patch.

Another point is that removing the linked list may open up more opportunities to reduce
lock contention; otherwise, we need a global lock to protect the linked list.
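
Just to make the comparison concrete, here is a rough sketch (not the actual patch, and
with made-up names) of what the lookup side of that single-shared-table idea could look
like; locking and the release path are ignored:

/*
 * Sketch only: key one shared hash table by the (reader, writer) pair so
 * that RWConflictExists() becomes a single probe.  "RWConflictKey" and
 * "RWConflictHash" are invented names for this mail; the table would be
 * built elsewhere (e.g. with ShmemInitHash), and all locking and the
 * quick-release path are left out.
 */
#include "postgres.h"
#include "storage/predicate_internals.h"
#include "utils/hsearch.h"

typedef struct RWConflictKey
{
	const SERIALIZABLEXACT *reader;
	const SERIALIZABLEXACT *writer;
} RWConflictKey;

static HTAB *RWConflictHash;

static bool
RWConflictExistsHash(const SERIALIZABLEXACT *reader,
					 const SERIALIZABLEXACT *writer)
{
	RWConflictKey key;
	bool		found;

	key.reader = reader;
	key.writer = writer;

	(void) hash_search(RWConflictHash, &key, HASH_FIND, &found);
	return found;
}

In my patch the probe looks much the same, only against the per-transaction tables.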

--
Sincerely

Mengxing Liu
