Re: Avoiding repeated snapshot computation

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Avoiding repeated snapshot computation
Date: 2012-08-17 01:02:10
Message-ID: 20120817010210.GK30286@momjian.us
Lists: pgsql-hackers


Did we ever make a decision on this patch?

---------------------------------------------------------------------------

On Sat, Nov 26, 2011 at 09:22:50PM +0530, Pavan Deolasee wrote:
> In some recent benchmarks and profile data, I saw GetSnapshotData
> figure at the very top or near the top. For smaller numbers of
> clients it can account for 10-20% of time, but with more clients I
> have seen it take up as much as 40% of sample time. Unfortunately,
> the machine on which I was running these tests is currently not
> available, so I don't have the exact numbers, but the broad
> observation holds. Our recent work on separating the hot members of
> PGPROC into a separate array will definitely reduce data cache
> misses and reduce the GetSnapshotData time, but it probably still
> accounts for a large enough critical section on a highly contended
> lock.
>
> Now that we have reduced the run time of the function itself, I
> think we should try to reduce the number of times the function is
> called. Robert proposed a way to reduce the number of calls per
> transaction. I think we can go one step further and reduce the
> number of calls across transactions.
>
> One major problem today could be the way LWLock works. If the lock
> is currently held in SHARED mode by some backend and another backend
> requests it in SHARED mode, it gets it immediately. That's probably
> the right thing to do, because you don't want a reader to wait when
> the lock is readily available. But in the case of GetSnapshotData(),
> every reader is doing exactly the same thing: each computes a
> snapshot from the same shared state and would compute exactly the
> same snapshot (ignoring the fact that we don't include the caller's
> XID in the xip array, but that's a minor detail). Because of the way
> LWLock works, more and more readers keep getting in to compute the
> snapshot, and the exclusive waiters only get a window to sneak in
> once the stream of incoming readers pauses. To depict it, four
> transactions below make overlapping calls to GetSnapshotData(), so
> the total critical section starts when the first caller enters it
> and ends when the last caller exits:
>
> Txn1 --[ SHARED ]----------------------------------------------
> Txn2 ----[ SHARED ]--------------------------------------------
> Txn3 -------------[ SHARED ]-----------------------------------
> Txn4 ----------------------------------------------[ SHARED ]--
>      |<--------------------- Total Time --------------------->|
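>
> For orientation, the shape of the code today is roughly as follows.
> This is a greatly simplified sketch, not the actual source; the
> point is only that every caller repeats the same scan under the
> shared lock:
>
>     /* Simplified sketch of the current GetSnapshotData() shape. */
>     LWLockAcquire(ProcArrayLock, LW_SHARED);
>     for (index = 0; index < arrayP->numProcs; index++)
>     {
>         /* read this backend's xid/xmin and add it to snapshot->xip */
>     }
>     LWLockRelease(ProcArrayLock);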
>
> A couple of ideas come to mind to solve this issue.
>
> A snapshot, once computed, remains valid for every caller,
> irrespective of its origin, until at least one transaction ends. So
> we can store the last computed snapshot in some shared area and
> reuse it for all subsequent GetSnapshotData calls. The shared
> snapshot gets invalidated when some transaction ends by calling
> ProcArrayEndTransaction(). I tried this approach and saw a 15%
> improvement for 32-80 clients on the 32-core HP IA box with pgbench
> -s 100 -N tests. Not bad, but I think this can be improved further.
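>
> In pseudo-C, the idea looks roughly like this. Every name below is
> made up for illustration (this is not the patch), and safely
> publishing the cache needs more care than shown, since several
> readers could race to fill it:
>
>     LWLockAcquire(ProcArrayLock, LW_SHARED);
>     if (sharedSnapshotValid)
>         CopySnapshot(snapshot, &sharedSnapshot); /* reuse cached result */
>     else
>     {
>         ScanProcArrayIntoSnapshot(snapshot);     /* the usual scan */
>         SaveSharedSnapshot(snapshot);            /* racy as shown; needs
>                                                   * its own interlock */
>     }
>     LWLockRelease(ProcArrayLock);
>
>     /* ...while ProcArrayEndTransaction(), under exclusive lock, does: */
>     sharedSnapshotValid = false;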
>
> What we can do instead is this: when a backend comes to compute its
> snapshot, it checks whether some other backend is already computing
> one. If so, it just sleeps on the lock. When the other process
> finishes computing the snapshot, it saves the snapshot in a shared
> area and wakes up all processes waiting for it. Those processes then
> simply copy the snapshot from the shared area and are done. This
> would not only reduce total CPU consumption by avoiding repetitive
> work, but would also reduce the total time for which ProcArrayLock
> is held in SHARED mode by avoiding the pipeline of GetSnapshotData
> calls shown above. I am currently trying the shared work queue
> mechanism to implement this, but I am sure we can do it in some
> other way too.
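>
> Conceptually it would look something like the following; again,
> every name here is illustrative, and the synchronization details
> are waved away:
>
>     if (SnapshotComputationInProgress())
>     {
>         WaitForSharedSnapshot();            /* sleep until leader is done */
>         CopySnapshot(snapshot, &sharedSnapshot);
>     }
>     else
>     {
>         BeginSnapshotComputation();         /* become the leader */
>         ScanProcArrayIntoSnapshot(snapshot);
>         SaveSharedSnapshot(snapshot);
>         WakeSnapshotWaiters();              /* followers copy and go */
>     }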
>
> Thanks,
> Pavan
>
> --
> Pavan Deolasee
> EnterpriseDB     http://www.enterprisedb.com
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
