Skip site navigation (1) Skip section navigation (2)

Re: Proposal of tunable fix for scalability of 8.4

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Scott Carey <scott(at)richrelevance(dot)com>
Cc: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Proposal of tunable fix for scalability of 8.4
Date: 2009-03-13 03:00:38
Message-ID: alpine.GSO.2.01.0903122152120.16050@westnet.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Thu, 12 Mar 2009, Scott Carey wrote:

> Furthermore, if the problem was due to too much concurrency in the 
> database with active connections, its hard to see how changing the lock 
> code would change the result the way it did ?

What I wonder about is if the locking mechanism is accidentally turning 
into a CPU resource scheduling problem on this benchmark.  If the 
connections were pooled instead, control over that scheduling would be 
more explicit, because connections would more directly map onto physical 
CPUs.  What if the fall-off is because the sum of the working code set 
here is simply exceeding the sum of the CPU caching available once the 
number of active connections gets big enough?  The real problem could be 
that the connections waiting on ProcArray are just falling out of cache, 
such that when they do wake up they take a while to page back in and keep 
going.

I wouldn't actually bet anything on that theory though, or any of the 
others offered here.  I find wandering into performance bottleneck 
analysis presuming you know what's going on to be dangerous.  The bigger 
issue here is that Jignesh is using a configuration known to be 
problematic (lots of connections), which introduces some uncertaintly 
about the true root cause here.  Whether it's well founded or not, it 
still hurts his case.

And to step back for a second, after reading up on it again I see that 
Sun's internal iGen-OLTP benchmark "stresses lock management and 
connectivity"[1], which makes me wonder even more than I did before about 
how specific this fix is to this workload.

[1] http://blogs.sun.com/bmseer/entry/t2000_adds_database_leadership_to

> First just run a test with a tiny delay (5ms? 0?) and fewer users to 
> compare.  If your theory that a connection pooler would help, that test 
> would provide higher throughput with low user count and not be lock 
> limited.

If the symptoms stay the same but are just scaled to a much lower 
connection count, that might help rule out some types of context switching 
and caching problem from the list of most likely suspects.  Might as well 
make it 0ms to minimize the number of connections.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD
>From pgsql-performance-owner(at)postgresql(dot)org  Fri Mar 13 01:23:45 2009
Received: from localhost (unknown [200.46.208.211])
	by mail.postgresql.org (Postfix) with ESMTP id 7C52663584A
	for <pgsql-performance-postgresql(dot)org(at)mail(dot)postgresql(dot)org>; Fri, 13 Mar 2009 01:23:44 -0300 (ADT)
Received: from mail.postgresql.org ([200.46.204.86])
 by localhost (mx1.hub.org [200.46.208.211]) (amavisd-maia, port 10024)
 with ESMTP id 46725-06-4
 for <pgsql-performance-postgresql(dot)org(at)mail(dot)postgresql(dot)org>;
 Fri, 13 Mar 2009 01:23:39 -0300 (ADT)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from westnet.com (westnet.com [216.187.52.2])
	by mail.postgresql.org (Postfix) with ESMTP id 87C6763EF14
	for <pgsql-performance(at)postgresql(dot)org>; Fri, 13 Mar 2009 00:31:06 -0300 (ADT)
Received: from westnet.com (localhost [127.0.0.1])
	by westnet.com (8.14.0/8.14.0) with ESMTP id n2D3V5tP029355;
	Thu, 12 Mar 2009 23:31:05 -0400 (EDT)
Received: from localhost (gsmith(at)localhost)
	by westnet.com (8.14.0/8.13.2/Submit) with ESMTP id n2D3V5QC029351;
	Thu, 12 Mar 2009 23:31:05 -0400 (EDT)
X-Authentication-Warning: westnet.com: gsmith owned process doing -bs
Date: Thu, 12 Mar 2009 23:31:05 -0400 (EDT)
From: Greg Smith <gsmith(at)gregsmith(dot)com>
X-X-Sender: gsmith(at)westnet(dot)com
To: "Jignesh K. Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
cc: Scott Carey <scott(at)richrelevance(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,
        Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>,
        "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Proposal of tunable fix for scalability of 8.4
In-Reply-To: <49B9566C(dot)3010708(at)sun(dot)com>
Message-ID: <alpine(dot)GSO(dot)2(dot)01(dot)0903122322250(dot)16050(at)westnet(dot)com>
References: <C5DE98EC(dot)3381%scott(at)richrelevance(dot)com> <49B9566C(dot)3010708(at)sun(dot)com>
User-Agent: Alpine 2.01 (GSO 1184 2008-12-16)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Virus-Scanned: Maia Mailguard 1.0.1
X-Spam-Status: No, hits=0 tagged_above=0 required=5 tests=none
X-Spam-Level: 
X-Archive-Number: 200903/148
X-Sequence-Number: 33070

On Thu, 12 Mar 2009, Jignesh K. Shah wrote:

> That said the testkit that I am using is a lightweight OLTP typish 
> workload which a user runs against a preknown schema and between various 
> transactions that it does it emulates a wait time of 200ms.

After re-reading about this all again at 
http://blogs.sun.com/jkshah/resource/pgcon_problems.pdf I remembered I 
wanted more info on just what Sun's iGen OLTP does anyway.  Here's a 
collection of published comments on it that assembles into a reasonably 
detailed picture, as long as you're somewhat familiar with what TPC-C 
does:

http://blogs.sun.com/bmseer/entry/t2000_adds_database_leadership_to

"The iGEN-OLTP 1.5 benchmark is a SUN internally developed transaction 
processing database workload. This workload simulates a light-weight 
Global Order System that stresses lock management and connectivity."

http://www.mysqlperformanceblog.com/2008/02/27/a-piece-of-sunmysql-marketing/#comment-246663

"The iGen workload was created from actual customer workloads and has a 
lot more complexity than Sysbench which only test very simple operations 
one at a time. The iGen database consist of 6 tables and its executes a 
combination of light, medium and heavy transactions."

http://www.sun.com/third-party/global/oracle/collateral/T2000_Oracle_iGEN_05-12-06.pdf?null

"The iGEN-OLTP benchmark is a stress and performance test, measuring the 
throughput and simultaneous user connections of an OLTP database workload. 
The iGEN-OLTP workload is based on customer applications and is 
constructed as a 2-tier orders database application where three 
transactions are executed:

  * light read-only query
  * medium read-only query
  * 'heavy' read and insert operation.

The transactions are comprised of various SQL statements: read-only 
selects, joins, update and insert operations.  iGen OLTP avoids problems 
that plague other OTLP benchmarks like TPC-C. TPC-C has problems with only 
using light-weight queries, allowing artificial data partitioning, and 
only testing a few database functions. The iGen transactions take almost 
twice the computation work compared to the TPC-C transactions."

http://blogs.sun.com/ritu/entry/mysql_benchmark_us_t2_beats

"iGen OLTP avoids problems that plague other OTLP benchmarks like TPC-C. 
In particular, it is completely random in table row selections and thus is 
difficult to use artificial optimizations. iGen OLTP stresses process and 
thread creation, process scheduling, and database commit processing...The 
transactions are comprised of various SQL transactions: read-only selects, 
joins, inserts and update operations."

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

pgsql-performance by date

Next:From: sathiya psqlDate: 2009-03-13 08:28:05
Subject: Re: Full statement logging problematic on larger machines?
Previous:From: Robert HaasDate: 2009-03-13 01:29:52
Subject: Re: Proposal of tunable fix for scalability of 8.4

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group