Re: Performance query about large tables, lots of concurrent access

From: Karl Wright <kwright(at)metacarta(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance query about large tables, lots of concurrent access
Date: 2007-06-18 19:59:30
Message-ID: 4676E422.8090406@metacarta.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Karl Wright wrote:
> Hi,
>
> I have an application which really exercises the performance of
> postgresql in a major way, and I am running into a performance
> bottleneck with Postgresql 8.1 that I do not yet understand.
>
> Here are the details:
>
> - There is a primary table, with some secondary tables
> - The principle transaction consists of a "SELECT...FOR UPDATE",
> followed by either an INSERT or an UPDATE on the primary table
> - INSERTs, DELETEs, and UPDATEs may occur on the secondary table
> depending on what happens with the primary table, for any given
> transaction. The secondary table has about 10x the number of rows as
> the primary.
> - All operations are carefully chosen so that highly discriminatory
> indexes are used to locate the record(s) in question. The execution
> plans show INDEX SCAN operations being done in all cases.
> - At any given time, there are up to 100 of these operations going on at
> once against the same database.
>
> What I am seeing:
>
> - In postgresql 7.4, the table activity seems to be gated by locks, and
> runs rather slowly except when the sizes of the tables are small.
> - In postgresql 8.1, locks do not seem to be an issue, and the activity
> runs about 10x faster than for postgresql 7.4.
> - For EITHER database version, the scaling behavior is not the log(n)
> behavior I'd expect (where n is the number of rows in the table), but
> much more like linear performance. That is, as the tables grow,
> performance drops off precipitously. For a primary table size up to
> 100,000 rows or so, I get somewhere around 700 transactions per minute,
> on average. Between 100,000 and 1,000,000 rows I got some 150
> transactions per minute. At about 1,500,000 rows I get about 40
> transactions per minute.
> - Access to a row in the secondary table (which right now has 13,000,000
> rows in it) via an index that has extremely good discriminatory ability
> on a busy machine takes about 90 seconds elapsed time at the moment -
> which I feel is pretty high.
>
> I tried increasing the shared_buffers parameter to see if it had any
> impact on overall throughput. It was moderately helpful going from the
> small default value up to 8192, but less helpful when I increased it
> beyond that. Currently I have it set to 131072.
>
> Question: Does anyone have any idea what bottleneck I am hitting? An
> index's performance should in theory scale as the log of the number of
> rows - what am I missing here?
>
> Thanks very much!
> Karl
>

I suppose I should also have noted that the postgresql processes that
are dealing with the transactions seem to be CPU bound. Here's a "top"
from the running system:

top - 15:58:50 up 4 days, 4:45, 1 user, load average: 17.14, 21.05, 22.46
Tasks: 194 total, 15 running, 177 sleeping, 0 stopped, 2 zombie
Cpu(s): 98.4% us, 1.5% sy, 0.0% ni, 0.0% id, 0.1% wa, 0.0% hi, 0.0% si
Mem: 16634256k total, 16280244k used, 354012k free, 144560k buffers
Swap: 8008360k total, 56k used, 8008304k free, 15071968k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15966 postgres 18 0 1052m 1.0g 1.0g R 66.5 6.3 0:18.64 postmaster
14683 postgres 17 0 1053m 1.0g 1.0g R 54.9 6.3 0:17.90 postmaster
17050 postgres 15 0 1052m 93m 90m S 50.3 0.6 0:06.42 postmaster
16816 postgres 18 0 1052m 166m 162m R 46.3 1.0 0:04.80 postmaster
16697 postgres 18 0 1052m 992m 988m R 42.3 6.1 0:15.49 postmaster
17272 postgres 16 0 1053m 277m 273m S 30.8 1.7 0:09.91 postmaster
16659 postgres 16 0 1052m 217m 213m R 29.8 1.3 0:06.60 postmaster
15509 postgres 18 0 1052m 1.0g 1.0g R 23.2 6.4 0:26.72 postmaster
16329 postgres 18 0 1052m 195m 191m R 16.9 1.2 0:05.54 postmaster
14019 postgres 20 0 1052m 986m 983m R 16.5 6.1 0:16.50 postmaster
17002 postgres 18 0 1052m 38m 35m R 12.6 0.2 0:02.98 postmaster
16960 postgres 15 0 1053m 453m 449m S 3.3 2.8 0:10.39 postmaster
16421 postgres 15 0 1053m 1.0g 1.0g S 2.3 6.2 0:23.59 postmaster
13588 postgres 15 0 1052m 1.0g 1.0g D 0.3 6.4 0:47.89 postmaster
24708 root 15 0 2268 1136 836 R 0.3 0.0 0:05.92 top
1 root 15 0 1584 520 452 S 0.0 0.0 0:02.08 init

Karl

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Rodrigo De León 2007-06-18 21:17:59 Re: Postgres VS Oracle
Previous Message Karl Wright 2007-06-18 19:30:31 Performance query about large tables, lots of concurrent access