Re: CSStorm occurred again by postgreSQL8.2

From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Cc: Katsuhiko Okano <okano(dot)katsuhiko(at)oss(dot)ntt(dot)co(dot)jp>
Subject: Re: CSStorm occurred again by postgreSQL8.2
Date: 2006-08-07 04:36:56
Message-ID: 20060807114525.5266.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > It does not solve, even if it increases the number of NUM_SUBTRANS_BUFFERS.
> > The problem was only postponed.
>
> Can you provide a reproducible test case for this?

This is the reproducible test case:
- Occurs on both 8.1.4 and HEAD.
- On smp machine. I used dual opterons.
CSStrom becomes worse on dual xeon with hyper-threading.
- Tuning parameters are default. Whole data are cached in shared buffers.
(shared_buffers=32MB, data of pgbench (scale=1) are less than 15MB.)
- Using custom pgbench. One client doing UPDATE with indexscan
and multiple clients doing SELECT with seqscan/indexscan.

$ pgbench -i
$ pgbench -n -c 1 -t 100000 -f cs_update.sql &
$ pgbench -n -c 50 -t 100000 -f cs_indexscan.sql &
$ pgbench -n -c 35 -t 100000 -f cs_seqscan.sql &
(The scripts are attached at end of this message.)

In above workload, context switches are 2000-10000/sec and cpu usage is
user=100%. Then, start a long open transaction on another connection.

$ psql
# begin; -- Long open transaction

After a lapse of 30-60 seconds, context switches become 50000/sec over
(120000 over on xeons) and cpu usage is user=66% / sys=21% / idle=13%.
If we increase the frequency of UPDATE, the duration becomes shorter.

This is a human-induced workload, but I can see the same condition in
TPC-W -- even though it is a benchmark. TPC-W requires full-text search
and it is implementd using "LIKE %foo%" in my implementation (DBT-1, too).
Also, it requires periodical aggregations. They might behave as long
transactions.

The cause seems to be a lock contention. The number of locks on
SubtransControlLock and SubTransBuffer are significantly increased
by comparison with BufMappingLocks.

# Before starting a long transaction.
kind | lwlock | sh_call | sh_wait | ex_call | ex_wait
------+---------------------+----------+---------+---------+---------
13 | SubtransControlLock | 28716 | 2 | 54 | 0
22 | BufMappingLock | 11637884 | 0 | 2492 | 0
27 | SubTransBuffer | 0 | 0 | 11 | 0

# After
kind | lwlock | sh_call | sh_wait | ex_call | ex_wait
------+---------------------+----------+---------+---------+---------
13 | SubtransControlLock | 4139111 | 65059 | 3926691 | 390838
22 | BufMappingLock | 32348073 | 0 | 2509 | 0
27 | SubTransBuffer | 939646 | 960341 | 1419152 | 61

The invokers of SubTrans module are two SubTransGetTopmostTransaction()
in HeapTupleSatisfiesSnapshot(). When I disabled the calls, CSStorm did
not occur. SubTransGetTopmostTransaction returns the argument without
change when we don't use SAVEPOINTs.

If we optimize for non-subtransactions, we can avoid to lock SubTrans
for check visiblities of tuples inserted by top transactions.
If we want to resolve the probmen fundamentally, we might have to
improve SubTrans using a better buffer management algorithm or so.

Do you have any idea to avoid such a problem?

-- cs_update.sql
\set naccounts 100000 * :tps
\setrandom aid 1 :naccounts
\setrandom delta -5000 5000
UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT pg_sleep(0.1);

-- cs_seqscan.sql
\set naccounts 100000 * :tps
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid::int8 = :aid; -- cast to force seqscan

-- cs_indexscan.sql
\set naccounts 100000 * :tps
\setrandom aid 1 :naccounts
SELECT abalance FROM accounts WHERE aid = :aid;

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2006-08-07 07:33:56 Re: 8.2 features status
Previous Message Tom Lane 2006-08-07 03:53:43 Re: 'startup waiting' status message

Browse pgsql-patches by date

  From Date Subject
Next Message Hiroshi Saito 2006-08-07 07:52:21 Re: Fixed definition of complicated errcode of ms_vc8
Previous Message Tom Lane 2006-08-07 03:50:06 Re: LWLock statistics collector