Re: MultiXact\SLRU buffers configuration

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MultiXact\SLRU buffers configuration
Date: 2020-11-10 18:07:07
Message-ID: 35862787-8b4d-a290-789e-6e12dc6527e8@enterprisedb.com
Lists: pgsql-hackers

On 11/10/20 7:16 AM, Andrey Borodin wrote:
>
>
>> On 10 Nov 2020, at 05:13, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>> After the issue reported in [1] got fixed, I've restarted the multi-xact
>> stress test, hoping to reproduce the issue. But so far no luck :-(
>
>
> Tomas, many thanks for looking into this. I figured out that to make multixact sets bigger, transactions must hang for a while and lock a large set of tuples, but not a contiguous range, to avoid locking on buffer_content.
> I did not manage to implement this via pgbench, which is why I was trying to hack on a separate Go program. But, essentially, no luck there either.
> I was observing something similar, though:
>
> Friday, 8 May 2020 15:08:37 (every 1s)
>
>   pid  |         wait_event         | wait_event_type | state  | query
> -------+----------------------------+-----------------+--------+----------------------------------------------------
>  41344 | ClientRead                 | Client          | idle   | insert into t1 select generate_series(1,1000000,1)
>  41375 | MultiXactOffsetControlLock | LWLock          | active | select * from t1 where i = ANY ($1) for share
>  41377 | MultiXactOffsetControlLock | LWLock          | active | select * from t1 where i = ANY ($1) for share
>  41378 |                            |                 | active | select * from t1 where i = ANY ($1) for share
>  41379 | MultiXactOffsetControlLock | LWLock          | active | select * from t1 where i = ANY ($1) for share
>  41381 |                            |                 | active | select * from t1 where i = ANY ($1) for share
>  41383 | MultiXactOffsetControlLock | LWLock          | active | select * from t1 where i = ANY ($1) for share
>  41385 | MultiXactOffsetControlLock | LWLock          | active | select * from t1 where i = ANY ($1) for share
> (8 rows)
>
> but this picture was not stable.
>

Seems we haven't made much progress in reproducing the issue :-( I guess
we'll need to know more about the machine where this happens. Is there
anything special about the hardware/config? Are you monitoring the size
of the pg_multixact directory?
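
In case it helps, tracking that can be as simple as periodically summing
the file sizes under pg_multixact (including its offsets and members
subdirectories). A minimal sketch in Python; the function name and the
way the path is passed in are illustrative assumptions, not an existing
tool:

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except FileNotFoundError:
                # A segment file may be removed between listing and stat.
                pass
    return total
```

Calling it as dir_size_bytes("<data directory>/pg_multixact") once a
minute and logging the result with a timestamp should be enough to see
whether the directory grows during the problematic periods.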

> How do you collect wait events for aggregation? Just insert into some table with cron?
>

No, I have a simple shell script (attached) sampling data from
pg_stat_activity regularly. Then I load it into a table and aggregate to
get a summary.
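
For anyone who prefers not to use the shell script, the same idea can be
sketched in Python. This is not the attached script: the query, the
psql options, and the names sample_once and summarize are assumptions
made for illustration, and the aggregation happens in memory instead of
in a table.

```python
import subprocess
from collections import Counter

# Assumed query; the attached script may select different columns.
SAMPLE_SQL = (
    "SELECT coalesce(wait_event_type, ''), coalesce(wait_event, '') "
    "FROM pg_stat_activity WHERE state = 'active'"
)


def sample_once(dbname="postgres"):
    """Take one sample of pg_stat_activity via psql; return raw lines."""
    out = subprocess.run(
        ["psql", "-X", "-A", "-t", "-F", "|", "-d", dbname, "-c", SAMPLE_SQL],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def summarize(lines):
    """Count samples per (wait_event_type, wait_event), most common first."""
    counts = Counter()
    for line in lines:
        wait_type, _, wait_event = line.partition("|")
        # Backends that are not waiting have empty fields; show them as "-".
        counts[(wait_type or "-", wait_event or "-")] += 1
    return counts.most_common()
```

Calling sample_once() once per second, appending the returned lines to
a list, and passing that list to summarize() at the end gives a rough
wait event profile for the sampled period.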

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
collect-wait-events.sh application/x-shellscript 234 bytes
