Re: margay fails assertion in stats/dsa/dsm code

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Marcel Hofstetter <hofstetter(at)jomasoft(dot)ch>
Subject: Re: margay fails assertion in stats/dsa/dsm code
Date: 2022-06-29 10:17:23
Message-ID: CA+hUKGJ6+TRWex2FmgsL3LtznkwtcXrGrF23GhMK4ddqZGF9ww@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 29, 2022 at 4:00 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> I suppose this could indicate that the machine and/or RAM disk is
> overloaded/swapping and one of those open() or unlink() calls is
> taking a really long time, and that could be fixed with some system
> tuning.

Hmm, I take that bit back. Every backend that starts up is trying to
attach to the same segment, the one with the new pgstats stuff in it
(once the small space in the main shmem segment is used up and we
create a DSM segment). There's no fairness/queue, random back-off or
guarantee of progress in that librt lock code, so you can get into
lock-step with other backends retrying, and although some waiter
always gets to make progress, any given backend can lose every round
and run out of retries. Even when you're lucky and don't fail with an
undocumented, incomprehensible error, it's very slow, and I'd
consider filing a bug report about that. A workaround on the
PostgreSQL side would be to set dynamic_shared_memory_type to mmap (= we
just open our own files and map them directly), and to make pg_dynshmem
a symlink to something under /tmp (or some other RAM disk) to avoid
touching regular disk file systems.
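
For illustration only, the workaround could be scripted roughly as follows.
This is a minimal sketch, not a tested procedure: the function name
use_mmap_dsm and both of its arguments are hypothetical, it assumes the
server is stopped while it runs, and it assumes the second argument points
at a RAM-backed file system (for example a directory under /tmp).

```shell
#!/bin/sh
# Hypothetical helper (not part of PostgreSQL): switch a stopped cluster to
# dynamic_shared_memory_type = mmap and point pg_dynshmem at a RAM disk.
use_mmap_dsm() {
    datadir=$1   # data directory of the cluster, e.g. "$PGDATA"
    ramdir=$2    # target directory, assumed to live on a RAM-backed file system

    # Create the target directory and restrict it to the owning user.
    mkdir -p "$ramdir" || return 1
    chmod 700 "$ramdir" || return 1

    # Remove the existing pg_dynshmem (a directory, or an older symlink)
    # and replace it with a symlink to the RAM disk location.
    rm -rf "$datadir/pg_dynshmem" || return 1
    ln -s "$ramdir" "$datadir/pg_dynshmem" || return 1

    # Append the setting; the last occurrence in postgresql.conf wins.
    printf '%s\n' "dynamic_shared_memory_type = mmap" \
        >> "$datadir/postgresql.conf"
}
```

It would be invoked as something like use_mmap_dsm "$PGDATA" /tmp/pg_dynshmem
before starting the server again; the setting only takes effect at server
start.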
