Re: margay fails assertion in stats/dsa/dsm code

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Marcel Hofstetter <hofstetter(at)jomasoft(dot)ch>
Subject: Re: margay fails assertion in stats/dsa/dsm code
Date: 2022-07-01 23:10:07
Message-ID: CA+hUKGJ31Wce6HJ7xnVTKWjFUWQZPBngxfJVx4q0E98pDr3kAw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 2, 2022 at 1:15 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Changing the default on certain platforms to 'posix' or 'sysv'
> according to what works best on that platform seems reasonable to me.

Ok, I'm going to make that change in 15 + master.

> I agree that defaulting to 'mmap' doesn't seem like a lot of fun,
> although I think it could be a reasonable choice on a platform where
> everything else is broken. You could alternatively try to fix 'posix'
> by adding some kind of code to work around that platform's
> deficiencies. Insert handwaving here.

I don't think that 'posix' mode is salvageable on Solaris, but a new
GUC to control where 'mmap' mode puts its files would be nice. Then
you could set it to '/tmp' (or some other RAM disk), and you'd have
the same end result as shm_open() on that platform, without the lock
problem. Perhaps someone could propose a patch for 16.
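To make the idea concrete, here is a rough sketch in C of what file-backed segments in a configurable directory amount to. It is not PostgreSQL code: the function names, the "sketch_mmap.%u" file name pattern, and the 'dir' parameter (standing in for the proposed GUC) are all made up for illustration, and error reporting is reduced to returning NULL.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a file-backed shared mapping of 'size' bytes
 * in directory 'dir'.  Returns the mapped address, or NULL on failure.
 */
static void *
create_segment_in_dir(const char *dir, unsigned int handle, size_t size)
{
    char        path[1024];
    int         fd;
    void       *addr;

    snprintf(path, sizeof(path), "%s/sketch_mmap.%u", dir, handle);

    /* O_EXCL turns a name collision into EEXIST, much like shm_open(). */
    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return NULL;

    /* Give the file its final size before mapping it. */
    if (ftruncate(fd, (off_t) size) != 0)
    {
        close(fd);
        unlink(path);
        return NULL;
    }

    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping remains valid after close */
    if (addr == MAP_FAILED)
    {
        unlink(path);
        return NULL;
    }
    return addr;
}

/* Hypothetical helper: unmap the segment (if mapped) and remove its file. */
static void
destroy_segment_in_dir(const char *dir, unsigned int handle,
                       void *addr, size_t size)
{
    char        path[1024];

    if (addr != NULL)
        munmap(addr, size);
    snprintf(path, sizeof(path), "%s/sketch_mmap.%u", dir, handle);
    unlink(path);
}
```

With 'dir' pointing at a RAM-backed file system such as '/tmp' on Solaris, the pages would live in memory just as they would with shm_open(), but the name lookup goes through ordinary open() with O_EXCL.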

As for the commit I already made, we can now see the new error:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=margay&dt=2022-07-01%2016%3A00%3A07

2022-07-01 18:25:25.848 CEST [27784:1] ERROR: could not open shared memory segment "/PostgreSQL.499018794": File exists

Unfortunately this particular run crashed anyway, for a new reason:
one backend didn't like the state the new error left the dshash in,
during shmem_exit:

2022-07-01 18:25:25.848 CEST [27738:21] pg_regress/prepared_xacts ERROR: could not open shared memory segment "/PostgreSQL.499018794": File exists
2022-07-01 18:25:25.848 CEST [27738:22] pg_regress/prepared_xacts STATEMENT: SELECT * FROM pxtest1;
TRAP: FailedAssertion("!hash_table->find_locked", File: "dshash.c", Line: 312, PID: 27784)
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'ExceptionalCondition+0x64 [0x1008bb8b0]
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'dshash_detach+0x48 [0x10058674c]
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'pgstat_detach_shmem+0x68 [0x10075e630]
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'pgstat_shutdown_hook+0x94 [0x10075989c]
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'shmem_exit+0x84 [0x100701198]
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'proc_exit_prepare+0x88 [0x100701394]
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'proc_exit+0x4 [0x10070148c]
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'StartBackgroundWorker+0x150 [0x10066957c]
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'maybe_start_bgworkers+0x604 [0x1006717ec]
/home/marcel/build-farm-14/buildroot/HEAD/pgsql.build/tmp_install/home/marcel/build-farm-14/buildroot/HEAD/inst/bin/postgres'sigusr1_handler+0x190 [0x100672510]

So that's an exception-safety problem in dshash, or in pgstat's new
usage thereof, which is arguably independent of Solaris and probably
deserves a new thread. You don't need Solaris to see it; you can just
add some random fault injection.
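
For illustration, here is a toy model of that failure mode in C. It is not dshash code, and every name in it is invented: a flag stands in for hash_table->find_locked, longjmp() stands in for ereport(ERROR), and 'fault_one_in' is a made-up fault-injection knob (0 disables faults, 1 fails every time, N fails roughly one call in N).

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a hash table that tracks whether a lock is held. */
typedef struct sketch_table
{
    bool        find_locked;
} sketch_table;

static jmp_buf error_context;   /* where a simulated ERROR unwinds to */
static int  fault_one_in = 0;   /* hypothetical fault-injection knob */

/* Simulated failure point, e.g. creating a new shared memory segment. */
static void
maybe_fail(void)
{
    if (fault_one_in > 0 && rand() % fault_one_in == 0)
        longjmp(error_context, 1);
}

static void
find_and_lock(sketch_table *t)
{
    t->find_locked = true;
}

static void
release_lock(sketch_table *t)
{
    t->find_locked = false;
}

/*
 * Run one operation that can fail while the lock is held.  Returns true if
 * the table is still marked locked afterwards, which is the state a later
 * detach-time assertion on !find_locked would trip over.
 */
static bool
run_operation(sketch_table *t, bool cleanup_on_error)
{
    if (setjmp(error_context) == 0)
    {
        find_and_lock(t);
        maybe_fail();           /* may unwind before the release below */
        release_lock(t);
    }
    else
    {
        /* Error path: without explicit cleanup the flag stays set. */
        if (cleanup_on_error)
            release_lock(t);
    }
    return t->find_locked;
}
```

With fault_one_in set to 1 and cleanup_on_error false, run_operation() leaves the flag set, which is the shape of the problem in the trace above; with cleanup_on_error true the flag is cleared on the error path. Whether the real fix belongs in dshash or in its caller is a separate question that this toy doesn't answer.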
