Re: "ERROR: latch already owned" on gharial

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, CM Team <cm(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: "ERROR: latch already owned" on gharial
Date: 2022-05-27 14:21:51
Message-ID: 2643515.1653661311@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, May 27, 2022 at 7:55 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>> Thanks. Hmm. So far it's always a parallel worker. The best idea I
>> have is to include the ID of the mystery PID in the error message and
>> see if that provides a clue next time.

> ... Even if we find a bug in PostgreSQL,
> it's likely to be a bug that only matters on systems nobody cares
> about.

That's possible, certainly. It's also possible that it's a real bug
that so far has only manifested there for (say) timing reasons.
The buildfarm is not so large that we can write off single-machine
failures as being unlikely to hit in the real world.

What I'd suggest is to promote that failure to elog(PANIC), which
would at least give us the PID and if we're lucky a stack trace.
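
To make that concrete, here is a minimal standalone sketch of the idea, not the actual PostgreSQL source. MockLatch, format_owner_message and own_latch_or_panic are hypothetical names invented for illustration, and abort() is only an assumed stand-in for what elog(PANIC, ...) would do in the server (stop hard, so that a core file and stack trace may be available).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared latch; only the owner field matters here. */
typedef struct MockLatch
{
    int owner_pid;              /* 0 means unowned */
} MockLatch;

/* Build the diagnostic text, naming the PID that currently holds the latch. */
static int
format_owner_message(char *buf, size_t buflen, int owner_pid)
{
    return snprintf(buf, buflen, "latch already owned by PID %d", owner_pid);
}

/*
 * Claim the latch for my_pid.  If it is already owned, report the owner's
 * PID and abort().  abort() is an assumption standing in for
 * elog(PANIC, ...); it is not what the server itself calls.
 */
static void
own_latch_or_panic(MockLatch *latch, int my_pid)
{
    int owner = latch->owner_pid;   /* read once so the report matches the check */

    if (owner != 0)
    {
        char msg[64];

        format_owner_message(msg, sizeof(msg), owner);
        fprintf(stderr, "PANIC: %s\n", msg);
        abort();
    }
    latch->owner_pid = my_pid;
}
```

The owner is copied into a local before the test so that the PID printed is the same value that triggered the failure, even if the field changes concurrently.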

regards, tom lane
