Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
Date: 2023-01-30 05:36:50
Message-ID: 20230130053650.qlwcadapke2xot2c@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-01-30 15:22:34 +1300, Thomas Munro wrote:
> On Mon, Jan 30, 2023 at 6:26 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > out-of-order hazard
>
> I've been trying to understand how that could happen, but my CPU-fu is
> weak. Let me try to write an argument for why it can't happen, so
> that later I can look back at how stupid and naive I was. We have A
> B, and if the CPU sees no dependency and decides to execute B A
> (pipelined), shouldn't an interrupt either wait for the whole
> schemozzle to commit first (if not in a hurry), or nuke it, handle the
> IPI and restart, or something?

In a core local view, yes, I think so. But I don't think that's how it can
work on multi-core, and even more so, multi-socket machines. Imagine how it'd
influence latency if every interrupt on any CPU would prevent all out-of-order
execution on any CPU.

> After an hour of reviewing random
> slides from classes on out-of-order execution and reorder buffers and
> the like, I think the term for making sure that interrupts run with
> the illusion of in-order execution maintained is called "precise
> interrupts", and it is expected in all modern architectures, after the
> early OoO pioneers lost their minds trying to program without it. I
> guess generally you want that because it would otherwise run your
> interrupt handler in a completely uncertain environment, and
> specifically in this case it would reach our signal handler which
> reads A's output (waiting) and writes to B's input (is_set), so B IPI
> A surely shouldn't be allowed?

Userspace signals aren't delivered synchronously during hardware interrupts
afaik - and I don't think they even possibly could be (after all the process
possibly isn't scheduled).

I think what you're talking about with precise interrupts above is purely
about the single-core view, and mostly about hardware interrupts for faults
etc. The CPU will unwind state from speculatively executed code etc on
interrupt, sure - but I think that's separate from guaranteeing that you can't
have stale cache contents *due to work by another CPU*.

I'm not even sure that userspace signals are generally delivered via an
immediate hardware interrupt, or whether they're processed at the next
scheduler tick. After all, we know that multiple signals are coalesced, which
certainly isn't compatible with synchronous execution. But it could be that
that just happens when the target of a signal is not currently scheduled.

> Maybe it's a much dumber sort of a concurrency problem: stale cache
> line due to missing barrier, but... commit db0f6cad488 made us also
> set our own latch (a second time) when someone sets our latch in
> releases 9.something to 13.

But this part does indeed put a crimp in some potential theories.

TBH, I'd be in favor of just adding the barriers for good measure, even if we
don't know if it's a live bug today - it seems incredibly fragile.
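
For illustration only, here is a minimal sketch of the kind of barrier pairing being discussed. It is not the actual latch code. It assumes a simplified model with just the two flags mentioned above (is_set and waiting), uses C11 atomics rather than PostgreSQL's own barrier primitives, and stands in a hypothetical wakeups counter for the real wakeup signal. All Sketch*/sketch_* names are invented for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified latch: two flags plus a counter standing in
 * for "send a wakeup signal to the waiter". */
typedef struct SketchLatch
{
    atomic_bool is_set;     /* written by the setter, read by the waiter */
    atomic_bool waiting;    /* written by the waiter, read by the setter */
    atomic_int  wakeups;    /* stand-in for the real wakeup mechanism */
} SketchLatch;

void
sketch_init(SketchLatch *l)
{
    assert(l != NULL);
    atomic_init(&l->is_set, false);
    atomic_init(&l->waiting, false);
    atomic_init(&l->wakeups, 0);
}

/* Setter side: store is_set, full barrier, then load waiting. */
void
sketch_set_latch(SketchLatch *l)
{
    atomic_store_explicit(&l->is_set, true, memory_order_relaxed);
    /* Without this fence the load below may be satisfied before the
     * store above becomes visible to the waiter. */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&l->waiting, memory_order_relaxed))
        atomic_fetch_add_explicit(&l->wakeups, 1, memory_order_relaxed);
}

/* Waiter side: store waiting, full barrier, then load is_set.
 * Returns true if the caller may go to sleep. */
bool
sketch_should_sleep(SketchLatch *l)
{
    atomic_store_explicit(&l->waiting, true, memory_order_relaxed);
    /* Pairs with the fence in sketch_set_latch(). */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&l->is_set, memory_order_relaxed))
    {
        /* Latch already set: do not sleep. */
        atomic_store_explicit(&l->waiting, false, memory_order_relaxed);
        return false;
    }
    return true;
}
```

With a full fence between the store and the load on both sides, at least one side must observe the other's store: either the setter sees waiting and issues a wakeup, or the waiter sees is_set and skips the sleep. Remove either fence and both loads can return false, which is the lost-wakeup case.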

Greetings,

Andres Freund
