From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: stress test for parallel workers |
Date: | 2019-07-23 22:11:58 |
Message-ID: | 32179.1563919918@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> *I suspect that the only thing implicating parallelism in this failure
> is that parallel leaders happen to print out that message if the
> postmaster dies while they are waiting for workers; most other places
> (probably every other backend in your cluster) just quietly exit.
> That tells us something about what's happening, but on its own doesn't
> tell us that parallelism plays an important role in the failure mode.
I agree that there's little evidence implicating parallelism directly.
The reason I'm suspicious about a possible OOM kill is that parallel
queries would appear to the OOM killer to be eating more resources
than the same workload non-parallel, so that we might be at more
hazard of getting OOM'd just because of that.
A different theory is that there's some hard-to-hit bug in the
postmaster's processing of parallel workers that doesn't apply to
regular backends. I've looked for one in a desultory way but not
really focused on it.
In any case, the evidence from the buildfarm is pretty clear that
there is *some* connection. We've seen a lot of recent failures
involving "postmaster exited during a parallel transaction", while
the number of postmaster failures not involving that is epsilon.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2019-07-23 22:16:35 | Re: pgbench - allow to create partitioned tables |
Previous Message | Fabien COELHO | 2019-07-23 22:06:58 | Re: make \d pg_toast.foo show its indices ; and, \d toast show its main table ; and \d relkind=I show its partitions |