strange parallel query behavior after OOM crashes

From:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	strange parallel query behavior after OOM crashes
Date:	2017-03-30 18:38:55
Message-ID:	6dd5675f-ef4c-fb3c-3b0c-c2a759fd631e@2ndquadrant.com
Lists:	pgsql-hackers

Hi,

While doing some benchmarking, I've run into a fairly strange issue with
OOM breaking LaunchParallelWorkers() after the restart. What I see
happening is this:

1) a query is executed, and at the end of LaunchParallelWorkers we get

nworkers=8 nworkers_launched=8

2) the query does a Hash Aggregate, but ends up eating much more memory
than planned due to an n_distinct underestimate (see [1] from 2015 for
details), and gets killed by the OOM killer

3) the server restarts, the query is executed again, but this time
LaunchParallelWorkers reports

nworkers=8 nworkers_launched=0

There's nothing else running on the server, and there definitely should
be free parallel workers.

4) The query gets killed again, and on the next execution we get

nworkers=8 nworkers_launched=8

again, although not always. I wonder whether the exact impact depends on
OOM killing the leader or worker, for example.
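
For anyone trying to reproduce this, here is a minimal sketch of how the
sequence above could be spotted in collected debug output. It assumes the
lines look exactly like the "nworkers=... nworkers_launched=..." lines
quoted above (that is ad-hoc debug output, not a stock PostgreSQL log
message), and find_short_launches is a hypothetical helper name.

```python
import re

# Assumed line format, copied from the debug output quoted in this message.
# This is ad-hoc instrumentation, not a standard PostgreSQL log message.
LINE_RE = re.compile(r"nworkers=(\d+)\s+nworkers_launched=(\d+)")


def find_short_launches(log_lines):
    """Return (line_number, requested, launched) for every launch that
    started fewer workers than it requested."""
    short = []
    for lineno, line in enumerate(log_lines, start=1):
        match = LINE_RE.search(line)
        if match is None:
            continue  # unrelated log line
        requested = int(match.group(1))
        launched = int(match.group(2))
        if launched < requested:
            short.append((lineno, requested, launched))
    return short


if __name__ == "__main__":
    # The three executions described in steps 1, 3 and 4.
    sample = [
        "nworkers=8 nworkers_launched=8",
        "nworkers=8 nworkers_launched=0",
        "nworkers=8 nworkers_launched=8",
    ]
    for lineno, requested, launched in find_short_launches(sample):
        print(f"line {lineno}: requested {requested}, launched {launched}")
```

Run against the three executions described above, only the second one
(after the first restart) would be flagged.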

regards

[1]
https://www.postgresql.org/message-id/flat/CAFWGqnsxryEevA5A_CqT3dExmTaT44mBpNTy8TWVsSVDS71QMg%40mail.gmail.com

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
