Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query

From: Maksim Milyutin <milyutinma(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, sk(at)zsrv(dot)org, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query
Date: 2019-01-16 14:23:01
Message-ID: 458c691a-1b2d-7a70-1c04-58361b894011@gmail.com
Lists: pgsql-bugs


On 1/11/19 8:16 AM, Thomas Munro wrote:
> On Fri, Jan 11, 2019 at 10:04 AM Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> I am trying to reproduce it.
> I ran squillions of concurrent parallel queries today, making sure
> they would allocate and free entire segments a lot (based on the
> assumption that something along those lines must be required). I made
> sure to use Linux, GCC, -O2, no asserts (since both reports came from
> that environment, and I'd previously failed to reproduce this on my
> usual tool chain/platforms), and I used a multi-socket box (in case
> some cache coherency effect was not occurring on my development
> laptops). I did learn some interesting things about parallel query
> performance that I plan to follow up on, but I had no luck in
> reproducing this error. Rats.
>
> One observation is that the other report involved a fairly simple
> Parallel Hash Join query (on 11), and this report involves Parallel
> Index Scan and Parallel Bitmap Index Scan (on 10), so that suggests
> that it's probably not a bug in the parallel executor code (for
> example an access-after-free, whose undefined behaviour could in
> theory look like this with unlucky timing, I speculate) but rather
> something lower level.

Recently one of our customers encountered the same problem on PG 11.1.
The query structure and error messages are similar to the ones described
in [1].

OS/PG settings:

Ubuntu 16.04

mem 128 GB
cpu 16 cores
huge_pages is turned off

# sudo cat /proc/sys/kernel/shmmax
18446744073692774399

max_connections = 200
shared_buffers = 33GB
dynamic_shared_memory_type = posix
max_worker_processes = 16
max_parallel_maintenance_workers = 6
max_parallel_workers_per_gather = 6
max_parallel_workers = 12

The query was:

SELECT
  ev.date,
  r.indicator_id,
  r.service_reg_id,
  count(r.value),
  count(CASE WHEN r.value::int > 3 THEN 1 ELSE NULL END),
  avg(r.value::int)
FROM base.hershel_events AS ev
  INNER JOIN base.hershel_rates AS r ON r.event_id = ev.id
WHERE ev.date between '2018-09-15' AND '2018-12-15'
GROUP BY ev.date, r.indicator_id, r.service_reg_id
ORDER BY ev.date ASC

Its plan with record statistics:

 Finalize GroupAggregate (actual rows=81332937 loops=1)
   Group Key: ev.date, r.indicator_id, r.service_reg_id
   ->  Gather Merge (actual rows=81332937 loops=1)
         Workers Planned: 7
         Workers Launched: 7
         ->  Partial GroupAggregate (actual rows=10166617 loops=8)
               Group Key: ev.date, r.indicator_id, r.service_reg_id
               ->  Sort (actual rows=10166617 loops=8)
                     Sort Key: ev.date, r.indicator_id, r.service_reg_id
                     Sort Method: external merge  Disk: 549392kB
                     Worker 0:  Sort Method: external merge Disk: 516104kB
                     Worker 1:  Sort Method: external merge Disk: 533200kB
                     Worker 2:  Sort Method: external merge Disk: 540680kB
                     Worker 3:  Sort Method: external merge Disk: 540760kB
                     Worker 4:  Sort Method: external merge Disk: 530512kB
                     Worker 5:  Sort Method: external merge Disk: 472240kB
                     Worker 6:  Sort Method: external merge Disk: 539256kB
                     ->  Parallel Hash Join (actual rows=10166617 loops=8)
                           Hash Cond: (r.event_id = ev.id)
                            ->  Parallel Seq Scan on hershel_rates r (actual rows=15000000 loops=8)
                            ->  Parallel Hash (actual rows=9319175 loops=8)
                                  Buckets: 4194304  Batches: 32  Memory Usage: 124064kB
                                  ->  Parallel Seq Scan on hershel_events ev (actual rows=9319175 loops=8)
                                        Filter: ((date >= '2018-09-15'::date) AND (date <= '2018-12-15'::date))
                                        Rows Removed by Filter: 4430825

At some point one of the parallel workers emitted the error message
"dsa_area could not attach to segment" to the log. After all the workers
were terminated (the leader hit the same error), the last one began
logging messages of the form: LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp5027.0.sharedfileset/i55of128.p1.0", size
458752. The total size of the temp files mentioned in the log was
about *14GB*. Eventually that last worker hit the fatal error "cannot
unpin a segment that is not pinned". After that,
DynamicSharedMemoryControlLock was left unreleased and new backends
soon hung on it.
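For what it's worth, the "cannot unpin a segment that is not pinned" message points at a pin-count invariant being violated, i.e. two views of the same segment's state disagreeing. A toy model of just that invariant (this is NOT PostgreSQL's dsm.c, only an illustration of the check that fired):

```python
# Toy model of the DSM pin/unpin invariant. A real segment's pinned
# flag lives in shared control state; here a plain attribute stands
# in for it to show when the error condition is hit.

class DsmSegment:
    def __init__(self, handle):
        self.handle = handle
        self.pinned = False

    def pin(self):
        if self.pinned:
            raise RuntimeError("cannot pin a segment that is already pinned")
        self.pinned = True

    def unpin(self):
        # The FATAL error corresponds to this branch: the bookkeeping
        # says the segment was never pinned, yet someone tries to
        # unpin it -- e.g. after a double unpin or corrupted state.
        if not self.pinned:
            raise RuntimeError("cannot unpin a segment that is not pinned")
        self.pinned = False

seg = DsmSegment(1948558832)
seg.pin()
seg.unpin()
try:
    seg.unpin()          # second unpin: state already says "not pinned"
except RuntimeError as e:
    print(e)             # cannot unpin a segment that is not pinned
```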

On another occasion some backend (most likely a utility worker) hit the
error "could not open shared memory segment "/PostgreSQL.1948558832":
No such file or directory", after which other backends blocked on
DynamicSharedMemoryControlLock. Parallel aggregate queries had been
running before this happened.
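That "No such file or directory" is what attaching to an already-removed named POSIX segment looks like from user space: shm_open() of a name that has been unlinked fails with ENOENT. A minimal reproduction of only that OS-level symptom (using Python's shared_memory wrapper, not PostgreSQL code; the segment name is made up):

```python
from multiprocessing import shared_memory

# Create a named POSIX segment, unlink it (removes /dev/shm/demo_seg),
# then try to attach again. The second attach fails with ENOENT, the
# same errno behind "could not open shared memory segment ...:
# No such file or directory".
seg = shared_memory.SharedMemory(name="demo_seg", create=True, size=4096)
seg.close()
seg.unlink()   # name is gone; any later attach by name will fail

try:
    shared_memory.SharedMemory(name="demo_seg", create=False)
except FileNotFoundError as e:
    print("attach failed:", e)
```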

Unfortunately, our customer doesn't want to risk hitting this error on
the production system again and has turned off query parallelism, and
the issue doesn't reproduce in the test environment. AFAICS this is a
heisenbug that only shows up under heavy load. I'll try to run some
stress tests to capture a stack trace.

[1]
https://www.postgresql.org/message-id/flat/20181231221734.GB25379%40telsasoft.com

--
Regards, Maksim Milyutin
Postgres Professional: http://www.postgrespro.ru
The Russian Postgres Company
