Re: infinite loop in parallel hash joins / DSA / get_best_segment

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: infinite loop in parallel hash joins / DSA / get_best_segment
Date: 2018-09-17 09:12:36
Message-ID: CAEepm=0thg+ja5zGVa7jBy-uqyHrTqTm8HGhEOtMmigGrAqTbw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 17, 2018 at 10:42 AM Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Mon, Sep 17, 2018 at 10:38 AM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > While performing some benchmarks on REL_11_STABLE (at 444455c2d9), I've
> > repeatedly hit an apparent infinite loop on TPC-H query 4. I don't know
> > what exactly are the triggering conditions, but the symptoms are these:
> >
> > ...
>
> Urgh. Thanks Tomas. I will investigate.

Thanks very much to Tomas for giving me access to his benchmarking
machine, where this could be reproduced. Tomas was doing performance
testing without assertions, but with a cassert build I was able to hit
an assertion failure after a while and eventually figure out what was
going wrong. The problem is that the 'segment bins' (linked lists
that group segments by the largest contiguous run of free pages) can
become corrupted. This happens when a segment becomes completely free
and is returned to the operating system, and then the same segment
slot (index number) is recycled, given the right sequence of
allocations and frees and the right timing. There is an LWLock that
protects segment slot and bin manipulations, but there is a kind of
ABA problem where one backend can finish up looking at the defunct
former inhabitant of a slot in which another backend has recently
created a new segment. There is handling for that in the form of
freed_segment_counter, a kind of generation/invalidation signalling,
but there are a couple of paths that fail to check it at the right
times.
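
To make the ABA hazard concrete, here is a toy sketch in C. It is
not the DSA code and not the attached patch: the types and functions
(SharedArea, Backend, lookup_checked and so on) are hypothetical,
locking is left out, and a plain integer id stands in for a mapped
segment. It only shows why a cached view of a slot has to be
revalidated against a "something was freed" counter before it is
trusted.

```c
#include <assert.h>

#define NSLOTS 4

/* Toy stand-in for shared state. In a real system this would live in
 * shared memory and be protected by a lock; none is modelled here. */
typedef struct
{
    int         segment_id[NSLOTS];  /* 0 means the slot is empty */
    unsigned    freed_counter;       /* bumped whenever a segment is freed */
} SharedArea;

/* Toy stand-in for one backend's private view of the slots. */
typedef struct
{
    int         cached_id[NSLOTS];   /* 0 means nothing cached */
    unsigned    seen_freed_counter;  /* freed_counter at our last check */
} Backend;

/* Install a new segment in an empty slot. */
void
create_segment(SharedArea *area, int slot, int id)
{
    assert(area->segment_id[slot] == 0);
    area->segment_id[slot] = id;
}

/* Free a segment and signal that cached views may now be stale. */
void
free_segment(SharedArea *area, int slot)
{
    area->segment_id[slot] = 0;
    area->freed_counter++;
}

/* If anything was freed since we last looked, drop stale cache entries. */
void
check_for_freed(const SharedArea *area, Backend *be)
{
    if (be->seen_freed_counter == area->freed_counter)
        return;
    for (int i = 0; i < NSLOTS; i++)
    {
        if (be->cached_id[i] != area->segment_id[i])
            be->cached_id[i] = 0;
    }
    be->seen_freed_counter = area->freed_counter;
}

/* Buggy path: trusts the cache without consulting the counter. */
int
lookup_unchecked(const SharedArea *area, Backend *be, int slot)
{
    if (be->cached_id[slot] == 0)
        be->cached_id[slot] = area->segment_id[slot];
    return be->cached_id[slot];
}

/* Safe path: revalidates the cache before using it. */
int
lookup_checked(const SharedArea *area, Backend *be, int slot)
{
    check_for_freed(area, be);
    return lookup_unchecked(area, be, slot);
}
```

In this sketch, if a backend caches slot 1 while it holds segment 100,
and the slot is then freed and reused for segment 200,
lookup_unchecked() still reports 100 (the defunct former inhabitant),
while lookup_checked() notices the counter has moved and reports 200.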

With the attached draft patch, Tomas's benchmark script runs happily
for long periods. A bit more study required with fresh eyes,
tomorrow.

--
Thomas Munro
http://www.enterprisedb.com

Attachment: fix-dsa-segment-free-bug.patch (application/octet-stream, 2.6 KB)
