Bug in 9.6 tuplesort batch memory growth logic

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Bug in 9.6 tuplesort batch memory growth logic
Date: 2016-09-05 21:54:54
Message-ID: CAM3SWZRVkuUB68DbAkgw=532gW0f+fofKueAMsY7hVYi68MuYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

While working on my parallel CREATE INDEX patch, I came across a
problem. I initially assumed that what I saw was just a bug in my
development branch. During a final merge in a parallel worker, with
very little maintenance_work_mem, workers attempted to allocate an
amount of memory slightly less than 2 ^ 64, and fail, since that
exceeds MaxAllocHugeSize. I noticed that this happened immediately
following batchmemtuples() leaving us in a LACKMEM() state due to a
fault in its calculation (or, if you prefer, the lack of any defense
against that fault).

Further analysis led me to believe that this is a 9.6 bug. We should
have a check for sane memtuples growth, in line with the
grow_memtuples() check, where "We need to be sure that we do not cause
LACKMEM to become true, else the space management algorithm will go
nuts". Because of the way grow_memtuples() is called by
batchmemtuples() in practice (in particular, because
availMemLessRefund may be < 0 in these corner cases),
grow_memtuples()'s own protections may not save us as previously
assumed. FREEMEM() *takes away* memory from the availMem budget when
"availMemLessRefund < 0", just after the conventional grow_memtuples()
checks run.

Attached patch adds a defense. FWIW, there is no reason to think that
more careful accounting could fix this problem, since in general a
LACKMEM() condition may not immediately lead to tuples being dumped
out.

I haven't been able to reproduce this on 9.6, but that seems
unnecessary. I can reproduce it with my parallel CREATE INDEX patch
applied, with just the right test case and right number of workers
(it's rather delicate). After careful consideration, I can think of no
reason why 9.6 would be unaffected.

--
Peter Geoghegan

Attachment Content-Type Size
0001-Defend-against-faulty-batch-memtuples-calculation.patch text/x-patch 2.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-09-05 22:05:38 Re: Showing parallel status in \df+
Previous Message Steve Singer 2016-09-05 21:35:32 Re: Logical Replication WIP