Re: GIN data corruption bug(s) in 9.6devel

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN data corruption bug(s) in 9.6devel
Date: 2016-04-07 21:22:31
Message-ID: CAMkU=1zYkyw1qYQ-z47zyzdOVmvoLxSSR6ib_e6PHGM1p6jzvg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 6, 2016 at 9:52 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:

> I'm inclining to push v3.1 as one of two winners by size/performance and,
> unlike to pending lock patch, it doesn't change an internal logic of lock
> machinery.

This restricts the memory used by ordinary backends when doing the
cleanup to work_mem. Shouldn't we let them use maintenance_work_mem?
Only one backend can be cleaning a given index at any given time, so
we don't need to worry about many concurrent allocations of
maintenance_work_mem. This seems very similar in spirit to index
creation, where a backend is allowed to use maintenance_work_mem.
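As a toy sketch of the question being raised (illustrative names only;
this is not the actual PostgreSQL source, where the logic lives in the
GIN fast-update code):

```python
def cleanup_memory_budget(is_maintenance, work_mem_kb, maintenance_work_mem_kb):
    """Illustrative only: pick the memory budget for a pending-list cleanup.

    The patch as proposed restricts ordinary backends to work_mem
    (is_maintenance=False).  The question here: since only one backend
    cleans a given index at a time, could both paths safely use
    maintenance_work_mem, as index creation already does?
    """
    return maintenance_work_mem_kb if is_maintenance else work_mem_kb

# Proposed behavior: an ordinary backend gets only work_mem for cleanup.
budget = cleanup_memory_budget(False, 4096, 65536)
```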

Also, do we plan on backpatching this? While there are no known
instances of anyone hitting this bug in any released code, still it is
a bug. There are plenty of reports of corrupted indexes which never
get properly diagnosed, and for all we know some of them might be due
to this. On the other hand, it is a substantial behavior change,
which is scary to backpatch. (But all other proposed solutions also
have behavior changes of one kind or another.)

To summarize the bugs in the already-released code:

If a vacuum's dead list includes a tuple which is still in the pending
list, and the vacuum skips the pending list cleanup because it detects
someone else is already working on it, then the vacuum could remove
that tuple from the table, even though there is still a reference to
it in the index.
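A toy model of that race (purely illustrative, not PostgreSQL code)
shows how the dangling index reference arises:

```python
def vacuum_with_skipped_pending_cleanup(heap, pending_list, dead_tids):
    """Toy model: vacuum removes dead tuples from the heap, but skips the
    pending-list cleanup because another backend is already working on it.
    Any dead TID still in the pending list becomes a dangling index entry."""
    for tid in dead_tids:
        heap.discard(tid)          # tuple physically removed from the table
    # Pending-list cleanup was skipped, so entries for dead TIDs remain;
    # return the index entries that now point at removed tuples.
    return [tid for tid in pending_list if tid not in heap]

heap = {1, 2, 3}
dangling = vacuum_with_skipped_pending_cleanup(heap, pending_list=[2, 3],
                                               dead_tids={2})
# dangling == [2]: the index still references a tuple vacuum removed
```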

A page could be deleted from the pending list, put on the free space
map, then reused. When the linked list of pending pages is followed
onto this reused page, it will fail to recognize it as being deleted
and recycled, and bad things will happen. This is quite likely to
happen in 9.6 because of the new aggressive recycling of pages. But
I've seen no argument as to why it is impossible to hit this bug in
released code; it seems like it would be possible to hit it, just
very rarely.
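The hazard can be sketched with a toy chain walker (again illustrative
only; real pages obviously don't carry a convenient "recycled" flag,
which is exactly the problem):

```python
def follow_pending_chain(pages, start):
    """Toy model: walk the pending-list page chain by 'next' pointers.
    If a page in the chain was deleted, put on the free space map, and
    reused for unrelated data, the walker has no way to notice and
    misinterprets the reused page's contents."""
    visited = []
    pno = start
    while pno is not None:
        page = pages[pno]
        if page.get("recycled"):
            # Real code can't detect this; here we raise to mark the point
            # where the released code would read garbage instead.
            raise RuntimeError("walked onto recycled page %d" % pno)
        visited.append(pno)
        pno = page["next"]
    return visited

# A healthy chain walks to the end; a chain whose tail page was
# recycled leads the walker onto unrelated data.
```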

To summarize the behavior change:

In the released code, an inserting backend that violates the pending
list limit will try to clean the list, even if it is already being
cleaned. It won't accomplish anything useful, but will go through the
motions until eventually it runs into a page the lead cleaner has
deleted, at which point it realizes there is another cleaner and it
stops. This acts as a natural throttle to how fast insertions can
take place into an over-sized pending list.

The proposed change removes that throttle, so that inserters
immediately see there is already a cleaner and just go back about
their business. Without the throttle, backends could add to the
pending list faster than the cleaner can drain it, leading to
unbounded growth in the pending list, and could leave a user backend
apparently unresponsive to the user indefinitely. That is scary to
backpatch.
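A crude simulation (my own toy model, with made-up rates) illustrates
why removing the throttle changes the steady state:

```python
def simulate_pending_list(n_inserters, insert_rate, clean_rate, steps, throttled):
    """Toy model of pending-list growth.  'throttled' mimics the released
    behavior: inserters that find the list over the limit join the cleanup
    and effectively stop inserting for that step.  Unthrottled inserters
    keep adding while a single cleaner drains at clean_rate."""
    size, limit = 0, 100
    for _ in range(steps):
        if throttled and size > limit:
            inserted = 0            # inserters busy going through the motions
        else:
            inserted = n_inserters * insert_rate
        size = max(0, size + inserted - clean_rate)
    return size

# Five inserters adding 50 entries/step against a cleaner draining 20/step:
# throttled, the list hovers near the limit; unthrottled, it grows without
# bound (here, +30 per step, linearly forever).
unbounded = simulate_pending_list(5, 10, 20, 50, throttled=False)
bounded = simulate_pending_list(5, 10, 20, 50, throttled=True)
```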

Cheers,

Jeff
