Re: Potential GIN vacuum bug

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Potential GIN vacuum bug
Date: 2015-08-30 22:24:58
Message-ID: CAMkU=1zk+hE-MEggw3zCrUTSQPu9c8qZiogSbhH0n3Yzmx-S+A@mail.gmail.com
Lists: pgsql-hackers

On Sun, Aug 30, 2015 at 11:11 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
>
> Your earlier point about how the current design throttles insertions to
> keep the pending list from growing without bound seems like a bigger deal
> to worry about. I think we'd like to have some substitute for that.
> Perhaps we could make the logic in insertion be something like
>
> if (pending-list-size > threshold)
> {
>     if (conditional-lock-acquire(...))
>     {
>         do-pending-list-cleanup;
>         lock-release;
>     }
>     else if (pending-list-size > threshold * 2)
>     {
>         unconditional-lock-acquire(...);
>         if (pending-list-size > threshold)
>             do-pending-list-cleanup;
>         lock-release;
>     }
> }
>
> so that once the pending list got too big, incoming insertions would wait
> for it to be cleared. Whether to use a 2x safety margin or something else
> could be a subject for debate, of course.
>

If the goal is to not change existing behavior (as for back-patching), the
margin should be 1: always wait. But we would still have to deal with the
fact that an unconditional acquire attempt by a backend will cause a vacuum
to cancel itself, which is undesirable. If we define a new namespace for
this lock (like the relation extension lock has its own namespace) then
perhaps the cancellation code could be made to not cancel on that
condition. But that too seems like a lot of work to back-patch.
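
To make the effect of the margin concrete, here is a toy model of the
decision in the quoted pseudocode, with the margin as a parameter. It is
plain Python, not PostgreSQL code; the function name and the returned
labels are made up for illustration, and it models only the decision, not
the locking itself.

```python
def insertion_action(pending_size, threshold, margin, lock_available):
    """Return what an inserting backend would do after appending its tuple.

    Hypothetical model of the quoted pseudocode; 'margin' replaces the
    hard-coded factor of 2.
    """
    if pending_size <= threshold:
        return "append_only"        # list is still small, nothing to do
    if lock_available:
        return "cleanup_now"        # conditional acquire succeeded
    if pending_size > threshold * margin:
        return "wait_for_lock"      # unconditional acquire, may block
    return "append_only"            # over threshold but within the margin


# With margin 2, a backend that loses the conditional acquire just moves on
# until the list is more than twice the threshold.
print(insertion_action(15, 10, 2, lock_available=False))
# With margin 1, any backend over the threshold that cannot get the lock waits.
print(insertion_action(11, 10, 1, lock_available=False))
```

With margin 1 the third test is the same as the first, so the final
"append_only" branch is never reached once the list is over the threshold.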

Would we bother to back-patch a theoretical bug which there is no evidence
is triggering in the field? Of course, if people are getting bitten by this,
they probably wouldn't know. You search for "malevolent unicorns", get no
hits, and just assume no rows match, without scouring the table and
seeing it is an index problem. Or if you do realize it is an index
problem, you would probably never trace it back to the cause of the
problem. There are quite a few reports of mysterious index corruptions
which never get resolved.

If we want to improve the current behavior rather than fix a bug, then I
think that if the list is greater than threshold*2 and the cleaning lock is
unavailable, what it should do is proceed to insert the tuple's keys into
the index itself, as if fastupdate = off. That would require some major
surgery to the existing code, because by the time it invokes the cleanup, it
is too late to avoid inserting into the pending list.
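
A toy sketch of that alternative, again in Python rather than PostgreSQL
code, with every name hypothetical: the pending list is a plain list, the
main index is a set, and lock availability is passed in as a flag. The
point it illustrates is only the ordering, that the choice between the two
paths has to be made before anything is appended to the pending list.

```python
class ToyIndex:
    """Hypothetical stand-in for a GIN index with a pending list."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.pending = []   # stand-in for the pending list
        self.main = set()   # stand-in for the main index structure

    def insert(self, keys, lock_available):
        # Decide first: if the list is badly overgrown and nobody can clean
        # it right now, bypass it, as if fastupdate = off.
        overfull = len(self.pending) > self.threshold * 2
        if overfull and not lock_available:
            self.main.update(keys)
            return "direct"

        # Otherwise take the usual fast path through the pending list.
        self.pending.extend(keys)
        if len(self.pending) > self.threshold and lock_available:
            self.main.update(self.pending)   # flush pending entries
            self.pending.clear()
            return "pending+cleanup"
        return "pending"
```

In this model an insertion never waits; the cost is that some tuples take
the slow path while the cleaning lock is held by someone else.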

Cheers,

Jeff
