Re: Minmax indexes

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Minmax indexes
Date: 2014-08-08 09:01:44
Message-ID: 53E491F8.3080004@vmware.com
Lists: pgsql-hackers

I think there's a race condition in mminsert, if two backends insert a
tuple into the same heap page range concurrently. mminsert does this:

1. Fetch the MMtuple for the page range.
2. Check if any of the stored datums need updating.
3. Unlock the page.
4. Lock the page again in exclusive mode.
5. Update the tuple.

It's possible that two backends arrive at step 3 at the same time, with
different values. For example, backend A wants to update the minimum to
10, and backend B wants to update it to 5. Now, if backend B gets to
update the tuple first, to 5, backend A will overwrite it with 10 when
it gets the lock, which is wrong.
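
To make the interleaving concrete, here is a toy replay of it in Python.
This is only a sketch, not the patch's code: the summary tuple is reduced
to a single minimum, the starting value of 20 is made up, and the names
needs_update and blind_update are hypothetical.

```python
# Toy model of the interleaving described above; not PostgreSQL code.
# The "index tuple" is reduced to a single stored minimum.

def needs_update(stored_min, new_value):
    """Step 2: decide whether the stored minimum must be lowered."""
    return new_value < stored_min

def blind_update(summary, new_value):
    """Step 5 as described: overwrite without re-reading the stored value."""
    summary["min"] = new_value

summary = {"min": 20}  # assumed starting minimum, for illustration only

# Steps 1-3: both backends read the same tuple and both decide to update.
a_wants_update = needs_update(summary["min"], 10)  # backend A
b_wants_update = needs_update(summary["min"], 5)   # backend B

# Steps 4-5: B happens to get the exclusive lock first, then A.
if b_wants_update:
    blind_update(summary, 5)
if a_wants_update:
    blind_update(summary, 10)

lost_update_min = summary["min"]
print(lost_update_min)  # 10, although a heap tuple with value 5 exists
```

The stored minimum ends up larger than a value that is actually in the
page range, so a scan looking for 5 could wrongly skip the range.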

The simplest solution would be to get the buffer lock in exclusive mode
to begin with, so that you don't need to release it between steps 2 and
5. That might be a significant hit on concurrency, though, when most of
the insertions don't in fact have to update the value. Another idea is
to re-check the stored values after acquiring the lock in exclusive
mode, to see if they still match the values that were read in step 2.
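
A rough sketch of the re-check idea, again as a toy model in Python
rather than actual mminsert code. It assumes that on a mismatch the
comparison is simply redone against the current value; the function
names and the starting minimum of 20 are hypothetical.

```python
# Toy model of "re-check after taking the exclusive lock"; not PostgreSQL code.

def needs_update(stored_min, new_value):
    """Decide whether the stored minimum must be lowered."""
    return new_value < stored_min

def update_with_recheck(summary, seen_min, new_value):
    """Meant to run with the exclusive lock held.

    seen_min is the value the backend read before releasing the lock.
    Returns True if the tuple was changed.
    """
    if summary["min"] != seen_min:
        # Someone else updated the tuple in between: redo the comparison
        # against the current value (assumed handling of a mismatch).
        if not needs_update(summary["min"], new_value):
            return False
    summary["min"] = new_value
    return True

summary = {"min": 20}  # assumed starting minimum, for illustration only

# Both backends read the tuple and decide they need to update it.
a_seen = summary["min"]  # backend A, wants 10
b_seen = summary["min"]  # backend B, wants 5

# B gets the exclusive lock first, then A.
b_changed = update_with_recheck(summary, b_seen, 5)
a_changed = update_with_recheck(summary, a_seen, 10)

print(summary["min"], b_changed, a_changed)  # 5 True False
```

With the re-check, backend A notices the tuple changed under it and
leaves the smaller minimum alone, while the common case of no update
still never takes the exclusive lock.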

- Heikki
