Re: SKIP LOCKED DATA (work in progress)

From: Thomas Munro <munro(at)ip9(dot)org>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SKIP LOCKED DATA (work in progress)
Date: 2014-09-14 22:30:27
Message-ID: CADLWmXXss83oiYD0pn_SfQfg+yNEpPbPvgDb8w6Fh--jScSybA@mail.gmail.com
Lists: pgsql-hackers

On 12 September 2014 03:56, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Thomas Munro wrote:
>> But to reach the case you mentioned, it would need to get past that
>> (xmax is not a valid transaction) but then the tuple would need to be
>> locked by another session before heap_lock_tuple is called a few lines
>> below. That's a race scenario that I don't believe we can create
>> using advisory lock tricks in an isolation test.
>
> Hm, are you able to reproduce it using GDB?
>
> Craig Ringer was saying elsewhere that there are other cases that are
> impossible to test reliably and was proposing adding hooks or
> something to block backends at convenient times. Not an easy problem ...

+1, I think that is a great idea.

FWIW here's some throwaway code that I used to do that:

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 79667f1..fbb3b55 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -54,6 +54,7 @@
 #include "storage/lmgr.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
+#include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/snapmgr.h"
@@ -2029,6 +2030,20 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
 			}

 			/*
+			 * Begin wait point debugging hack...
+			 * TODO: Only in a special build mode...
+			 * We tell anyone waiting that we have reached wait point #42.
+			 * We wait for permission to proceed from wait point #43.
+			 */
+			elog(WARNING, "XXX reached point 42, waiting at point 43");
+			DirectFunctionCall1(pg_advisory_unlock_int8, Int64GetDatum(42));
+			DirectFunctionCall1(pg_advisory_lock_int8, Int64GetDatum(43));
+			elog(WARNING, "XXX continuing after point 43");
+			/*
+			 * End wait point debugging hack.
+			 */
+
+			/*
 			 * This is a live tuple, so now try to lock it.
 			 */
 			test = heap_lock_tuple(relation, &tuple,

Using the attached isolation spec, that race case is reached. Yeah,
it's crude and confusing having those three advisory locks (one to
allow an update chain to be created after s1 takes a snapshot, and the
other two so that s2 can block s1 at the right point to produce that
race case), but I found this less messy than trying to reproduce
complicated concurrency scenarios with GDB.
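
To make the 42/43 handshake a bit more concrete, here is a minimal
standalone sketch of the same ordering using two threads in place of
two sessions. It is not PostgreSQL code: the Gate type, gate_release,
gate_wait, session1, session2 and run_handshake are all hypothetical
names, and a mutex plus condition variable stands in for an advisory
lock that starts out held. It only illustrates how "signal that I
reached a point, then wait for permission" forces the interleaving.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical stand-in for an advisory lock that starts out held. */
typedef struct
{
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             open;       /* 0 = still held, 1 = released */
} Gate;

static Gate gate42;             /* released by s1 when it reaches the point */
static Gate gate43;             /* released by s2 to let s1 proceed */
static char event_log[128];     /* records the order in which steps ran */

static void
gate_init(Gate *g)
{
    pthread_mutex_init(&g->mutex, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->open = 0;
}

static void
gate_release(Gate *g)
{
    pthread_mutex_lock(&g->mutex);
    g->open = 1;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->mutex);
}

static void
gate_wait(Gate *g)
{
    pthread_mutex_lock(&g->mutex);
    while (!g->open)
        pthread_cond_wait(&g->cond, &g->mutex);
    pthread_mutex_unlock(&g->mutex);
}

/* s1: the session that has been patched with the wait point. */
static void *
session1(void *arg)
{
    (void) arg;
    strcat(event_log, "s1 reached point 42;");
    gate_release(&gate42);      /* tell anyone waiting we got here */
    gate_wait(&gate43);         /* wait for permission to continue */
    strcat(event_log, "s1 continues after point 43;");
    return NULL;
}

/* s2: the session that creates the race while s1 is parked. */
static void *
session2(void *arg)
{
    (void) arg;
    gate_wait(&gate42);         /* block until s1 is at the wait point */
    strcat(event_log, "s2 locks the tuple;");
    gate_release(&gate43);      /* now let s1 run into the locked tuple */
    return NULL;
}

/* Runs both sessions once and returns the recorded order of events. */
static const char *
run_handshake(void)
{
    pthread_t   t1;
    pthread_t   t2;
    int         rc;

    event_log[0] = '\0';
    gate_init(&gate42);
    gate_init(&gate43);
    rc = pthread_create(&t2, NULL, session2, NULL);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, session1, NULL);
    assert(rc == 0);
    (void) rc;
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return event_log;
}
```

Whichever thread the scheduler starts first, the two gates force s2's
step to land between s1's two steps, which is the property the
advisory locks provide in the isolation spec.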

IMHO it would be great if there were a tidy and supported way to do
this kind of thing, perhaps with a formal notion of named wait points
which are compiled only into special test builds, and an optional set
of extra isolation specs that use them.
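
As a rough illustration of what "named wait points" might look like,
here is a small standalone sketch. Nothing in it exists in PostgreSQL:
ENABLE_WAIT_POINTS, WAIT_POINT, wait_point_hook, record_wait_point and
lock_tuple_demo are all made-up names. The idea is simply that the
macro expands to a hook call in a test build and to nothing at all in
a normal build, so production code pays no cost.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical build flag; a real build would set it at configure time. */
#ifndef ENABLE_WAIT_POINTS
#define ENABLE_WAIT_POINTS 1
#endif

/* Hypothetical hook that a test harness installs to block or observe. */
typedef void (*wait_point_hook_type) (const char *name);
static wait_point_hook_type wait_point_hook = NULL;

#if ENABLE_WAIT_POINTS
#define WAIT_POINT(name) \
    do { \
        if (wait_point_hook != NULL) \
            wait_point_hook(name); \
    } while (0)
#else
#define WAIT_POINT(name) ((void) 0)     /* compiled out entirely */
#endif

/* Example hook: records which wait point was reached, and how often. */
static int  wait_points_hit = 0;
static char last_wait_point[64] = "";

static void
record_wait_point(const char *name)
{
    assert(name != NULL);
    wait_points_hit++;
    strncpy(last_wait_point, name, sizeof(last_wait_point) - 1);
    last_wait_point[sizeof(last_wait_point) - 1] = '\0';
}

/* Stand-in for code that would go on to lock a tuple. */
static int
lock_tuple_demo(void)
{
    WAIT_POINT("before_heap_lock_tuple");
    return 1;
}
```

In a real implementation the hook would presumably block on something
like the advisory locks used in the hack above rather than just
recording the name; that part is left open here.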

>> > I attach some additional minor suggestions to your patch. Please feel
>> > free to reword comments differently if you think my wording isn't an
>> > improvements (or I've maked an english mistakes).
>>
>> Thanks, these are incorporated in the new version (also rebased).
>
> Great, thanks; I'll look at it again soon to commit, as I think we're
> done now.

Thanks!

Thomas Munro

Attachment Content-Type Size
test.spec text/x-rpm-spec 1.3 KB
