Re: Serializable snapshot isolation patch

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Kevin(dot)Grittner(at)wicourts(dot)gov
Subject: Re: Serializable snapshot isolation patch
Date: 2010-10-21 05:47:42
Message-ID: 1287640062.8516.598.camel@jdavis
Lists: pgsql-hackers

On Sun, 2010-10-17 at 22:53 -0700, Jeff Davis wrote:
> 2. I think there's a GiST bug (illustrating with PERIOD type):
>
> create table foo(p period);
> create index foo_idx on foo using gist (p);
> insert into foo select period(
> '2009-01-01'::timestamptz + g * '1 microsecond'::interval,
> '2009-01-01'::timestamptz + (g+1) * '1 microsecond'::interval)
> from generate_series(1,2000000) g;
>
> Session1:
> begin isolation level serializable;
> select * from foo where p && '[2009-01-01, 2009-01-01]'::period;
> insert into foo values('[2009-01-01, 2009-01-01]'::period);
>
> Session2:
> begin isolation level serializable;
> select * from foo where p && '[2009-01-01, 2009-01-01]'::period;
> insert into foo values('[2009-01-01, 2009-01-01]'::period);
> commit;
>
> Session1:
> commit;
>
> In pg_locks (didn't paste here due to formatting), it looks like the
> SIRead locks are holding locks on different pages. Can you clarify your
> design for GiST and the interaction with page-level locks? It looks like
> you're making some assumption about which pages will be visited when
> searching for conflicting values which doesn't hold true. However, that
> seems odd, because even if the value is actually inserted in one
> transaction, the other doesn't seem to find the conflict. Perhaps the
> bug is simpler than that? Or perhaps I have some kind of odd bug in
> PERIOD's gist implementation?
>
> Also, it appears to be non-deterministic, to a degree at least, so you
> may not observe the problem in the exact way that I do.
>

I have more information on this failure. Everything in GiST actually
looks fine. I modified the example slightly:

T1: begin isolation level serializable;
T2: begin isolation level serializable;
T1: select * from foo where p && '[2009-01-01, 2009-01-01]'::period;
T2: select * from foo where p && '[2009-01-01, 2009-01-01]'::period;
T1: insert into foo values('[2009-01-01, 2009-01-01]'::period);
T2: insert into foo values('[2009-01-01, 2009-01-01]'::period);
T2: commit;
T1: commit;

The SELECTs only look at the root and the predicate doesn't match. So
each SELECT sets an SIReadLock on block 0 and exits the search. Looks
good so far.

T1 then inserts, and it has to modify page 0, so it does
FlagRWConflict(). That sets writer->inConflict = reader and
reader->outConflict = writer (where writer is T1 and reader is T2); and
T1->outConflict and T2->inConflict remain NULL.

Then T2 inserts, and I didn't catch that part in as much detail in gdb,
but it apparently has no effect on that state, so we still have
T1->inConflict = T2, T1->outConflict = NULL, T2->inConflict = NULL, and
T2->outConflict = T1.
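
To make the pointer state above easier to follow, here is a toy model in C.
It is only a sketch: ToyXact and toy_flag_rw_conflict() are hypothetical
names invented for illustration, not the patch's real structures or its
actual FlagRWConflict() code, and it assumes each transaction tracks a single
inConflict and a single outConflict pointer, as the gdb output suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a serializable transaction (hypothetical, not the
 * structure used by the patch). */
typedef struct ToyXact {
    const char     *name;
    struct ToyXact *inConflict;   /* reader whose read our write conflicts with */
    struct ToyXact *outConflict;  /* writer that modified something we read */
} ToyXact;

/* Hypothetical model of the pointer updates described in this message:
 * the writer gains an in-conflict from the reader, and the reader gains
 * an out-conflict to the writer. Nothing else is touched. */
void toy_flag_rw_conflict(ToyXact *reader, ToyXact *writer)
{
    assert(reader != NULL && writer != NULL && reader != writer);
    writer->inConflict = reader;
    reader->outConflict = writer;
}
```

With T1 as writer and T2 as reader, calling toy_flag_rw_conflict(&t2, &t1)
on two freshly zeroed transactions leaves t1.inConflict == &t2 and
t2.outConflict == &t1, with t1.outConflict and t2.inConflict still NULL,
which is the state listed above.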

That looks like a reasonable state to me, but I'm not sure exactly what
the design calls for. I am guessing that the real problem is in
PreCommit_CheckForSerializationFailure(), where there are 6 conditions
that must be met for an error to be thrown. T2 falls out right away at
condition 1. T1 falls out on condition 4. I don't really understand
condition 4 at all -- can you explain it? And can you explain conditions
5 and 6 too?

Regards,
Jeff Davis
