Re: odd behavior: function not atomic/not seeing it's own

From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: odd behavior: function not atomic/not seeing it's own
Date: 2002-12-17 18:32:32
Message-ID: 1040149953.25972.289.camel@camel
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, 2002-12-16 at 18:53, Tom Lane wrote:
> Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
> > "this is broken on so many levels..."
>
> The issue here is that the second transaction has already set its
> snapshot when it enters the function body; so even though its LOCK
> command faithfully waits for the first transaction to commit, the
> second transaction can't see (and so can't update) the two rows
> inserted by the first transaction. It doesn't really see the update
> of the original row either, except for purposes of its own UPDATE.
>

I understand that, but what I don't understand is why does the first
function call see it's changes (as evidence by the different information
in the raise notices -- 0,10) but the second query is unable to see
either the first functions changes or its own? (raise notice returns
0,0) The fact that functions behave differently when run concurrently
seems dangerous to me. Consider rewriting the test case function as
this:

CREATE OR REPLACE FUNCTION locktest1(integer) RETURNS INTEGER AS '
DECLARE
x INTEGER :=1;
y ALIAS FOR $1;
foo INTEGER;
BEGIN
LOCK TABLE locktest IN EXCLUSIVE MODE;
INSERT INTO locktest(a) VALUES (y);
LOOP
x:= x + 1;
EXIT WHEN x >= 1000000;
END LOOP;
INSERT INTO locktest(a) VALUES (y+1);
SELECT a FROM locktest ORDER BY c ASC LIMIT 1 INTO foo;
RAISE NOTICE \'original entry in column a is %\', foo;
UPDATE locktest SET a=a+y;
SELECT a FROM locktest ORDER BY c ASC LIMIT 1 INTO foo;

IF foo > 0 THEN
UPDATE locktest SET a=a+100;
END IF;

RAISE NOTICE \'original entry in column a is now %\', foo;
return y;
END;
' language 'plpgsql';

the check for foo returns true in the first function, returns false in
the second function, which will cause the functions to behave
differently given the same logic. This seems very dangerous to me.

> It would work more like you're expecting if you'd issued the LOCK
> before calling the function. I realize that's not convenient in
> many cases.

inconvenient to say the least. one of the major reason to use functions
is the ability to capture business logic within the database so that it
can be used by multiple apps. Given the need to issue locks in the
application code, this functionality takes a major hit. (Your left
hoping the application programmers put locks in all the right places,
otherwise your data may get corrupted)

>
> > OTOH if functions ran as if they we're in serializable mode, the second
> > function would, upon attempt to update the first record, see that the
> > record was already updated, and throw a "ERROR: Can't serialize access
> > due to concurrent update", which could then be dealt with accordingly.
>
> It would do that, if you had run it in serializable mode. I get:
>
> regression=# begin;
> BEGIN
> regression=# set TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
> SET
> regression=# select locktest1(20);
> NOTICE: original entry in column a is 0
> WARNING: Error occurred while executing PL/pgSQL function locktest1
> WARNING: line 15 at SQL statement
> ERROR: Can't serialize access due to concurrent update
> regression=#
>

true. however again, this work around requires code in the application,
since there is no way to set the transaction level within a function.
what we need is self-contained solution for functions.

>
> In the read-committed case, there has been some talk of executing
> SetQuerySnapshot between statements of a function; search the archives
> for "SetQuerySnapshot" to find past threads. I'm leaning in favor
> of that myself, but no one's yet given a convincing analysis of what
> this would change and what the downside might be.

I've done a bit of searching on this and I don't see any cases where the
current way setQuerySnapshot works is improving the situation. My
feeling is it seems there might be cases where functions would not work
atomically if they always saw what other peoples changes were doing.

Imagine a function that selects a value, does some computation, then
selects again. If that first value changes in the middle of the
function, it might throw off what is supposed to happen. This is a
result of the fact that some functions should work instantaneously, but
can't always do so.

>
> (A compromise less likely to break existing code might be to do
> SetQuerySnapshot only after a function issues LOCK, but still the real
> effects of this would need to be clearly understood before such a
> proposal is likely to pass.)
>

I've thought about this too, and I think it is a decent compromise. If
there is one case where your really likely to want an updated
QuerySnapshot it's when you grab a lock on a table, after all, why would
you want to lock data that you can't see? I can't think of a case where
this would break things given that right now we are blindly acquiring
locks; how can actually knowing what you've locked cause you problems?
Furthermore, this would help ensure that further queries inside the
function will be looking at a more consistent snapshot. I would lobby
that this should go in 7.3.1 as it at least gives people a viable, self
contained work around.

I think ideally what would be good would be the ability to create
functions as inherently serializable or read committed (perhaps though a
parameter passed in at function creation like iscacheable). This would
force the function to behave in one of the two modes as documented for
transaction types, and update the query snapshots accordingly. I think
the default would be serializable in order to preserve atomicity.

One last thing, and correct me if I'm wrong, but any fix would need to
happen for all of the function languages, not just plpgsql, right?

Robert Treat

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Bruno Wolff III 2002-12-18 15:14:18 COPY problem with bad dates
Previous Message bullshark 2002-12-17 18:14:40 int4out(lo) does not exist (create type)