Re: Status of issue 4593

From: "Lee McKeeman" <lmckeeman(at)opushealthcare(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Status of issue 4593
Date: 2009-01-05 21:25:38
Message-ID: CB20429AE660CB43A946BF9D61C9A2B60D5865@ohsmail.opushealthcare.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Tom,

We don't actually select * without a where clause in our actual use
case, I just wrote as concise a test case as I thought I could to
demonstrate the behavior. We have a where clause that limits the rows
that are locked (otherwise we could just do a table lock rather than
using row-level locking). In our actual case, the order by uses a
function that generates ordinality based on some external values. If we
were to use your function suggestion, it could accept two arguments that
would be used in the where clause, and the select ... for update in the
function would not need to do any ordering (there is no limit involved
here), then when selecting from the function the order by could be
applied (I think). It's something we could work with.

Right now we are selecting twice to work around this, because once we
have the rows locked in our transaction, the order (as far as I can
imagine) should not change after the lock is acquired. This is somewhat
inefficient, but the where clause uses two indexed columns, so it isn't
terrible. The initial select ... for update at this point is not using
an order, so it is a bit faster, then the second does use the order and
should always be correct since the rows are locked.

Based on your explanation I understand the reason for this behavior
which was what I and my colleagues had guessed, but you didn't assert
that this is expected/correct behavior. Is it?

Ultimately, if we were to use your function suggestion, we would have
something like:

create function mytablefunction(integer,integer) returns setof mytable
as
$$ select * from mytable where col1 > $1 and col2 = $2 for update $$
language sql;

select * from mytablefunction(10,1000) order by sortfunc(col3);

I don't know if, in the select, a column from the resulting rowset can
be used in a function in the order by clause. I've never tried it
before, but don't see why it wouldn't work.

This seems workable, and wouldn't add a terrible amount of complexity.

Also, it is certainly valid to do the sort in our app, but we had simply
come to depend on ORDER BY, and built up the necessary infrastructure in
the database(functions, etc.) to do all the ordering there. We certainly
could pull the sorting into the app, it would just be much uglier than
using the database.

Thanks,
-Lee

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Monday, January 05, 2009 2:42 PM
To: Lee McKeeman
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: [BUGS] Status of issue 4593

"Lee McKeeman" <lmckeeman(at)opushealthcare(dot)com> writes:
> Description: order by is not honored after select ... for
update

The reason for this behavior is that SELECT FOR UPDATE substitutes the
latest version of the row at the time the row lock is acquired, which is
the very last step after the selection and ordering have been done.
In your example, what the ORDER BY sees is 20/25/30/300, and it sorts on
that basis, and then the 20/1 row is discovered not to be live anymore
so the 40/1 row is locked and substituted.

The only way to avoid this would be to lock before the sort, which could
have the effect of locking more rows than are returned (if you also use
LIMIT); or to repeat the sort operation after locking the rows, which I
doubt anyone is going to want it to do. I suggest sorting on the client
side if you really need this to work in this particular way.

[ thinks for awhile... ] Actually you could make it work entirely on
the server if you were willing to interpose a SQL function, along the
lines of

create function foo () returns setof test as
$$ select * from test order by value for update $$
language sql;

select * from foo() order by value;

which would accomplish the desired result of having two levels of sort.
(You might or might not need the ORDER BY inside the function --- does
your real case use ORDER BY/LIMIT, or does it really lock every row
of the table?)

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Jeff Davis 2009-01-05 22:58:18 Re: Status of issue 4593
Previous Message Tom Lane 2009-01-05 20:42:01 Re: Status of issue 4593

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-01-05 21:31:04 Re: EmitWarningsOnPlaceholders is too quiet
Previous Message Joshua D. Drake 2009-01-05 21:02:16 Re: QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)