I've got the following simple wish: Please tell me how to program a
standard web application that allows users to update tuples without lost
updates.
My guess is that this cannot be done easily right now, but that it
could be done very easily with just a slight API (libpq) extension
that allows tying a user/browser session to a database
transaction. The remainder of this mail explains my view of the
situation in detail. If I'm missing something fundamental,
please let me know...
I'd like to have a web application (PHP in my case) that starts a
transaction to display some tuples in a browser. (This happens in
one script in an isolated transaction/connection.)
The user should then be able to update some values of a chosen tuple
and write the updated tuple back into the database. (This happens
in a second script, which - due to the stateless nature of HTTP -
has to open a second database connection, so there is no link
between the database operations of script one and script two.)
I'd like to see this application under transaction control to avoid
race conditions and, in particular, the following kind of lost
update (which, according to Murphy, *will* happen once two users
work concurrently):
Two users select the same tuple t at about the same point in time,
user u1 updates t into t1 and writes it back, while user u2 updates
t into t2 and writes that version back, just a little after t1 has
been written. This way, the update of user u1 gets lost.
In my view the above scenario is that of a standard multi-user web
application. However, I couldn't find a simple solution for programming
transactions that span multiple scripts/HTTP requests.
(PHP persistent connections don't help, see, e.g., here:
So, my first question: Is there a simple solution to program
transactions that span multiple scripts?
I'm aware of the following "heavyweight" solutions:
1. Do some kind of explicit user-level locking. In my view this is
redundant work and the wrong approach, as databases already support
transactions.
2. Program a proxy process that actually handles the database
connections (instead of the HTTP server) and that allows a
transaction to be suspended at the end of one script and resumed at
the beginning of another. E.g., SQL Relay can do this:
So, my second question: What do the PostgreSQL maintainers think
about enriching the API to make it usable in a web environment,
where stateless HTTP servers stand between (and in the way of)
stateful applications and database servers?
This could be done with just two additional API calls, e.g.,
int PQsuspendConn(PGconn *conn) and
PGconn *PQresumeConn(const char *conninfo)
Here, I assume that PQsuspendConn gets called at the end of one
script, which tells the database that this connection and its
associated transaction will be idle for some time and returns an
identifier, say x. Later on, in some following script the database
connection is opened via PQresumeConn, passing the result of
PQsuspendConn as an additional parameter, say "transactionID = x",
which tells the database to continue the previous transaction on the
new connection. This way, one could tie an HTTP session to a single
transaction. (Function PQresumeConn could even be avoided by adding
the new parameter transactionID to PQconnectdb.)
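To make the intended control flow concrete, here is how the two
scripts might use the proposed calls. Everything below is
hypothetical: PQsuspendConn and PQresumeConn do not exist in libpq,
the transactionID conninfo parameter is just the one suggested above,
and the database/table names are made up (error handling omitted):

```c
/* Script 1: read phase (hypothetical API -- does not exist in libpq). */
PGconn *conn = PQconnectdb("dbname=shop");
PQexec(conn, "BEGIN");
PQexec(conn, "SELECT * FROM item WHERE id = 42 FOR UPDATE");
int x = PQsuspendConn(conn);   /* transaction stays open but idle;
                                  x identifies it for later resumption */
/* ... store x in the HTTP session and send the page to the browser ... */

/* Script 2: write phase, in a new process, possibly much later. */
char conninfo[64];
snprintf(conninfo, sizeof conninfo, "dbname=shop transactionID=%d", x);
PGconn *conn2 = PQresumeConn(conninfo);  /* continue transaction x */
PQexec(conn2, "UPDATE item SET price = 9.90 WHERE id = 42");
PQexec(conn2, "COMMIT");
PQfinish(conn2);
```

The row lock taken in script 1 would then be held across the user's
"thinking time", which is what remark 1 below is about.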
What do you think?
Two remarks:
1. I'm aware that a suspended transaction could block other
transactions indefinitely. To get around that, timeout mechanisms
could be used.
2. There would be exactly one database connection per user session.
Since the vast majority of user connections are likely to be
suspended at any point in time (they are only active while a script
is being executed, not while the user is thinking), suspended
connections have to be handled "efficiently".