Re: Lost updates vs resumable connections/transactions

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Jens Lechtenbörger <lechtej(at)uni-muenster(dot)de>
Cc: pgsql-interfaces(at)postgresql(dot)org
Subject: Re: Lost updates vs resumable connections/transactions
Date: 2004-12-15 16:42:09
Message-ID: 41C06961.8020406@Yahoo.com
Lists: pgsql-interfaces

On 12/9/2004 10:43 AM, Jens Lechtenbörger wrote:

> Dear reader,
>
> I've got the following simple wish: Please tell me how to program a
> standard web application that lets users update tuples without lost
> updates.
>
> My guess is that this cannot be done easily right now, but that this
> could be done very easily with just a slight API (libpq) extension,
> which would allow a user/browser session to be tied to a database
> transaction. The remainder of this mail explains my view of the
> situation in detail. If I'm missing something fundamental then
> please let me know...
>
> I'd like to have a web application (PHP in my case) that starts a
> transaction to display some tuples in a browser. (This happens in
> one script in an isolated transaction/connection.)
> The user should then be able to update some values of a chosen tuple
> and write the updated tuple back into the database. (This happens
> in a second script, which - due to the stateless nature of HTTP -
> has to open a second database connection, so there is no link
> between the database operations of script one and script two.)
>
> I'd like to see this application under transaction control to avoid
> race conditions and, in particular, the following kind of lost
> update (which, according to Murphy, *will* happen with two
> independent transactions/connections/scripts):
> Two users select the same tuple t at about the same point in time,
> user u1 updates t into t1 and writes it back, while user u2 updates
> t into t2 and writes that version back, just a little after t1 has
> been written. This way, the update of user u1 gets lost.
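
For illustration, the race looks roughly like this when driven through
libpq; two separate connections stand in for the two users, and the
items/price names plus the connection string are made up:

/*
 * Illustration only: two independent connections read the same row and
 * then both write back a full new version; the second UPDATE silently
 * overwrites the first.  The items/price names and the connection
 * string are made up.
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *u1 = PQconnectdb("dbname=test");   /* user 1, script 1 */
    PGconn *u2 = PQconnectdb("dbname=test");   /* user 2, script 1 */

    if (PQstatus(u1) != CONNECTION_OK || PQstatus(u2) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed\n");
        return EXIT_FAILURE;
    }

    /* Both users display the same tuple t ... */
    PQclear(PQexec(u1, "SELECT price FROM items WHERE id = 1"));
    PQclear(PQexec(u2, "SELECT price FROM items WHERE id = 1"));

    /* ... think for a while, then each writes back his own version. */
    PQclear(PQexec(u1, "UPDATE items SET price = 10 WHERE id = 1"));  /* t1 */
    PQclear(PQexec(u2, "UPDATE items SET price = 20 WHERE id = 1"));  /* t2 */

    /* The row now holds t2; u1's update is lost without any error. */
    PQfinish(u1);
    PQfinish(u2);
    return EXIT_SUCCESS;
}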
>
> In my view the above scenario is that of a standard multi-user web
> application. However, I couldn't find a simple solution to program
> transactions spanning multiple scripts/HTTP requests.
> (PHP persistent connections don't help, see, e.g., here:
> http://www-fr.mysql.com/news-and-events/newsletter/2002-11/a0000000086.html
> )
>
> So, my first question: Is there a simple solution to program
> transactions that span multiple scripts?
> I'm aware of the following "heavyweight" solution:
> 1. Do some kind of explicit user-level locking. In my view, this is
> redundant work and the wrong way, as databases already support
> transactions.

You underestimate the impact of your "solution". Not only does it need
lock timeouts, you also get scaling problems. Your approach assumes that
each and every browser session will have its own, unique database session.

The same reasoning that lets your Apache web server handle 10000
simultaneous sessions with just 100 or so server processes applies many
times over to a database. 10000 DB sessions means 10000 PostgreSQL
backends, of which probably only 100 or so are active at any given time,
so there will be a lot of swapping going on unless you put some
ridiculous amount of RAM in place just to counter this design mistake.

Even applications that have stateful end-user terminals (SAP R/3, for
example) never hold a transaction open across user interaction. They all
use the "heavy" way of application-level advisory locks (held in
database tables, of course). In all these systems, an idle/thinking user
session consumes no operating system resources beyond a socket and some
memory.
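
A minimal sketch of that pattern with libpq, assuming a hypothetical
app_locks table (item_id with a unique constraint, web_session,
locked_at) and a made-up 15-minute timeout; none of these names come
from this thread:

/*
 * Sketch only: take an application-level lock by inserting a row into
 * a hypothetical app_locks table.  Each HTTP request runs this in its
 * own short transaction; nothing stays open while the user thinks.
 * Stale locks are reaped against a made-up 15-minute timeout, and a
 * unique constraint on item_id is assumed to arbitrate between two
 * concurrent lockers.
 */
#include <stdio.h>
#include <libpq-fe.h>

static int acquire_lock(PGconn *conn, const char *item_id, const char *web_session)
{
    const char *params[2] = { item_id, web_session };
    PGresult   *res;

    PQclear(PQexec(conn, "BEGIN"));

    /* Throw out an expired lock so a vanished browser cannot block forever. */
    res = PQexecParams(conn,
        "DELETE FROM app_locks"
        " WHERE item_id = $1 AND locked_at < now() - interval '15 minutes'",
        1, NULL, params, NULL, NULL, 0);
    PQclear(res);

    /* Try to take the lock for this web session. */
    res = PQexecParams(conn,
        "INSERT INTO app_locks (item_id, web_session, locked_at)"
        " VALUES ($1, $2, now())",
        2, NULL, params, NULL, NULL, 0);

    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        /* Somebody else holds the lock -- tell the user to come back later. */
        fprintf(stderr, "lock busy: %s", PQerrorMessage(conn));
        PQclear(res);
        PQclear(PQexec(conn, "ROLLBACK"));
        return 0;
    }

    PQclear(res);
    PQclear(PQexec(conn, "COMMIT"));
    return 1;
}

The script that renders the edit form would call this first; the script
that receives the form applies its UPDATE only if the lock row still
carries its own web_session value, then deletes the row. No connection
or transaction stays open across the user's think time.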

> 2. Program a proxy process that actually handles the database
> connections (instead of the HTTP server) and that allows a transaction
> to be suspended at the end of one script and resumed at the
> beginning of another script. E.g. SQL Relay can do this:
> http://sqlrelay.sourceforge.net/
>
> So, my second question: What do the PostgreSQL maintainers think
> about enriching the API to make it usable in a web environment,
> where stateless HTTP servers stand between (and in the way of)
> stateful applications and database servers?
> This could be done with just two additional API calls, e.g.,
> for libpq:
> int PQsuspendConn(PGconn *conn) and
> PGconn *PQresumeConn(const char *conninfo)

Since in the Apache case it isn't even guaranteed that the next request
in a session will be served by the same web server process, how exactly
do you intend to park and later hand over the open socket for that
connection? And please, in a portable way ;-)
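
For completeness: on a single POSIX host one *can* hand a descriptor
from one process to another over a Unix-domain socket with SCM_RIGHTS,
roughly as sketched below, but that is exactly the kind of
platform-specific, same-machine-only trick that doesn't answer the
portability question.

/*
 * Sketch, not a proposal: handing an open descriptor to another process
 * on the SAME host via SCM_RIGHTS over a Unix-domain socket.  This is
 * Unix-specific, and it is useless once the next request can land on a
 * different machine.
 */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int send_fd(int unix_sock, int fd_to_pass)
{
    char            dummy = 'x';
    struct iovec    iov;
    struct msghdr   msg;
    struct cmsghdr *cmsg;
    union {
        struct cmsghdr  align;
        char            buf[CMSG_SPACE(sizeof(int))];
    } u;

    iov.iov_base = &dummy;          /* must send at least one real byte */
    iov.iov_len  = 1;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov        = &iov;
    msg.msg_iovlen     = 1;
    msg.msg_control    = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type  = SCM_RIGHTS;
    cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

    return sendmsg(unix_sock, &msg, 0) < 0 ? -1 : 0;
}

And even then the raw socket is only half of it; whatever per-connection
state libpq keeps in the PGconn would have to be rebuilt around the
descriptor on the other side.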

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
