Re: Suspending SELECTs

From: Alessandro Baretta <a(dot)baretta(at)barettadeit(dot)com>
To: August Zajonc <augustz(at)augustz(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Suspending SELECTs
Date: 2006-01-23 10:31:23
Message-ID: 43D4B07B.5090200@barettadeit.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

August Zajonc wrote:
> Alessandro Baretta wrote:
>
> Alessandro,
>
> I've very much enjoyed reading your thoughts and the problem your facing
> and everyone's responses.

Thank you for your interest, Agust.

> Since you control the middle layer, could you not use a cookie to keep a
> cursor open on the middle layer and tie into it on subsequent queries?

I do. The AS/Xcaml calls it "session key". It is usually passed in a cookie for
websites and elsewhere--query string or url--for Intranet/Extranet web
applications. The session information kept by the AS/Xcaml virtual machine
includes cached query results and state information for servlets. I
could--although not terribly easily--also use the Xcaml session manager to
handle allocation of DB connections from the pool, thus allocating one
connection per active session. The session garbage collector would then also
have to manage the recycling of stale DB connections.

> If you are concerned with too many connections open, you could timeout
> the sessions quickly and recreate the cursor if someone came back. If
> they waited 5 minutes to make the next query, certainly they could wait
> a few extra seconds to offset and reacquire a cursor?

Yes I could. Actually, there are quite a few means of handling this issue. The
possible strategies for garbage collecting resources allocated to a remote peer
is are collectively called "failure-detection algorithms" in the theory of
distributed computing. In most cases an "eventually weak failure detector" is
necessary and sufficient to guarantee a number of desirable properties in
asynchronous systems: termination of execution, bounded open connections, and
others.

Yet, it is a theorm that no asynchronous algorithm can be devised to implement
an eventually weak failure detector. This, in turn, implies that no distributed
asynchronous system--i.e. a web application--possesses the above mentioned
desirable properties. Hence, from a theoretical standpoint, we can either choose
to relax the axioms of the system allowing synchronicity--a failure detector
based on a timeout explicitly requires the notion of time--or, as I would
prefer, by eliminating the need for termination of execution--i.e. explicit
client logout--and bounded open connections by delegating to the client the
responsibility of maintaing all relevant state information. Under these
assumptions we can live happily in a perfectly asynchronous stateless world like
that of HTTP.

Now, neither of the two solutions is perfect. In an Intranet/Extranet context, I
want to store server side a significant amount of state information, including
cached query results, thus entailing the need for a synchronous
failure-detector--let's call it "implicit logout detector"--to garbage collect
the stale session files generated by the application. In an open website--no
login--I do not usually want to use sessions, so I prefer to implement the
application so that all relevant state informatoin is maintained by the client.
This model is perfect until we reach the issue of "suspending SELECTs", that is,
limiting the the cardinality of the record set to a configurable "page-size",
allowing the user to page through a vast query result.

> The hitlist idea was also a nice one if the size of the data returned is
> not overwhelming and does not need to track the underlying db at all
> (ie, no refresh).

In an open website, immediate refresh is not critical, so long as I can
guarantee some decent property of data consistency. Full consistency cannot be
achieved, as Tom pointed out. I cordially disagree with Tom on the commandment
that "Thou shalt have no property of consistency other than me". Just like we
have two different transaction isolation levels, guarateeing different degrees
of ACIDity, we could, conceivably wish to formalize a weaker notion of
consistency and implement functionality to match with it, which would not be
possible under the stronger definition property.

> Mark had a number of good general suggestions though, and I'd like to
> echo the materialized views as an option that I could see a lot of uses
> for (and have worked around in the past with SELECT INTO's and like).

I already use materialized views. The database design layer of the AS/Xcaml
allows the definition of fragmented materialized views: the view is split in
fragments, that is, equivalence classes of the record set with respect to the
operation of projection of the view signature to a (small) subset of its
columns. Yet, this actually makes the original problem worse, for materialiazed
view fragments must be garbage collected at some point, thus offering much of
the same conceptual difficulties as cursor pooling strategy.

Alex

--
*********************************************************************
http://www.barettadeit.com/
Baretta DE&IT
A division of Baretta SRL

tel. +39 02 370 111 55
fax. +39 02 370 111 54

Our technology:

The Application System/Xcaml (AS/Xcaml)
<http://www.asxcaml.org/>

The FreerP Project
<http://www.freerp.org/>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Franklin Haut 2006-01-23 10:46:27 ENC: RES: pg_dump slow - Solution
Previous Message Marcos 2006-01-23 08:14:14 Re: [PERFORMANCE] Stored Procedures