Re: User-facing aspects of serializable transactions

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "<Markus Wanner" <markus(at)bluegap(dot)ch>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: User-facing aspects of serializable transactions
Date: 2009-06-02 13:44:20
Message-ID: 4A24E664.EE98.0025.1@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:

> On Tue, Jun 2, 2009 at 1:13 AM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> Greg Stark <stark(at)enterprisedb(dot)com> wrote:
>>
>>> Just as carefully written SQL code can be written to avoid
deadlocks
>>> I would expect to be able to look at SQL code and know it's safe
>>> from serialization failures, or at least know where they might
>>> occur.
>>
>> This is the crux of our disagreement, I guess. I consider existing
>> techniques fine for situations where that's possible.
>
> a) When is that possible? Afaict it's always possible, you can never
> know and when it might happen could change any time.

Sorry that I wasn't more clear -- I meant "I consider existing
techniques fine where it's possible to look at all the SQL code and
know what's safe from serialization failures or at least know where
they might occur". I don't believe that's possible in an environment
with 8,700 queries in the application software, under constant
modification, with ad hoc queries run every day.

> b) What existing techniques, explicit locking?

Whichever techniques you would use right now, today, in PostgreSQL
which you feel are adequate to your needs. You pick.

>> But, could you
>> give me an estimate of how much time it would take you, up front
and
>> ongoing, to do that review in our environment? About 8,700 queries
>> undergoing frequent modification, by 21 programmers, for
enhancements
>> in our three-month release cycle. Plus various ad hoc queries. We
>> have one full-time person to run ad hoc data fixes and reports
>> requested by the legislature and various outside agencies, like
>> universities doing research.
>
> Even in your environment I could easily imagine, say, a monthly job
to
> delete all records older than 3 months. That job could take hours or
> even days. It would be pretty awful for it to end up needing to be
> retried. All I'm saying is that if you establish a policy -- perhaps
> enforced using views -- that no queries are allowed to access
records
> older than 3 months you shouldn't have to worry that you'll get a
> spurious serialization failure working with those records.

You have totally lost me. We have next to nothing which can be
deleted after three months. We have next to nothing which we get to
decide is deletable. The elected Clerk of Court in each county is the
custodian of the records for that county, we facilitate their
record-keeping. Some counties back-loaded data for some case types
(for example, probate) back to the beginning, in the mid-1800s, and
that information is not likely to go away any time soon. Since
they've been using the software for about 20 years now, enough cases
are purgeable under Supreme Court records retention rules that we're
just now getting around to writing purge functions, but you don't even
*want* to know how complex the rules around that are....

The three month cycle I mentioned was how often we issue a major
release of the application software. Such a release generally
involves a lot of schema changes, and changes to hundreds of queries,
but no deletion of data.

-Kevin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeremy Kerr 2009-06-02 13:52:01 Re: [RFC,PATCH] SIGPIPE masking in local socket connections
Previous Message Simon Riggs 2009-06-02 13:41:00 Re: explain analyze rows=%.0f