Re: temporal support patch

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: temporal support patch
Date: 2012-08-21 20:51:57
Message-ID: 5033F4ED.9030900@archidevsys.co.nz
Lists: pgsql-hackers

On 22/08/12 02:16, Kevin Grittner wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> On Mon, 2012-08-20 at 19:32 -0500, Kevin Grittner wrote:
>>> Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>>
>>>> This is sounding like a completely runaway spec on what should
>>>> be a simple feature.
>>>
>>> I hate to contribute to scope creep (or in this case scope
>>> screaming down the tracks at full steam), but I've been watching
>>> this with a queasy feeling about interaction with Serializable
>>> Snapshot Isolation (SSI).
>> There are all kinds of challenges here, and I'm glad you're
>> thinking about them. I alluded to some problems here:
>>
>>
> http://archives.postgresql.org/message-id/1345415312.20987.56.camel@jdavis
>> But those might be a subset of the problems you're talking about.
>>
>> It sounds like, at a high level, there are two problems:
>>
>> 1. capturing the apparent order of execution in the audit log
>> 2. assigning meaningful times to the changes that are consistent
>> with the apparent order of execution
>
> As far as I can see, transactions which execute DML at any
> transaction isolation level other than serializable can be
> considered to have occurred in commit order. Transactions which
> don't write to the database don't need to be considered as part of
> the history, at least in terms of viewing prior state. Same with
> transactions which roll back. (Now, failed transactions and reads
> might be of interest for some audit reports, but that seems to me
> like a different issue than a temporal database.)
>
> The funny bit is for a serializable transaction (TN) which commits
> after writing to the database -- you can't know the apparent order
> of execution as long as there are any serializable transactions
> active which can't see the work of TN (i.e., the transactions
> overlap). If such a transaction (TX) executes a read which
> conflicts with a TN write, TX appears to have executed first, since
> it doesn't see the work of TN, so I think the sequence number or
> timestamp for TN has to follow that for TX even though TN committed
> first. On the other hand, TX might write something that conflicts
> with a TN read, in which case TN will appear to have executed first
> and must get a sequence number or timestamp before TX.
> If there is a cycle, SSI will cancel one of the transactions
> involved, so that can't occur anywhere in the timeline.
>
> So, if you want to allow serializable temporal queries, the timing
> of a read-write serializable transaction can't be locked down until
> all overlapping read-write serializable transactions complete; and
> the apparent order of execution must be based on read-write
> conflicts, which are tracked within SSI. I think that if we can
> generate a list of committed transactions in order based on this
> logic, it could feed into replication systems -- hot standby as well
> as trigger-based systems. I think we could generate snapshots which
> exclude the transactions for which the order of execution has not
> yet been determined, and avoid the delays involved in other possible
> solutions.
>
> There's a lot of detail missing here in terms of what the API would
> be, and how we handle the summarization that can occur within SSI so
> that it can continue to function within bounded memory even in
> pessimal circumstances, but that's the general outline of my
> concerns and suggested solution.
>
> -Kevin
>
>
So if I understand correctly...

If there is a very long-running transaction, say 1 hour, then all (or
just some? - depending) transactions that nominally start and finish
within that time cannot have definitive timestamps until the very
long-running transaction finishes, even if they commit successfully?
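
To make Kevin's TN/TX scenario concrete, here is a minimal sketch (the
accounts table and these particular statements are mine, purely for
illustration):

  -- hypothetical setup, for illustration only
  CREATE TABLE accounts (id int PRIMARY KEY, balance int);
  INSERT INTO accounts VALUES (1, 100), (2, 200);

  -- Session 1 (TX):
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  SELECT sum(balance) FROM accounts;   -- does not see TN's update

  -- Session 2 (TN), concurrently:
  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
  UPDATE accounts SET balance = balance + 50 WHERE id = 1;
  COMMIT;                              -- TN commits first

  -- Session 1 (TX), continuing:
  UPDATE accounts SET balance = balance - 10 WHERE id = 2;
  COMMIT;  -- TX commits second, yet apparently executed first:
           -- its read conflicts with TN's write, so TN's place
           -- in the temporal history must follow TX's.

Both transactions commit (there is no cycle), but until TX finishes,
TN's position in the apparent order of execution cannot be fixed.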

So if someone looks at the audit log, they might not see all the
transactions they expect to see.

So, if I had an automatic query A which updated statistics based on
transactions committed over the last 10 minutes, then many (all?)
transactions starting and successfully completing during the lifetime
of the very long-running transaction would never show up in its
results! Here I am envisioning a query fired off every ten minutes,
looking for audit records with timestamps within the previous ten
minutes; by the time the delayed records receive their timestamps, the
window those timestamps fall into has already been scanned. However,
if I ran a query B looking at the number of audit records in each
10-minute interval over a week, but kicked off 24 hours after the week
finished -- then I would see the records that query A missed.
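
Sketching that with a hypothetical audit_log(txid, committed_at)
table -- no such schema exists yet, this is purely illustrative:

  -- Query A, run every ten minutes.  Rows whose timestamps are only
  -- assigned after the long-running transaction completes fall into
  -- a window this query has already scanned, so it never sees them:
  SELECT *
    FROM audit_log
   WHERE committed_at >= now() - interval '10 minutes';

  -- Query B, run 24 hours after the week ends.  By now all the
  -- overlapping transactions have finished and every timestamp is
  -- final, so the late rows appear in their 10-minute buckets:
  SELECT to_timestamp(floor(extract(epoch FROM committed_at) / 600)
                      * 600) AS bucket_start,
         count(*)
    FROM audit_log
   WHERE committed_at >= now() - interval '8 days'
     AND committed_at <  now() - interval '1 day'
   GROUP BY 1
   ORDER BY 1;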

Hmm... if I am at all right, then it would probably be best to have a
suitably worded 'government health warning' prominently placed in the
documentation!

Cheers,
Gavin
