Re: Commits 8de72b and 5457a1 (COPY FREEZE)

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Commits 8de72b and 5457a1 (COPY FREEZE)
Date: 2012-12-04 20:38:00
Message-ID: 1354653480.4530.47.camel@sussancws0025
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2012-12-04 at 10:15 +0000, Simon Riggs wrote:
> On 4 December 2012 01:34, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> >
> I assume that refers to the discussion here:
> >
> > http://archives.postgresql.org/pgsql-hackers/2012-02/msg01177.php
> >
> > But that was quite a while ago, so is there a more recent discussion
> > that prompted this commit now?
>
> Yes, this was discussed within the last month, thread "TRUNCATE SERIALIZABLE..."
>
> The patch for that was already posted, so I committed it.

Thank you for pointing me toward that thread.

While experimenting with COPY FREEZE, I noticed something:

The simple case of BEGIN; CREATE TABLE ...; COPY ... WITH (FREEZE);
doesn't meet the pre-conditions. It only meets the conditions if
preceded by a TRUNCATE, which all of the tests do. I looked into it, and
I think the test:

... &&
cstate->rel->rd_newRelfilenodeSubid == GetCurrentSubTransactionId()

should be:

... &&
(cstate->rel->rd_newRelfilenodeSubid == GetCurrentSubTransactionId() ||
cstate->rel->rd_createSubid == GetCurrentSubTransactionId())

(haven't tested). Another option to consider is that
rd_newRelfilenodeSubid could be set on newly-created tables as well as
newly-truncated tables.

Also, I recommend a hint along with the NOTICE when the pre-conditions
aren't met to tell the user what they need to do. Perhaps something
like:

"Create or truncate the table in the same transaction as COPY
FREEZE."

The documentation could expand on that, clarifying that a TRUNCATE in
the same transaction (perhaps even ALTER?) allows a COPY FREEZE.

The code comment could be improved a little, while we're at it:

"Note that the stronger test of exactly which subtransaction created
it is crucial for correctness of this optimisation."

is a little vague, can you explain using the reasoning from the thread?
I would say something like:

"The transaction and subtransaction creating or truncating the table
must match that of the COPY FREEZE. Otherwise, it could mix tuples from
different transactions together, and it would be impossible to
distinguish them for the purposes of atomicity."

> > I am a little confused about the case where setting HEAP_XMIN_COMMITTED
> > when loading the table in the same transaction is wrong. There was some
> > discussion about subtransactions, but those problems only seemed to
> > appear when the CREATE and the INSERT/COPY happened in different
> > subtransactions.
> >
> > So, I guess my question is, why the partial revert?
>
> Well, first, because I realised that it wasn't wanted. I was
> re-reading the threads trying to figure out a way to help the checksum
> patch further.
>
> Second, because I realised why it wasn't wanted. Setting
> HEAP_XMIN_COMMITTED causes MVCC violations within the transaction
> issuing the COPY.

After reading that thread, I still don't understand why it's unsafe to
set HEAP_XMIN_COMMITTED in those conditions. Even if it is, I would
think that a sufficiently narrow case -- such as CTAS outside of a
transaction block -- would be safe, along with some slightly broader
cases (like BEGIN; CREATE TABLE; INSERT/COPY).

But perhaps that requires more discussion. I was just curious if there
was a concrete problem that I was missing.

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-12-04 20:38:06 Re: WIP: store additional info in GIN index
Previous Message Alvaro Herrera 2012-12-04 20:35:24 Re: WIP: store additional info in GIN index