Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, amul sul <sulamul(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
Date: 2018-03-28 17:52:24
Message-ID: 1898.1522259544@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2018-03-08 13:46:53 -0500, Tom Lane wrote:
>> ... Breaking fundamental invariants like
>> "ctid points to this tuple or its update successor" is going to cause
>> trouble. There's a lot of code that knows that; more than knows the
>> details of what's in xmax, I believe.

> Given, as explained nearby, we already do store transient data in the
> ctid for speculative insertions (i.e. ON CONFLICT), and it hasn't caused
> even a whiff of trouble, I'm currently not inclined to see a huge issue
> here. It'd be great if you could expand on your concerns here a bit, we
> gotta figure out a way forward.

Just what I said. There's a lot of code that knows how to follow tuple
update chains, probably not all of it in core, and this will break it.
But only in seldom-exercised corner cases, which is the worst of all
possible worlds from a reliability standpoint. (I don't think ON CONFLICT
is a counterexample because, IIUC, it's not a persistent state.)

Given that there are other ways we could attack it, I think throwing away
this particular invariant is an unnecessarily risky solution.

>> I would've been happier about expending an infomask bit towards this
>> purpose. Just eyeing what we've got, I can't help noticing that
>> HEAP_MOVED_OFF/HEAP_MOVED_IN couldn't possibly be set in any tuple
>> in a partitioned table. Perhaps making these tests depend on
>> partitioned-ness would be unworkably messy, but it's worth thinking
>> about.

> They previously couldn't be set together IIRC, so we could just (mask &
> (HEAP_MOVED_OFF |HEAP_MOVED_IN)) == (HEAP_MOVED_OFF |HEAP_MOVED_IN) but
> that'd be permanently eating two infomask bits.

Hmm. That objection only matters if we have realistic intentions of
reclaiming those bits in future, which I've not heard anyone making
serious effort towards. Rather than messing with the definition of ctid,
I'd be happier with saying that they're never going to be reclaimed, but
at least we're getting one bit's worth of real use out of them.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-03-28 18:12:11 Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
Previous Message Emre Hasegeli 2018-03-28 17:50:16 Re: Standby corruption after master is restarted