Re: what to revert

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: what to revert
Date: 2016-05-04 20:42:25
Message-ID: 20160504204225.oc6bbjl5mgkhg6dx@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2016-05-04 14:25:14 -0400, Robert Haas wrote:
> On Wed, May 4, 2016 at 12:42 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On 2016-05-04 13:35:02 -0300, Alvaro Herrera wrote:
> >> Honestly, I don't see any strong ground in which to base a revert threat
> >> for this feature.
> >
> > It's datastructures are badly designed. But releasing it there's no
> > pressure to fix that. If this were an intrinsic cost - ok. But it's
> > not.
>
> I don't want to rule out the possibility that you are right, because
> you're frequently right about lots of things. However, you haven't
> provided much detail. AFAIK, the closest you've come to articulating
> a case for reverting this patch is to say that the tax ought to be
> paid by the write-side, rather than the read-side.

I think I did go into some more detail than that, but let me explain
here:

My concern isn't performing checks at snapshot build time and at buffer
access time. That seems fairly reasonable. My concern is that the
time->xid mapping used to determine the xid horizon is built at snapshot
access time. The relevant function for that is
MaintainOldSnapshotTimeMapping() called by GetSnapshotData(), if you
want to look at it. Building the mapping when building the snapshot
requires that an exclusive lwlock is taken. Given that there's very few
access patterns in which more xids are assigned than snapshots taken,
that just doesn't make much sense; even leaving the fact that adding an
exclusive lock in a function that's already one of the biggest
bottlenecks in postgres, isn't a good idea.

If we instead built the map somewhere around GetNewTransactionId() we'd
only pay the cost when advancing an xid. It'd also make it possible to
avoid another GetCurrentIntegerTimestamp() per snapshot, including the
acquiration of oldSnapshotControl->mutex_current. In addition the data
structure would be easier to maintain, because xids grow monotonically
(yes, yes wraparound, but that's not a problem) - atm we need somwehat
more complex logic to determine into which bucket an xid/timestamp pair
has to be mapped to.

So I'm really not just concerned with the performance, I think that's
just fallout from choosing the wrong data representation.

If you look at at bottom of
http://www.postgresql.org/message-id/20160413171955.i53me46fqqhdlztq@alap3.anarazel.de
which shows performance numbers for old_snapshot_threshold=0
vs. old_snapshot_threshold=10. While that wasn't intended at the time,
old_snapshot_threshold=0 shows the cost of the feature without
maintaining the mapping, and old_snapshot_threshold=10 shows it while
building the mapping. Pretty dramatic difference - that's really the
cost of map maintenance.

FWIW, it's not just me doubting the data structure here, Ants did as
well.

> Surely nobody here is blind to the fact that holding back xmin often
> creates a lot of bloat for no reason - many or all of the retained old
> row versions may never be accessed.

Definitely. I *like* the feature.

> So I would like to hear more detail about why you think that the data
> structures are badly designed, and how they could be designed
> differently without changing what the patch does (which amounts to
> wishing that Kevin had implemented a different feature than the one
> you think he should have implemented).

Well, there'd be some minor differences what determines "too old" based
on the above, but I think it'd just be a bit easier to explain.

> >> It doesn't scale as well as we would like in the case
> >> where a high-level is fully stressed with a read-only load -- okay. But
> >> that's unlikely to be a case where this feature is put to work.
> >
> > It'll be just the same in a read mostly workload, which is part of the
> > case for this feature.
>
> So what? SSI doesn't scale either, and nobody's saying we have to rip
> it out. It's still useful to people. pgbench is not the world.

Sure, pgbench is not the world. But this isn't a particularly pgbench
specific issue.

> >> So I think accepting the promise that this feature would be improved
> >> in a future release to better support that case is good enough.
> >
> > I've not heard any such promise.
>
> The question Alvaro is raising isn't whether such a promise has been
> made but whether it would suffice. In fairness, such promises are not
> enforceable. Kevin can promise to work on this and then be run over
> by a rogue Mr. Softee truck tomorrow, and the work won't get done; or
> he can go to work for some non-PostgreSQL-supporting company and
> vanish.

Sure, it's not a guarantee. But I think a "promise" (signed in blood of
course) of a senior contributor makes quite the difference.

> But personally, I generally agree with Alvaro's analysis. If this
> patch is slow when turned on, and you don't like that, don't use it.

> If you need to use it and want it to scale better, then you can write
> a patch to make it do that. Nobody is more qualified than you.

I think that's what ticks me off about this. By introducing a feature
implemented with the wrong structure, the onus of work is placed on
others. It's imo perfectly reasonable not to unneccesarily perform
micro-optimization before benchmarks show a problem, and if it were just
a question of doing that in 9.7, I'd be fine. But what we're talking
about is rewriting the central part of the feature.

> I think it's a show-stopper if the patch regresses performance more
> than trivially when turned off, but we need fresh test results before
> reaching a conclusion about that.

I'm not concerned about that at this stage. Kevin addressed those
problems as far as I'm aware.

> I also think it's a show-stopper if you can hold out specific design
> issues which we can generally agree are signs of serious flaws in
> Kevin's thinking.

I think that's my issue.

> I do not think it's a show-stopper if you wish he'd worked harder on
> the performance under heavy concurrency with very short transactions.
> You could have raised that issue at any time, but instead waited until
> after feature freeze.

Sorry, I don't buy that argument. There were senior community people
giving feedback on the patch, the potential for performance regressions
wasn't called out, the patch was updated pretty late in the CF. I find
it really scary that review didn't call out the potential for
regressions here, at the very least the ones with the feature disabled
really should have been caught. Putting exclusive lwlocks and spinlocks
in critical paths isn't a subtle, easy to miss, issue.

Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2016-05-04 20:50:26 Re: pg9.6 segfault using simple query (related to use fk for join estimates)
Previous Message Robert Haas 2016-05-04 20:37:52 Re: pg9.6 segfault using simple query (related to use fk for join estimates)