Re: Inaccuracy in VACUUM's tuple count estimates

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inaccuracy in VACUUM's tuple count estimates
Date: 2014-06-09 16:55:29
Message-ID: 20140609165529.GE8406@alap3.anarazel.de
Lists: pgsql-hackers

On 2014-06-09 09:45:12 -0700, Kevin Grittner wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >     HEAPTUPLE_INSERT_IN_PROGRESS,    /* inserting xact is still in progress */
> >     HEAPTUPLE_DELETE_IN_PROGRESS    /* deleting xact is still in progress */
> > the current code will return INSERT_IN_PROGRESS even if the tuple has
> > *also* been deleted in another xact...
> > I think the problem here is that there's simply no way to really
> > represent that case accurately with the current API.
>
> For purposes of predicate.c, if the "also deleted" activity might
> be rolled back without rolling back the insert, INSERT_IN_PROGRESS
> is the only correct value.  If they will either both commit or
> neither will commit, predicate.c would be more efficient if
> HEAPTUPLE_RECENTLY_DEAD was returned, but I think
> HEAPTUPLE_INSERT_IN_PROGRESS would be OK from a correctness PoV.

That's basically the argument for the new behaviour.

But I am not sure, given predicate.c's coding, how
HEAPTUPLE_DELETE_IN_PROGRESS could cause problems. Could you elaborate,
since that's the contentious point with Tom? Since 'both in progress'
can only happen if xmin and xmax are the same toplevel xid, and you
resolve subxids to toplevel xids, I think it should currently be safe
either way?
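
To make the "same toplevel xid" point concrete, here is a minimal sketch
(not predicate.c's actual code) of resolving subxids to their toplevel xid
before comparing xmin and xmax. Every name in it (ToyXid, toy_parent,
toy_topmost, toy_same_toplevel) is hypothetical, and the fixed-size parent
array is only an assumed stand-in for the real subtransaction lookup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy types; not PostgreSQL's TransactionId machinery. */
typedef uint32_t ToyXid;

#define TOY_INVALID_XID 0
#define TOY_MAX_XID     16

/* toy_parent[x] is the parent of subxid x, or TOY_INVALID_XID if x is
 * already a toplevel xid. Assumed to be acyclic. */
ToyXid toy_parent[TOY_MAX_XID];

/* Walk up the parent chain until a toplevel xid is reached. */
ToyXid
toy_topmost(ToyXid xid)
{
    while (xid < TOY_MAX_XID && toy_parent[xid] != TOY_INVALID_XID)
        xid = toy_parent[xid];
    return xid;
}

/* True if inserter and deleter belong to the same toplevel transaction,
 * which is the only way both can be in progress at once. */
bool
toy_same_toplevel(ToyXid xmin, ToyXid xmax)
{
    return toy_topmost(xmin) == toy_topmost(xmax);
}
```

Once both xids are mapped to their toplevel xid like this, a caller that
only cares about the toplevel transaction sees the same xid whichever of
the two "in progress" values is returned.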

> >     HEAPTUPLE_RECENTLY_DEAD,    /* tuple is dead, but not deletable yet */
> > 1) xmin has committed, xmax has committed and wasn't only a locker. But
> > xmax doesn't precede OldestXmin.
>
> For my purposes, it would be better if this also included:
>  2) xmin is in progress, xmax matches (or includes) xmin
>
> ... but that would be only a performance tweak.

I don't see that happening, as there are several callers for which it is
important to know whether the xacts are still alive or not.

> >     HEAPTUPLE_DELETE_IN_PROGRESS    /* deleting xact is still in progress */
> > new:
> > 1) xmin has committed, xmax is in progress, xmax is not just a locker
> > 2) xmin is in progress, xmin is the current backend, xmax is not just a
> >   locker and in progress.
>
> I'm not clear on how 2) could happen unless xmax is the current
> backend or a subtransaction thereof.  Could you clarify?
>
> > old:
> > 1) xmin has committed, xmax is in progress, xmax is not just a locker
> > 2) xmin is in progress, xmax is set and not just a locker
> >
> > Note that the 2) case here never checked xmax's status.
>
> Again, I'm not sure how 2) could happen unless they involve the
> same top-level transaction.  What am I missing?

Right, both can only happen if the tuple is created & deleted in the
same backend. Is that in contradiction to something you see?
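
For reference, a rough sketch of the old and new HEAPTUPLE_DELETE_IN_PROGRESS
rules as listed in the quoted text. This is a simplified model under stated
assumptions, not the real HeapTupleSatisfiesVacuum(): the ToyTuple struct, the
enum values and both classify functions are hypothetical, and every outcome
other than the two "in progress" results is lumped into RES_OTHER.

```c
#include <stdbool.h>

/* Hypothetical, simplified transaction states. */
typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } XactState;

/* Only the two results discussed here are modelled separately. */
typedef enum {
    RES_INSERT_IN_PROGRESS,
    RES_DELETE_IN_PROGRESS,
    RES_OTHER
} ToyResult;

/* Hypothetical flattened view of the tuple header state. */
typedef struct {
    XactState xmin_state;
    bool      xmin_is_current_backend;
    bool      xmax_set;
    bool      xmax_is_only_locker;
    XactState xmax_state;
} ToyTuple;

static bool
xmax_is_deleter(const ToyTuple *t)
{
    return t->xmax_set && !t->xmax_is_only_locker;
}

/* Old rule 2: xmin in progress, xmax set and not just a locker.
 * Note that xmax's own status is never checked in this branch. */
ToyResult
classify_old(const ToyTuple *t)
{
    if (t->xmin_state == XACT_IN_PROGRESS)
        return xmax_is_deleter(t) ? RES_DELETE_IN_PROGRESS
                                  : RES_INSERT_IN_PROGRESS;
    if (t->xmin_state == XACT_COMMITTED && xmax_is_deleter(t) &&
        t->xmax_state == XACT_IN_PROGRESS)
        return RES_DELETE_IN_PROGRESS;      /* rule 1, unchanged */
    return RES_OTHER;
}

/* New rule 2: additionally requires that xmin is the current backend
 * and that xmax is itself still in progress. */
ToyResult
classify_new(const ToyTuple *t)
{
    if (t->xmin_state == XACT_IN_PROGRESS)
    {
        if (t->xmin_is_current_backend && xmax_is_deleter(t) &&
            t->xmax_state == XACT_IN_PROGRESS)
            return RES_DELETE_IN_PROGRESS;
        return RES_INSERT_IN_PROGRESS;
    }
    if (t->xmin_state == XACT_COMMITTED && xmax_is_deleter(t) &&
        t->xmax_state == XACT_IN_PROGRESS)
        return RES_DELETE_IN_PROGRESS;      /* rule 1, unchanged */
    return RES_OTHER;
}
```

In this model the two rules only disagree when xmin is in progress and xmax
is a real deleter: the old rule always says DELETE_IN_PROGRESS, while the new
one falls back to INSERT_IN_PROGRESS if the inserter is another backend or
the deleting (sub)transaction is no longer in progress.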

Andres
