Re: pgsql: Compute XID horizon for page level index vacuum on primary.

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Date: 2019-03-29 15:58:14
Message-ID: CANP8+jLEWNQX9oW0RQPPvOXFOh3zEBUdC62QWZ2GLNkeZmXnPA@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Fri, 29 Mar 2019 at 15:29, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2019-03-29 09:37:11 +0000, Simon Riggs wrote:
>

> > While trying to understand this, I see there is an even better way to
> > optimize this. Since we are removing dead index tuples, we could alter
> > the killed index tuple interface so that it returns the xmax of the
> > tuple being marked as killed, rather than just a boolean to say it is
> > dead.
>
> Wouldn't that quite possibly result in additional and unnecessary
> conflicts? Right now the page level horizon is computed whenever the
> page is actually reused, rather than when an item is marked as
> deleted. As it stands right now, the computed horizons are commonly very
> "old", because of that delay, leading to lower rates of conflicts.
>

I wasn't suggesting we change when the horizon is calculated, so no change
there.

The idea was to cache the data for later use, replacing the hint bit with a
hint xid.

That won't change the rate of conflicts, up or down - but it does avoid I/O.
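
To make the caching idea concrete, here is a minimal sketch, not the actual PostgreSQL code. The names `Xid`, `KilledItem`, `xid_follows` and `horizon_from_cache` are hypothetical stand-ins. It assumes each killed index item either carries a cached xmax or, if it only has the hint bit, still needs a heap fetch when the page is cleaned up.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;             /* hypothetical stand-in for TransactionId */
#define INVALID_XID ((Xid) 0)     /* hypothetical "no xid cached" marker */

/* Hypothetical killed index item: cached_xmax is INVALID_XID when only
 * the hint bit was set and no xid was cached. */
typedef struct KilledItem
{
    Xid cached_xmax;
} KilledItem;

/* Wraparound-aware "a is newer than b", using a modulo-2^32 comparison. */
static int
xid_follows(Xid a, Xid b)
{
    return (int32_t) (a - b) > 0;
}

/*
 * Return the newest cached xmax among the killed items and report how
 * many items have no cached xid, i.e. would still need a heap fetch.
 */
Xid
horizon_from_cache(const KilledItem *items, size_t n,
                   size_t *heap_fetches_needed)
{
    Xid horizon = INVALID_XID;

    *heap_fetches_needed = 0;
    for (size_t i = 0; i < n; i++)
    {
        Xid x = items[i].cached_xmax;

        if (x == INVALID_XID)
        {
            (*heap_fetches_needed)++;
            continue;
        }
        if (horizon == INVALID_XID || xid_follows(x, horizon))
            horizon = x;
    }
    return horizon;
}
```

The horizon is still computed at the same moment as before; only the source of the xids changes, so each cached item is one heap fetch avoided.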

> > Indexes can then mark the killed tuples with the xmax that killed them
> > rather than just a hint bit. This is possible since the index tuples
> > are dead and cannot be used to follow the htid to the heap, so the
> > htid is redundant and so the block number of the tid could be
> > overwritten with the xmax, zeroing the itemid. Each killed item we
> > mark with its xmax means one less heap fetch we need to perform when
> > we delete the page - it's possible we optimize that away completely by
> > doing this.
>
> That's far from a trivial feature imo. It seems quite possible that we'd
> end up with increased overhead, because the current logic can get away
> with only doing hint bit style writes - but would that be true if we
> started actually replacing the item pointers? Because I don't see any
> guarantee they couldn't cross a page boundary etc? So I think we'd need
> to do WAL logging during index searches, which seems prohibitively
> expensive.
>

Don't see that.

I was talking about reusing the first 4 bytes of an index tuple's
ItemPointerData, which is the first field of an index tuple. Index tuples
are MAXALIGNed, so I can't see how that would ever cross a page boundary.
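
A rough sketch of the layout argument, under stated assumptions: the `Sketch*` structs below are simplified stand-ins modeled on BlockIdData, ItemPointerData and IndexTupleData, and `SKETCH_MAXALIGN` assumes 8-byte maximum alignment, which is platform dependent. The helper `straddling_offsets` is hypothetical; it only checks alignment and says nothing about whether such a write would need WAL logging.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAXALIGN 8       /* assumed maximum alignment, in bytes */

/* Simplified stand-ins for the on-page structs; not the real headers. */
typedef struct SketchBlockId
{
    uint16_t bi_hi;
    uint16_t bi_lo;
} SketchBlockId;

typedef struct SketchItemPointer
{
    SketchBlockId ip_blkid;     /* 4-byte block number */
    uint16_t      ip_posid;     /* offset number within the block */
} SketchItemPointer;

typedef struct SketchIndexTuple
{
    SketchItemPointer t_tid;    /* first field of the index tuple */
    uint16_t          t_info;
} SketchIndexTuple;

/* Byte offset of the block number within the tuple. */
size_t
block_number_offset(void)
{
    return offsetof(SketchIndexTuple, t_tid) +
           offsetof(SketchItemPointer, ip_blkid);
}

/*
 * Count MAXALIGNed tuple start offsets in a page where a 4-byte write at
 * the tuple start would straddle a boundary of `unit` bytes (for example
 * a 512-byte sector). `unit` is assumed to be a multiple of 4.
 */
size_t
straddling_offsets(size_t page_size, size_t unit)
{
    size_t count = 0;

    for (size_t off = 0; off + 4 <= page_size; off += SKETCH_MAXALIGN)
    {
        if (off / unit != (off + 3) / unit)
            count++;
    }
    return count;
}
```

With these assumptions the block number sits at offset 0 of the tuple and occupies 4 bytes, so a tuple that starts on an 8-byte boundary keeps those 4 bytes inside any unit whose size is a multiple of 8.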

> And I'm also doubtful it's worth it because:
>
> > Since this point of the code is clearly going to be a performance issue
> > it seems like something we should do now.
>
> I've tried quite a bit to find a workload where this matters, but after
> avoiding redundant buffer accesses by sorting, and prefetching I was
> unable to do so. What workload do you see where this would really be
> bad? Without the performance optimization I'd found a very minor
> regression by trying to force the heap visits to happen in a pretty
> random order, but after sorting that went away. I'm sure it's possible
> to find a case on overloaded rotational disks where you'd find a small
> regression, but I don't think it'd be particularly bad.
>
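
The effect of sorting described in the quoted text can be illustrated with a small sketch. This is not the committed code: `sort_heap_blocks` and `count_buffer_switches` are hypothetical helpers, and the sketch assumes that consecutive visits to the same heap block reuse one buffer access.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for heap block numbers, ascending. */
static int
cmp_block(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *) a;
    uint32_t y = *(const uint32_t *) b;

    return (x > y) - (x < y);
}

/* Sort the heap block numbers so visits to the same block are adjacent. */
void
sort_heap_blocks(uint32_t *blocks, size_t n)
{
    qsort(blocks, n, sizeof(blocks[0]), cmp_block);
}

/*
 * Count how many times the visit order moves to a different block than
 * the previous visit; each such move is treated as one buffer access.
 */
size_t
count_buffer_switches(const uint32_t *blocks, size_t n)
{
    size_t switches = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (i == 0 || blocks[i] != blocks[i - 1])
            switches++;
    }
    return switches;
}
```

Sorting cannot reduce the number of distinct heap blocks, only the repeated accesses to the same block, which is the distinction the rest of this exchange turns on.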

The code can do literally hundreds of random I/Os for a single index page
at an 8192-byte block size. What happens with 16kB or 32kB blocks?

"Small regression" ?
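
For scale, a back-of-envelope sketch of where "hundreds" comes from. All sizes below are assumptions for a btree index on a 4-byte key (24-byte page header, 16-byte special space, 4-byte line pointer, 16-byte aligned tuple), and `max_items_per_page` is a hypothetical helper. It gives an upper bound only, reached when every item on the page is dead and points at a different heap block.

```c
#include <stddef.h>

#define SKETCH_PAGE_HEADER   24  /* assumed page header size */
#define SKETCH_BTREE_SPECIAL 16  /* assumed btree special space */
#define SKETCH_LINE_POINTER   4  /* assumed line pointer size */
#define SKETCH_MIN_TUPLE     16  /* 8-byte header + 4-byte key, 8-aligned */

/* Upper bound on index items per page under the assumptions above. */
size_t
max_items_per_page(size_t block_size)
{
    size_t usable = block_size - SKETCH_PAGE_HEADER - SKETCH_BTREE_SPECIAL;

    return usable / (SKETCH_LINE_POINTER + SKETCH_MIN_TUPLE);
}
```

Under these assumptions the bound is 407 items at 8192 bytes, 817 at 16kB and 1636 at 32kB, so the worst-case number of heap fetches per index page grows linearly with the block size.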

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
