Re: GUC for cleanup indexes threshold.

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Ideriha, Takeshi" <ideriha(dot)takeshi(at)jp(dot)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Subject: Re: GUC for cleanup indexes threshold.
Date: 2017-02-27 17:46:08
Message-ID: CAH2-WzmVN3PZNno4b4wjDfEsKSaMXujfxs9iJ5YPG3H5036Gqw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Feb 25, 2017 at 10:51 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The thing that strikes me based on what you wrote is that our page
> recycling seems to admit of long delays even as things stand. There's
> no bound on how much time could pass between one index vacuum and the
> next, and RecentGlobalXmin could and probably usually will advance
> past the point that would allow recycling long before the next index
> vacuum cycle.

Agreed.
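
For anyone following along, the interlock in question boils down to a
check along these lines when VACUUM considers handing a deleted page
back to the free space map (a simplified paraphrase, from memory, of
_bt_page_recyclable() in src/backend/access/nbtree/nbtpage.c -- not the
exact code):

static bool
page_is_recyclable(Page page)
{
    BTPageOpaque opaque;

    /* A never-initialized page can be reclaimed immediately */
    if (PageIsNew(page))
        return true;

    /*
     * A deleted page can only be recycled once no running transaction
     * could still be interested in it, i.e. once the XID stamped on
     * the page at deletion time precedes RecentGlobalXmin.
     */
    opaque = (BTPageOpaque) PageGetSpecialPointer(page);
    if (P_ISDELETED(opaque) &&
        TransactionIdPrecedes(opaque->btpo.xact, RecentGlobalXmin))
        return true;

    return false;
}

Nothing forces that test to be run promptly after RecentGlobalXmin
advances far enough; it only happens whenever the next index vacuum
pass gets around to the page, which is the delay you're describing.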

> I don't know whether that strengthens or weakens
> Simon's argument.

I think it weakens it, but I hesitate to take a firm position on it just yet.

> On the one hand, you could argue that if we're
> already doing this on a long delay, making it even longer probably
> won't hurt much. On the other hand, you could argue that if the
> current situation is bad, we should at least avoid making it worse. I
> don't know which of those arguments is correct, if either.

Unsure. I will point out:

* There probably is a big problem with B-Tree index bloat for certain
workloads ("sparse deletion patterns"), one that interacts badly with
less frequent VACUUMing.

* Whatever bloat this patch makes worse is not *that* bloat, at least
with the proposed default for vacuum_cleanup_index_scale_factor; it's
not the bloat we usually think of when we talk about index bloat. A
full index scan will not need to visit any of the dead pages, not even
to immediately skip over them. We just won't be able to recycle them,
which is a separate problem (see the sketch just after this list).

* The problem of not recycling as soon as we'd prefer can only happen
when everything else is, roughly speaking, going right, which is still
a pretty good position to be in. (Again, these remarks only apply when
the default vacuum_cleanup_index_scale_factor is used.)

* Roughly speaking, the recycling of disk blocks, and efficient use of
disk space more generally, is not a priority for the implementation.
Nor should it be.
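
To make the second point concrete: when VACUUM does scan the index,
deleted-but-not-yet-recyclable pages are merely counted; only the
recyclable ones get reported to the free space map. Roughly
(paraphrasing the relevant part of btvacuumpage() in nbtree.c, again
from memory):

    if (_bt_page_recyclable(page))
    {
        /* Okay to recycle this page: report it as free for reuse */
        RecordFreeIndexPage(rel, blkno);
        vstate->totFreePages++;
        stats->pages_deleted++;
    }
    else if (P_ISDELETED(opaque))
    {
        /*
         * Already deleted, but can't recycle yet.  Ordinary index
         * scans never land on this page -- it has been unlinked from
         * the tree -- so delaying recycling is purely a space-reuse
         * issue, not an extra cost for searches.
         */
        stats->pages_deleted++;
    }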

I tend to think of this recycling as being more about the worst case
for space utilization than about the average case. It's a bit like the
way the fast root vs. true root distinction saves us from having to
descend "skinny" B-Trees from the true root, something that can only
really happen following vast changes to the key space, which are always
going to be painful. These cases are a bit pathological.

For more on this recycling stuff, see section 2.5 of the Lanin and
Shasha paper, "Freeing empty nodes" [1]. It describes what is
essentially the RecentGlobalXmin interlock, and I think you're right to
point out that it could stand to be a lot more aggressive; that may be
the real problem, if there is one. (The paper's remarks suggest we
could stand to be more eager about it.)

> Do you have an idea about that, or any ideas for experiments we could try?

Nothing occurs to me right now, unfortunately. However, my general
sense is that it would probably be just fine with
vacuum_cleanup_index_scale_factor set to 0.0, but that there might be
non-linear increases in "the serious type of index bloat" as the
proposed new setting is scaled up. I'd be much more worried about that.
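
To be explicit about what I mean by "scaled up": as I understand it,
the proposal amounts to a gate at the start of the cleanup-only index
scan along these lines (a purely hypothetical sketch for discussion,
with made-up variable names -- not the patch's actual code):

    /*
     * Hypothetical sketch only.  Skip btvacuumcleanup()'s full-index
     * scan when the table hasn't grown enough since the last cleanup.
     * With the factor at 0.0 the scan is never skipped, matching
     * today's behaviour; larger settings widen the window during
     * which deleted pages cannot be recycled.
     */
    if (tuples_inserted_since_last_cleanup <
        vacuum_cleanup_index_scale_factor * prev_num_heap_tuples)
        return stats;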

[1] https://archive.org/stream/symmetricconcurr00lani#page/6/mode/2up
--
Peter Geoghegan
