Re: Enabling B-Tree deduplication by default

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Enabling B-Tree deduplication by default
Date: 2020-01-30 22:13:43
Message-ID: CAH2-Wz=H54RD4AwfR=pNXkZH7S6kdmpmYMZH_y7d5n+2KQ=Bwg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 30, 2020 at 12:57 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> That seems reasonable.

My approach to showing the downsides of the patch wasn't particularly
obvious, or easy to come up with. I could have contrived a case like
the insert benchmark, but with more low cardinality non-unique
indexes. That would also have the effect of increasing the memory
bandwidth/latency bottleneck, which was the big bottleneck already.
It's not clear whether that makes the patch look worse or better. You
end up executing more instructions that go to waste, but with a
workload where the CPU stalls on memory even more than the original
insert benchmark.

OTOH, it's possible to contrive a case that makes the patch look
better than the master branch to an extreme extent. Just keep adding
low cardinality columns that each get an index, on a table that gets
many non-HOT updates. The effect isn't even linear, because VACUUM has
a harder time keeping up as you add columns/indexes, making the
bloat situation worse, in turn making it harder for VACUUM to keep up.
For bonus points, make sure that the tuples are nice and wide -- that
also "amplifies" bloat in a non-intuitive way (which is an effect that
is also ameliorated by the patch).
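
A rough sketch of how such a contrived case might be set up, for
illustration only. This is not the test case used to evaluate the patch.
The table name, column names, cardinality, padding width, and row count
are all hypothetical, and the script only prints SQL rather than running
it. The UPDATE modifies an indexed column, which is one way to guarantee
that the update cannot be HOT.

```python
def contrived_case(n_cols, cardinality=10, pad_bytes=500, rows=100_000,
                   table="bloat_test"):
    """Build SQL for a table with n_cols low cardinality indexed columns.

    All names and sizes are illustrative assumptions, not from the thread.
    """
    if n_cols < 1:
        raise ValueError("need at least one low cardinality column")

    names = [f"c{i}" for i in range(1, n_cols + 1)]
    col_defs = ", ".join(f"{c} int NOT NULL" for c in names)

    # The "pad" column makes each heap tuple wide.
    stmts = [
        f"CREATE TABLE {table} "
        f"(id bigint PRIMARY KEY, {col_defs}, pad text NOT NULL);"
    ]

    # Each cN column only ever holds `cardinality` distinct values.
    values = ", ".join(f"g % {cardinality}" for _ in names)
    stmts.append(
        f"INSERT INTO {table} SELECT g, {values}, repeat('x', {pad_bytes}) "
        f"FROM generate_series(1, {rows}) g;"
    )

    # One non-unique index per low cardinality column.
    stmts += [f"CREATE INDEX {table}_{c}_idx ON {table} ({c});" for c in names]
    return stmts


def non_hot_update(cardinality=10, table="bloat_test"):
    """An UPDATE that changes indexed column c1, so it cannot be HOT."""
    return f"UPDATE {table} SET c1 = (c1 + 1) % {cardinality} WHERE id = $1;"


if __name__ == "__main__":
    for stmt in contrived_case(n_cols=4):
        print(stmt)
    print(non_hot_update())
```

Raising n_cols adds one more index that every non-HOT update has to
touch, which is the knob described above.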

> I suspect that you're right that the worst-case downside is not big
> enough to really be a problem given all the upsides. But the advantage
> of getting things committed is that we can find out what users think.

It's certainly impossible to predict everything. On the upside, I
suspect that the patch makes VACUUM easier to tune with certain real
world workloads, though that is hard to prove.

I've always disliked the way that autovacuum gets triggered by fairly
generic criteria. Timeliness can matter a lot when it comes to index
bloat, but that isn't taken into account. I think that the patch will
tend to bring B-Tree indexes closer to heap tables in terms of their
overall sensitivity to how frequently VACUUM runs.

--
Peter Geoghegan
