Re: MaxOffsetNumber for Table AMs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MaxOffsetNumber for Table AMs
Date: 2021-05-03 15:03:06
Message-ID: CA+TgmoaQro9E6orfMgj-2oCmj7dCVRR24jW2htS6wuUsNLAx_w@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 30, 2021 at 6:19 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> A remaining problem is that we must generate a new round of index
> tuples for each and every index when only one indexed column is
> logically modified by an UPDATE statement. I think that this is much
> less of a problem now due to bottom-up index deletion. Sure, it sucks
> that we still have to dirty the page at all. But it's nevertheless
> true that it all but eliminates version-driven page splits, which are
> where almost all of the remaining downside is. It's very reasonable to
> now wonder if this particular all-indexes problem is worth solving at
> all in light of that. (Modern hardware characteristics also make a
> comprehensive fix less valuable in practice.)

It's reasonable to wonder. I think it depends on whether the problem
is bloat or just general slowness. To the extent that the problem is
bloat, bottom-up index deletion will help a lot, but it's not going to
help with slowness because, as you say, we still have to dirty the
pages. And I am pretty confident that slowness is a very significant
part of the problem here. It's pretty common for people migrating from
another database system to have, for example, a table with 10 indexes
and then repeatedly update a column that is covered by only one of
those indexes. Now, with bottom-up index deletion, this should cause a
lot less bloat, and that's good. But you still have to update all 10
indexes in the foreground, and that's bad, because the alternative is
to find just the one affected index and update it twice -- once to
insert the new tuple, and a second time to delete-mark the old tuple.
10 is a lot more than 2, and that's even ignoring the cost of deferred
cleanup on the other 9 indexes. So I don't really expect this to get
us out of the woods. Somebody whose workload runs five times slower on
a pristine data load is quite likely to give up on using PostgreSQL
before bloat even enters the picture.
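For concreteness, here is a minimal sketch of that kind of workload
(the schema, column, and index names are hypothetical, not taken from
any real benchmark):

    -- Ten indexes in total: the primary key plus nine single-column
    -- indexes, but the workload only ever changes "balance".
    CREATE TABLE accounts (
        id      bigint PRIMARY KEY,
        balance numeric,
        col2 int, col3 int, col4 int, col5 int,
        col6 int, col7 int, col8 int, col9 int
    );
    CREATE INDEX accounts_balance_idx ON accounts (balance);
    CREATE INDEX accounts_col2_idx ON accounts (col2);
    -- ... and similar single-column indexes on col3 through col9 ...

    -- Because an indexed column changes, the update can never be HOT,
    -- so heapam must insert a new index tuple pointing at the new heap
    -- TID into all ten indexes, even though only accounts_balance_idx
    -- is logically affected. The alternative design would touch that
    -- one index twice (insert the new entry, delete-mark the old one)
    -- and leave the other nine alone.
    UPDATE accounts SET balance = balance + 1 WHERE id = 42;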

--
Robert Haas
EDB: http://www.enterprisedb.com
