Re: Unhappy about API changes in the no-fsm-for-small-rels patch

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Unhappy about API changes in the no-fsm-for-small-rels patch
Date: 2019-05-06 17:51:33
Message-ID: CA+TgmoY7dRdGkS7sf2Tmh7XQaVCsrmEsDGSvpDk6fVbz+=HtGg@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 6, 2019 at 12:18 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > None of that addresses the question of the distributed cost of sending
> > more sinval messages. If you have a million little tiny relations and
> > VACUUM goes through and clears one tuple out of each one, it will be
> > spewing sinval messages really, really fast. How can that fail to
> > threaten extra sinval resets?
>
> Vacuum triggers sinval messages already (via the pg_class update),
> shouldn't be too hard to ensure that there's no duplicate ones in this
> case.

Yeah, if we can piggyback on the existing messages, then we can be
confident that we're not increasing the chances of sinval resets.

> > Well, that seems like an argument that we just shouldn't do this at
> > all. If the FSM is worthless for small relations, then eliding it
> > makes sense. But if having it is valuable even when the relation is
> > tiny, then eliding it is the wrong thing to do, isn't it?
>
> Why? The problem with the entirely stateless proposal is just that we'd
> do that every single time we need new space. If we amortize that cost
> across multiple insertions, I don't think there's a problem?

Hmm, I see.

> Note that without additional state we do not *know* that the heap is 5
> pages long, we have to do an smgrnblocks() - which is fairly
> expensive. That's precisely why I want to keep state about a
> non-existent FSM in the relcache, and why we'd need sinval messages to
> clear that. So we don't incur unnecessary syscalls when there's free
> space.

Makes sense.
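
To make the amortization point concrete, here is a toy model in Python. It is only a sketch, not PostgreSQL code: the class and method names are hypothetical, expensive_nblocks() merely stands in for smgrnblocks(), and invalidate() stands in for an sinval message clearing the relcache state.

```python
class ToyRelation:
    """Toy stand-in for a small heap that has no FSM (all names hypothetical)."""

    def __init__(self, nblocks):
        self.nblocks = nblocks        # true relation size "on disk"
        self.nblocks_calls = 0        # how many times we paid for the size lookup
        self.cached_nblocks = None    # backend-local state, like a relcache field

    def expensive_nblocks(self):
        # Stands in for smgrnblocks(), described in the thread as fairly expensive.
        self.nblocks_calls += 1
        return self.nblocks

    def invalidate(self):
        # Stands in for an sinval message that clears the cached state.
        self.cached_nblocks = None

    def blocks_to_try_stateless(self):
        # Stateless proposal: ask for the relation size on every request for space.
        return list(range(self.expensive_nblocks()))

    def blocks_to_try_cached(self):
        # Cached proposal: pay for the lookup once, then reuse it until invalidated.
        if self.cached_nblocks is None:
            self.cached_nblocks = self.expensive_nblocks()
        return list(range(self.cached_nblocks))
```

In this model, 100 requests for space cost 100 size lookups in the stateless variant and one in the cached variant, plus one more after each invalidation.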

> > I guess you could incur the overhead repeatedly if the relation starts
> > out at 1 block, grows to 4, is vacuumed back down to 1, lather, rinse,
> > repeat, but is that actually realistic? It requires all the live
> > tuples to live in block 0 at the beginning of each vacuum cycle, which
> > seems like a fringe outcome.
>
> I think it's much more likely to be encountered when there's a lot of
> churn on a small table, but HOT pruning removes just about all the
> superfluous space on a regular basis. Then the relation might actually
> never get > 4 blocks.

Yeah, but if it leaves behind any tuples in block #3, the relation
will never be truncated. You can't repeatedly hit the
all-blocks-are-full case without repeatedly extending the relation,
and you can't repeatedly extend the relation without getting beyond 4
blocks unless you are also truncating it.
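
The argument can be checked with a toy model. This is a sketch under simplifying assumptions, not PostgreSQL code: a relation is just a list of live-tuple counts per block, truncation drops only trailing empty blocks, and the function names and the THRESHOLD constant are hypothetical.

```python
THRESHOLD = 4  # toy assumption: relations of at most this many blocks have no FSM


def vacuum_truncate(live_per_block):
    """Drop only trailing empty blocks, the way heap truncation works."""
    blocks = list(live_per_block)
    while blocks and blocks[-1] == 0:
        blocks.pop()
    return blocks


def extend(live_per_block):
    """Model the all-blocks-are-full case: add one new block holding one tuple."""
    return list(live_per_block) + [1]


def size_after_cycle(live_per_block):
    """Vacuum the relation, then extend it once; return the new block count."""
    return len(extend(vacuum_truncate(live_per_block)))
```

With all live tuples in block 0, size_after_cycle([3, 0, 0, 0]) stays within THRESHOLD, so the cycle can repeat. With even one live tuple left in block #3, size_after_cycle([3, 0, 0, 1]) exceeds THRESHOLD after a single extension.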

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
