Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date: 2022-02-18 15:44:17
Message-ID: CA+TgmobmaOMC7YoExZVHJW9PW_h-RWW+EE3xjN7DpDJRTkHe0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 17, 2022 at 6:39 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > The only other kind of hazard I can think of is: could it be unsafe to
> > try to interpret the contents of a database to which no one else is
> > connected at present due to any of the issues you mention? But what's
> > the hazard exactly?
>
> I don't quite know. But I don't think it's impossible to run into trouble in
> this area. E.g. xid horizons are computed in a database specific way. If the
> routine reading pg_class did hot pruning, you could end up removing data
> that's actually needed for a logical slot in the other database because the
> backend local horizon state was computed for the "local" database?

Yeah, but it doesn't -- and shouldn't. There's no HeapScanDesc here,
so we can't accidentally wander into heap_page_prune_opt(). We should
document the things we're thinking about here in the comments to
prevent future mistakes, but I think for the moment we are OK.

> Could there be problems because other backends wouldn't see the backend
> accessing the other database as being connected to that database
> (PGPROC->databaseId)?

I think that if there's any hazard here, it must be related to
snapshots, which brings us to your next point:

> Or what if somebody optimized snapshots to disregard readonly transactions in
> other databases?

So there are two related questions here. One is whether the snapshot
that we're using to access the template database can be unsafe, and
the other is whether the read-only access that we're performing inside
the template database could mess up somebody else's snapshot. Let's
deal with the second point first: nobody else knows that we're reading
from the template database, and nobody else is reading from the
template database except possibly for someone who is doing exactly
what we're doing. Therefore, I think this hazard can be ruled out.

On the first point, a key point in my opinion is that there can be no
in-flight transactions in the template database, because nobody is
connected to it, and prepared transactions in a template database are
verboten. It therefore can't matter if we include too few XIDs in our
snapshot, or if our xmin is too new. The reverse case can matter,
though: if the xmin of our snapshot were too old, or if we had extra
XIDs in our snapshot, then we might think that a transaction is still
in progress when it isn't. Therefore, I think the patch is wrong to
use GetActiveSnapshot() and must use GetLatestSnapshot() *after* it's
finished making sure that nobody is using the template database. I
don't think there's a hazard beyond that, though. Let's consider the
two ways in which things could go wrong:

1. Extra XIDs in the snapshot. Any current or future optimization of
snapshots would presumably be trying to make them smaller by removing
XIDs from the snapshot, not making them bigger by adding XIDs to the
snapshot. I guess in theory you can imagine an optimization that tests
for the presence of XIDs by some method other than scanning through an
array, and which feels free to add XIDs to the snapshot if they "can't
matter," but I think it's up to the author of that hypothetical future
patch to make sure they don't break anything in so doing -- especially
because it's entirely possible for our session to see XIDs used by a
backend in some other database, because they could show up in shared
catalogs. I think that's why, as far as I can tell, we only use the
database ID when setting pruning thresholds, and not for snapshots.

2. xmin of snapshot too new. There are no in-progress transactions in
the template database, so how can this even happen? If we set the xmin
"in the future," then we could get confused about what's visible due
to wraparound, but that seems crazy. I don't see how there can be a
problem here.

> > It can't be a problem if we've failed to process sinval messages for the
> > target database, because we're not using any caches anyway.
>
> Could you end up with an outdated relmap entry? Probably not.

Again, we're not relying on caching -- we read the file.

> > We can't. It can't be unsafe to test visibility of XIDs for that database,
> > because in an alternate universe some backend could have connected to that
> > database and seen the same XIDs.
>
> That's a weak argument, because in that alternative universe a PGPROC entry
> with the PGPROC->databaseId = template_databases_oid would exist.

So what? As I also argue above, I don't think that affects snapshot
generation, and if it did it wouldn't matter anyway, because it could
only remove in-progress transactions from the snapshot, and there
aren't any in the template database anyhow.

Another way of looking at this is: we could just as well use
SnapshotSelf or (if it still existed) SnapshotNow to test visibility.
In a world where there are no transactions in flight, it's the same
thing. In fact, maybe we should do it that way, just to make it
clearer what's happening.

I think these are really good questions you are raising, so I'm not
trying to be dismissive. But after some thought I'm not yet seeing any
problems (other than the use of GetActiveSnapshot).

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-02-18 16:44:54 Re: O(n) tasks cause lengthy startups and checkpoints
Previous Message Tom Lane 2022-02-18 15:26:52 Re: Timeout control within tests