Re: Logical Decoding and HeapTupleSatisfiesVacuum assumptions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Logical Decoding and HeapTupleSatisfiesVacuum assumptions
Date: 2018-01-22 19:21:17
Message-ID: CA+TgmoaZEwzsiPUMFnB1xhtZaiGmjrWd8syPup=6kc1WXdTf-Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 22, 2018 at 10:40 AM, Petr Jelinek
<petr(dot)jelinek(at)2ndquadrant(dot)com> wrote:
> I think this is the only real problem from your list for logical
> decoding catalog snapshots. But it's indeed quite a bad one. Is there
> something preventing us from removing the assumption that the CTID of
> T1 is garbage nobody cares about? I guess turning off HOT for catalogs is not
> an option :)

It doesn't seem like a great idea. For a lot of workloads it wouldn't
hurt, but for some it might cause a lot of catalog bloat. Also, Tom
would probably hate it with a fiery passion, since as I understand it
he argued passionately for HOT to work for both catalog and
non-catalog tables back when it was first implemented.

CTID chain integrity is important for non-HOT updates too, at least if
any SELECT .. FOR UPDATE/SHARE or UPDATE operations are taking place.
Now, I suppose we're not going to do any of that during logical
decoding, but I guess I can't say I'm absolutely positive that there
are no other problem cases. If there aren't, you can imagine
disallowing HOT updates only on catalogs and only when xmax is newer
than the newest XID we'll never need to decode, or more narrowly
still, only when it's in a list of XIDs currently being decoded.
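
To make the two variants concrete, here is a rough standalone C sketch
of the check I have in mind. Everything in it is hypothetical: the
DecodingHorizon struct, the hot_allowed_* functions, and the xid_t
typedef are made-up names for illustration, not existing PostgreSQL
code, and the comparison helper only imitates the modulo-2^32 idea
behind TransactionIdFollows() without handling special XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t xid_t;          /* stand-in for TransactionId */

/* Hypothetical bookkeeping; nothing like this exists in PostgreSQL today. */
typedef struct DecodingHorizon
{
    xid_t        never_decode_upto; /* newest XID we'll never need to decode */
    const xid_t *in_progress;       /* XIDs currently being decoded */
    size_t       n_in_progress;
} DecodingHorizon;

/* Circular comparison modulo 2^32: true if a is logically newer than b. */
static bool
xid_follows(xid_t a, xid_t b)
{
    return (int32_t) (a - b) > 0;
}

/* Linear search of the list of XIDs currently being decoded. */
static bool
xid_being_decoded(const DecodingHorizon *h, xid_t xid)
{
    assert(h->in_progress != NULL || h->n_in_progress == 0);
    for (size_t i = 0; i < h->n_in_progress; i++)
        if (h->in_progress[i] == xid)
            return true;
    return false;
}

/* Broader rule: on catalogs, refuse HOT when xmax is past the horizon. */
static bool
hot_allowed_broad(const DecodingHorizon *h, bool is_catalog, xid_t xmax)
{
    if (!is_catalog)
        return true;
    return !xid_follows(xmax, h->never_decode_upto);
}

/* Narrower rule: on catalogs, refuse HOT only for XIDs being decoded. */
static bool
hot_allowed_narrow(const DecodingHorizon *h, bool is_catalog, xid_t xmax)
{
    if (!is_catalog)
        return true;
    return !xid_being_decoded(h, xmax);
}
```

The narrow rule gives up HOT far less often, but it requires the
updating backend to see an up-to-date list of XIDs being decoded,
which the sketch simply assumes is available.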

Independently of CTID, I also wonder if there's a risk of us trying to
read from a multixact that's been truncated away. I haven't checked
the multixact code in detail, but I bet there's no provision to keep
around multixacts that are only interesting to aborted transactions.
Conversely, what keeps us from freezing the xmin of a tuple that is
invisible only to some aborted transaction, but visible to all
committed transactions? Or just marking the page all-visible?

> The general problem is that we have a couple of assumptions
> (HeapTupleSatisfiesVacuum being one, what you wrote is another) about
> tuples from aborted transactions not being read by anybody. But if we
> want to add decoding of 2PC or transaction streaming, that's no longer
> true, so I think we should try to remove that assumption (even if we do
> it only for catalogs, since that's what we care about).

I'm extremely skeptical about this idea. I think that assumption is
fairly likely to be embedded in subtle ways in several more places
that we haven't thought about yet. Unless we're very, very careful
about doing something like this, we could end up with a system that
mostly seems to work but is actually unreliable in ways that can't be
fixed without non-back-patchable changes.

> The other option would be to make sure 2PC decoding/tx streaming does
> not read aborted transactions, but that would mean locking the
> transaction every time we give control to the output plugin. Given that
> the output plugin may do a network write, this would really mean locking
> the transaction for an unbounded period of time. That does not strike me
> as something we want to do; decoding should not interact with frontend
> transaction management, definitely not this badly.

Yeah, I don't think it's acceptable to postpone abort indefinitely.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
