Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Alexey Kondratov <a(dot)kondratov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2019-02-04 13:49:08
Message-ID: 6dcf26af-2ebc-4149-8c60-070ec6214954@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tomas,

On 14.01.2019 21:23, Tomas Vondra wrote:
> Attached is an updated patch series, merging fixes and changes to TAP
> tests proposed by Alexey. I've merged the fixes into the appropriate
> patches, and I've kept the TAP changes / new tests as separate patches
> towards the end of the series.

I had problems applying this patch along with 2pc streaming one to the
current master, but everything applied well on 97c39498e5. Regression
tests pass. What I personally do not like in the current TAP tests set
is that you have added "WITH (streaming=on)" to all tests including old
non-streaming ones. It seems unclear, which mechanism is tested there:
streaming, but those transactions probably do not hit memory limit, so
it depends on default server parameters; or non-streaming, but then what
is the need for (streaming=on)? I would prefer to add (streaming=on)
only to the new tests, where it is clearly necessary.

> I'm a bit unhappy with two aspects of the current patch series:
>
> 1) We now track schema changes in two ways - using the pre-existing
> schema_sent flag in RelationSyncEntry, and the (newly added) flag in
> ReorderBuffer. While those options are used for regular vs. streamed
> transactions, fundamentally it's the same thing and so having two
> competing ways seems like a bad idea. Not sure what's the best way to
> resolve this, though.

Yes, sure, when I have found problems with streaming of extensive DDL, I
added new flag in the simplest way, and it worked. Now, old schema_sent
flag is per relation based, while the new one - is_schema_sent - is per
top-level transaction based. If I get it correctly, the former seems to
be more thrifty, since new schema is sent only if we are streaming
change for relation, whose schema is outdated. In contrast, in the
latter case we will send new schema even if there will be no new changes
which belong to this relation.

I guess, it would be better to stick to the old behavior. I will try to
investigate how to better use it in the streaming mode as well.

> 2) We've removed quite a few asserts, particularly ensuring sanity of
> cmin/cmax values. To some extent that's expected, because by allowing
> decoding of in-progress transactions relaxes some of those rules. But
> I'd be much happier if some of those asserts could be reinstated, even
> if only in a weaker form.

Asserts have been removed from two places: (1)
HeapTupleSatisfiesHistoricMVCC, which seems inevitable, since we are
touching the essence of the MVCC visibility rules, when trying to decode
an in-progress transaction, and (2) ReorderBufferBuildTupleCidHash,
which is probably not related directly to the topic of the ongoing
patch, since Arseny Sher faced the same issue with simple repetitive DDL
decoding [1] recently.

Not many, but I agree, that replacing them with some softer asserts
would be better, than just removing, especially point 1).

[1] https://www.postgresql.org/message-id/flat/874l9p8hyw.fsf%40ars-thinkpad

Regards

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2019-02-04 15:10:09 Re: Feature: temporary materialized views
Previous Message Robert Haas 2019-02-04 12:54:15 Re: ATTACH/DETACH PARTITION CONCURRENTLY