PostgreSQL Weekly News - December 27, 2020

Posted on 2020-12-27 by PWN

PostgreSQL Weekly News - December 27, 2020

PostgreSQL Product News

Pgpool-II 4.2.1, a connection pooler and statement replication system for PostgreSQL, released.

Postgres Operator v1.6.0 released.

PostgreSQL Jobs for December

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

Tom Lane pushed:

  • Doc: fix description of how to use src/tutorial files. The separate "cd" command before invoking psql made sense (or at least I thought so) when it was added in commit ed1939332. But 4e3a61635 removed the supporting text that explained when to use it, making it just confusing. So drop it. Also switch from four-dot to three-dot filler for the unsupplied part of the path, since at least one person has read the four-dot filler as a typo for "../..". And fix these/those inconsistency. Discussion:

  • Remove "invalid concatenation of jsonb objects" error case. The jsonb || jsonb operator arbitrarily rejected certain combinations of scalar and non-scalar inputs, while being willing to concatenate other combinations. This was of course quite undocumented. Rather than trying to document it, let's just remove the restriction, creating a uniform rule that unless we are handling an object-to-object concatenation, non-array inputs are converted to one-element arrays, resulting in an array-to-array concatenation. (This does not change the behavior for any case that didn't throw an error before.) Per complaint from Joel Jacobson. Back-patch to all supported branches. Discussion:

  • Increase timeout in Commit 7b28913bc figured 30 seconds is long enough for anybody, but in contexts like valgrind runs, it isn't necessarily.

  • Improve autoprewarm's handling of early-shutdown scenarios. Bad things happen if the DBA issues "pg_ctl stop -m fast" before autoprewarm finishes loading its list of blocks to prewarm. The current worker process successfully terminates early, but (if this wasn't the last database with blocks to prewarm) the leader process will just try to launch another worker for the next database. Since the postmaster is now in PM_WAIT_BACKENDS state, it ignores the launch request, and the leader just sits until it's killed manually. This is mostly the fault of our half-baked design for launching background workers, but a proper fix for that is likely to be too invasive to be back-patchable. To ameliorate the situation, fix apw_load_buffers() to check whether SIGTERM has arrived just before trying to launch another worker. That leaves us with only a very narrow window in each worker launch where SIGTERM could occur between the launch request and successful worker start. Another issue is that if the leader process does manage to exit, it unconditionally rewrites autoprewarm.blocks with only the blocks currently in shared buffers, thus forgetting any blocks that we hadn't reached yet while prewarming. This seems quite unhelpful, since the next database start will then not have the expected prewarming benefit. Fix it to not modify the file if we shut down before the initial load attempt is complete. Per bug #16785 from John Thompson. Back-patch to v11 where the autoprewarm code was introduced. Discussion:

  • Improve client error messages for immediate-stop situations. Up to now, if the DBA issued "pg_ctl stop -m immediate", the message sent to clients was the same as for a crash-and-restart situation. This is confusing, not least because the message claims that the database will soon be up again, something we have no business predicting. Improve things so that we can generate distinct messages for the two cases (and also recognize an ad-hoc SIGQUIT, should somebody try that). To do that, add a field to pmsignal.c's shared memory data structure that the postmaster sets just before broadcasting SIGQUIT to its children. No interlocking seems to be necessary; the intervening signal-sending and signal-receipt should sufficiently serialize accesses to the field. Hence, this isn't any riskier than the existing usages of pmsignal.c. We might in future extend this idea to improve other postmaster-to-children signal scenarios, although none of them currently seem to be as badly overloaded as SIGQUIT. Discussion:

  • Fix race condition between shutdown and unstarted background workers. If a database shutdown (smart or fast) is commanded between the time some process decides to request a new background worker and the time that the postmaster can launch that worker, then nothing happens because the postmaster won't launch any bgworkers once it's exited PM_RUN state. This is fine ... unless the requesting process is waiting for that worker to finish (or even for it to start); in that case the requestor is stuck, and only manual intervention will get us to the point of being able to shut down. To fix, cancel pending requests for workers when the postmaster sends shutdown (SIGTERM) signals, and similarly cancel any new requests that arrive after that point. (We can optimize things slightly by only doing the cancellation for workers that have waiters.) To fit within the existing bgworker APIs, the "cancel" is made to look like the worker was started and immediately stopped, causing deregistration of the bgworker entry. Waiting processes would have to deal with premature worker exit anyway, so this should introduce no bugs that weren't there before. We do have a side effect that registration records for restartable bgworkers might disappear when theoretically they should have remained in place; but since we're shutting down, that shouldn't matter. Back-patch to v10. There might be value in putting this into 9.6 as well, but the management of bgworkers is a bit different there (notably see 8ff518699) and I'm not convinced it's worth the effort to validate the patch for that branch. Discussion:

  • Avoid time-of-day-dependent failure in log rotation test. Buildfarm members pogona and petalura have shown a failure when pg_ctl/t/ starts just before local midnight. The default rotate-at-midnight behavior occurs just before the Perl script examines current_logfiles, so it figures that the rotation it's already requested has occurred ... but in reality, that rotation happens just after it looks, so the expected new log data goes into a different file than the one it's examining. In HEAD, src/test/kerberos/t/ has acquired similar code that evidently has a related failure mode. Besides being quite new, few buildfarm critters run that test, so it's unsurprising that we've not yet seen a failure there. Fix both cases by setting log_rotation_age = 0 so that no time-based rotation can occur. Also absorb's decision to set lc_messages = 'C' into the kerberos test, in hopes that it will work in non-English prevailing locales. Report: Report:

  • Really fix the dummy implementations in cipher.c. 945083b2f wasn't enough to silence compiler warnings.

Alexander Korotkov pushed:

Michaël Paquier pushed:

  • Refactor logic to check for ASCII-only characters in string. The same logic was present for collation commands, SASLprep and pgcrypto, so this removes some code. Author: Michael Paquier Reviewed-by: Stephen Frost, Heikki Linnakangas Discussion:

  • Fix portability issues with parsing of recovery_target_xid. The parsing of this parameter has been using strtoul(), which is not portable across platforms. On most Unix platforms, unsigned long has a size of 64 bits, while on Windows it is 32 bits. It is common in recovery scenarios to rely on the output of txid_current() or even the newer pg_current_xact_id() to get a transaction ID for setting up recovery_target_xid. The value returned by those functions includes the epoch in the computed result, which would cause strtoul() to fail where unsigned long has a size of 32 bits once the epoch is incremented. WAL records and 2PC data include only information about 32-bit XIDs and it is not possible to have XIDs across more than one epoch, so discarding the high bits from the transaction ID set has no impact on recovery. On the contrary, the use of strtoul() prevents a consistent behavior across platforms depending on the size of unsigned long. This commit changes the parsing of recovery_target_xid to use pg_strtouint64() instead, available down to 9.6. There is one TAP test stressing recovery with recovery_target_xid, where a tweak based on pg_reset{xlog,wal} is added to bump the XID epoch so as this change gets tested, as per an idea from Alexander Lakhin. Reported-by: Alexander Lakhin Discussion: Backpatch-through: 9.6

  • Fix typos and grammar in docs and comments. This fixes several areas of the documentation and some comments in matters of style, grammar, or even format. Author: Justin Pryzby Discussion:

Tomáš Vondra pushed:

Bruce Momjian pushed:

Noah Misch pushed:

Jeff Davis pushed:

Pending Patches

Kyotaro HORIGUCHI sent in another revision of a patch to make the stats collector use shared memory instead of files for intermediate storage.

Kyotaro HORIGUCHI sent in another revision of a patch to fix NaN handling for geometric data types.

Masahiko Sawada sent in a patch to ensure that a schema created when an extension is created is also dropped when the extension is dropped.

Justin Pryzby sent in two revisions of a patch to add zstd compression to pg_dump.

Peter Smith sent in five more revisions of a patch to allow the table-sync worker to use multiple transactions.

Bharath Rupireddy and Michaël Paquier traded patches to fail fast in CTAS/CMV if the relation already exists.

Konstantin Knizhnik and Pavel Stěhule traded patches to add a login event that can be used in an event trigger.

Bharath Rupireddy sent in four more revisions of a patch to use parallel inserts in CTAS.

Zeng Wenjing sent in another revision of a patch to implement global temporary tables.

Amit Langote sent in another revision of a patch to create foreign key triggers in partitioned tables, too, and use same to enforce foreign keys correctly during cross-partition updates.

Amit Langote sent in another revision of a patch to initialize the result relation information lazily.

Thomas Munro sent in another revision of a patch to provide synchronous scatter/gather I/O routines, pg_preadv() and pg_pwritev(), which correspond to preadv() and pwritev() on systems that have those, and use vectored I/O to zero WAL segments.

Peter Geoghegan sent in a patch to instrument heap_compute_xid_horizon_for_tuples().

David Rowley sent in another revision of a patch to reduce contrib build special cases on Windows.

Yugo Nagata sent in another revision of a patch to implement incremental maintenance of materialized views.

Michaël Paquier and Justin Pryzby traded patches to allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly.

Masahiko Sawada sent in a patch to add a new IndexAM API controlling index vacuum strategies.

Fujii Masao sent in two more revisions of a patch to fix a recovery_conflict_lock deadlock.

Amit Kapila and Kirk Jamison traded patches to make dropping relation buffers more efficient using dlist.

Pavel Stěhule sent in two more revisions of a patch to implement schema variables.

Dmitry Dolgov sent in two more revisions of a patch to use the new subscripting infrastructure for JSONB.

Michaël Paquier sent in a patch to clean up some -I$(libpq_srcdir) in makefiles.

Craig Ringer sent in a patch to add a $node->gdb_backends() method to PostgresNode instances.

Andrey V. Lepikhov sent in another revision of a patch to speed up COPY FROM in the case tables with foreign partitions by implementing some new methods in the FDW API: BeginForeignCopy, EndForeignCopy, and ExecForeignCopy.

Justin Pryzby sent in another revision of a patch to make pg_ls_*() show directories and shared filesets.

Simon Riggs sent in a patch to add a boolean transaction_cleanup GUC, defaulting to off, which is intended to replace uses of DISCARD ALL in things like connection poolers.

Ajin Cherian sent in another revision of a patch to implement logical decoding of two-phase transactions.

Fujii Masao and Bharath Rupireddy traded patches to intended to fix a bug that manifested as connection leaks from the postgres_fdw when the associated user mapping is dropped by closing invalid connections either in pgfdw_inval_callback() when they are not in a transaction, or in pgfdw_xact_callback(), which gets called at the end of every act once registered, in the current session.

Li Japin sent in a patch to document the streaming replication protocol better.

Thomas Munro sent in another revision of a patch to implement prefetch for WALs.

Bharath Rupireddy sent in another revision of a patch to add new table AMs for multi- and single inserts.

Kyotaro HORIGUCHI sent in another revision of a patch to make it possible to change table persistence without a heap rewrite, and add a new command: ALTER TABLE ALL IN TABLESPACE SET LOGGED/UNLOGGED.

Kyotaro HORIGUCHI sent in a patch to pg_waldump that limits the number of lines shown at the start.

Masahiro Ikeda sent in another revision of a patch to add WAL write/fsync statistics to the pg_stat_wal view.

Nikita Glukhov sent in another revision of a patch to add SQL/JSON functions.

Nikita Glukhov sent in another revision of a patch to implement JSON_TABLE.

Dilip Kumar and Andrey Borodin traded patches to implement custom table compression methods.

Dmitry Dolgov sent in another revision of a patch to prevent jumbling of every element in ArrayExpr.

Andrey Borodin sent in another revision of a patch to reorganize the pglz compression code to make it more efficient. This gained a speedup of about 1.4x.

Julien Rouhaud sent in another revision of a patch to add a boolean column, toplevel, to pg_stat_statements.

Fabien COELHO sent in a patch to add table access method as an option to pgbench.

Justin Pryzby sent in another revision of a patch to add a pg_upgrade test for binary compatibility of core data types.

Alexander Korotkov sent in a patch to fix bugs in comparison functions for multirange_bsearch_match(), implement operators for checking if the range contains a multirange, improve the signatures of internal multirange functions by adding const qualifies where applicable and replacing the multirange typecache argument with range typecache argument, adding GiST indexes for multiranges, and adding support for multirange matching to the existing range GiST indexes.

Justin Pryzby sent in another revision of a patch to fix typos and do some wordsmithing for the latest documentation.