PostgreSQL Weekly News - January 3, 2021

Posted on 2021-01-03 by PWN

PostgreSQL Weekly News - January 3, 2021

Happy New Year from the PostgreSQL Weekly News!

PostgreSQL Product News

Database Lab 2.1, a tool for fast cloning of large PostgreSQL databases to build non-production environments, released:

PostgreSQL Jobs for January

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

Jeff Davis pushed:

Bruce Momjian pushed:

Fujii Masao pushed:

  • postgres_fdw: Fix connection leak. In postgres_fdw, the cached connections to foreign servers will not be closed until the local session exits if the user mappings or foreign servers that those connections depend on are dropped. Those connections can be leaked. To fix that connection leak issue, after a change to a pg_foreign_server or pg_user_mapping catalog entry, this commit makes postgres_fdw close the connections depending on that entry immediately if current transaction has not used those connections yet. Otherwise, mark those connections as invalid and then close them at the end of current transaction, since they cannot be closed in the midst of the transaction using them. Closed connections will be remade at the next opportunity if necessary. Back-patch to all supported branches. Author: Bharath Rupireddy Reviewed-by: Zhihong Yu, Zhijie Hou, Fujii Masao Discussion:

Michaël Paquier pushed:

Tom Lane pushed:

  • Fix thinko in plpgsql memory leak fix. Commit a6b1f5365 intended to place the transient "target" list of a CALL statement in the function's statement-lifespan context, but I fat-fingered that and used get_eval_mcontext() instead of get_stmt_mcontext(). The eval_mcontext belongs to the "simple expression" infrastructure, which is destroyed at transaction end. The net effect is that a CALL in a procedure to another procedure that has OUT or INOUT parameters would fail if the called procedure did a COMMIT. Per report from Peter Eisentraut. Back-patch to v11, like the prior patch. Discussion:

  • Further fix thinko in plpgsql memory leak fix. There's a second call of get_eval_mcontext() that should also be get_stmt_mcontext(). This is actually dead code, since no interesting allocations happen before switching back to the original context, but we should keep it in sync with the other call to forestall possible future bugs. Discussion:

  • Expose the default for channel_binding in PQconndefaults(). If there's a static default value for a connection option, it should be shown in the PQconninfoOptions array. Daniele Varrazzo Discussion:

  • Fix bugs in libpq's GSSAPI encryption support. The critical issue fixed here is that if a GSSAPI-encrypted connection is successfully made, pqsecure_open_gss() cleared conn->allow_ssl_try, as an admittedly-hacky way of preventing us from then trying to tunnel SSL encryption over the already-encrypted connection. The problem with that is that if we abandon the GSSAPI connection because of a failure during authentication, we would not attempt SSL encryption in the next try with the same server. This can lead to unexpected connection failure, or silently getting a non-encrypted connection where an encrypted one is expected. Fortunately, we'd only manage to make a GSSAPI-encrypted connection if both client and server hold valid tickets in the same Kerberos infrastructure, which is a relatively uncommon environment. Nonetheless this is a very nasty bug with potential security consequences. To fix, don't reset the flag, instead adding a check for conn->gssenc being already true when deciding whether to try to initiate SSL. While here, fix some lesser issues in libpq's GSSAPI code: * Use the need_new_connection stanza when dropping an attempted GSSAPI connection, instead of partially duplicating that code. The consequences of this are pretty minor: AFAICS it could only lead to auth_req_received or password_needed remaining set when they shouldn't, which is not too harmful. * Fix pg_GSS_error() to not repeat the "mprefix" it's given multiple times, and to notice any failure return from gss_display_status(). * Avoid gratuitous dependency on NI_MAXHOST in pg_GSS_load_servicename(). Per report from Mikael Gustavsson. Back-patch to v12 where this code was introduced. Discussion:

  • Fix assorted issues in backend's GSSAPI encryption support. Unrecoverable errors detected by GSSAPI encryption can't just be reported with elog(ERROR) or elog(FATAL), because attempting to send the error report to the client is likely to lead to infinite recursion or loss of protocol sync. Instead make this code do what the SSL encryption code has long done, which is to just report any such failure to the server log (with elevel COMMERROR), then pretend we've lost the connection by returning errno = ECONNRESET. Along the way, fix confusion about whether message translation is done by pg_GSS_error() or its callers (the latter should do it), and make the backend version of that function work more like the frontend version. Avoid allocating the port->gss struct until it's needed; we surely don't need to allocate it in the postmaster. Improve logging of "connection authorized" messages with GSS enabled. (As part of this, I back-patched the code changes from dc11f31a1.) Make BackendStatusShmemSize() account for the GSS-related space that will be allocated by CreateSharedBackendStatus(). This omission could possibly cause out-of-shared-memory problems with very high max_connections settings. Remove arbitrary, pointless restriction that only GSS authentication can be used on a GSS-encrypted connection. Improve documentation; notably, document the fact that libpq now prefers GSS encryption over SSL encryption if both are possible. Per report from Mikael Gustavsson. Back-patch to v12 where this code was introduced. Discussion:

  • Improve log messages related to pg_hba.conf not matching a connection. Include details on whether GSS encryption has been activated; since we added "hostgssenc" type HBA entries, that's relevant info. Kyotaro Horiguchi and Tom Lane. Back-patch to v12 where GSS encryption was introduced. Discussion:

  • Suppress log spam from multiple reports of SIGQUIT shutdown. When the postmaster sends SIGQUIT to its children, there's no real need for all the children to log that fact; the postmaster already made a log entry about it, so adding perhaps dozens or hundreds of child-process log entries adds nothing of value. So, let's introduce a new ereport level to specify "WARNING, but never send to log" and use that for these messages. Such a change wouldn't have been desirable before commit 7e784d1dc, because if someone manually SIGQUIT's a backend, we do want to log that. But now we can tell the difference between a signal that was issued by the postmaster and one that was not with reasonable certainty. While we're here, also clear error_context_stack before ereport'ing, to prevent error callbacks from being invoked in the signal-handler context. This should reduce the odds of getting hung up while trying to notify the client. Per a suggestion from Andres Freund. Discussion:

  • Doc: fix up PDF build warnings from over-width table columns. Addition of multirange info to tables 8.27 and 65.1 made them start throwing "exceed the available area" warnings in PDF docs builds. For 8.27, twiddling the existing column width hints was enough to fix this. For 65.1, I twiddled the widths a little, but to really fix it I had to insert a space after each comma in the table, to allow a line break to occur there. (This seemed easier to read and maintain than the alternative of inserting &zws;; entities.) Per buildfarm.

  • Fix up usage of krb_server_keyfile GUC parameter. secure_open_gssapi() installed the krb_server_keyfile setting as KRB5_KTNAME unconditionally, so long as it's not empty. However, pg_GSS_recvauth() only installed it if KRB5_KTNAME wasn't set already, leading to a troubling inconsistency: in theory, clients could see different sets of server principal names depending on whether they use GSSAPI encryption. Always using krb_server_keyfile seems like the right thing, so make both places do that. Also fix up secure_open_gssapi()'s lack of a check for setenv() failure --- it's unlikely, surely, but security-critical actions are no place to be sloppy. Also improve the associated documentation. This patch does nothing about secure_open_gssapi()'s use of setenv(), and indeed causes pg_GSS_recvauth() to use it too. That's nominally against project portability rules, but since this code is only built with --with-gssapi, I do not feel a need to do something about this in the back branches. A fix will be forthcoming for HEAD though. Back-patch to v12 where GSSAPI encryption was introduced. The dubious behavior in pg_GSS_recvauth() goes back further, but it didn't have anything to be inconsistent with, so let it be. Discussion:

  • Use setenv() in preference to putenv(). Since at least 2001 we've used putenv() and avoided setenv(), on the grounds that the latter was unportable and not in POSIX. However, POSIX added it that same year, and by now the situation has reversed: setenv() is probably more portable than putenv(), since POSIX now treats the latter as not being a core function. And setenv() has cleaner semantics too. So, let's reverse that old policy. This commit adds a simple src/port/ implementation of setenv() for any stragglers (we have one in the buildfarm, but I'd not be surprised if that code is never used in the field). More importantly, extend win32env.c to also support setenv(). Then, replace usages of putenv() with setenv(), and get rid of some ad-hoc implementations of setenv() wannabees. Also, adjust our src/port/ implementation of unsetenv() to follow the POSIX spec that it returns an error indicator, rather than returning void as per the ancient BSD convention. I don't feel a need to make all the call sites check for errors, but the portability stub ought to match real-world practice. Discussion:

  • More fixups for pg_upgrade cross-version tests. Commit 7ca37fb04 removed regress_putenv from the library, so reloading a SQL function dependent on that would not work. Fix similarly to 52202bb39. Per buildfarm.

  • Doc: spell out comparison behaviors for the date/time types. The behavior of cross-type comparisons among date/time data types was not really explained anywhere. You could probably infer it if you recognized the applicability of comments elsewhere about datatype conversions, but it seems worthy of explicit documentation. Per bug #16797 from Dana Burd. Discussion:

  • Doc: improve explanation of EXTRACT(EPOCH) for timestamp without tz. Try to be clearer about what computation is actually happening here. Per bug #16797 from Dana Burd. Discussion:

Alexander Korotkov pushed:

Noah Misch pushed:

Amit Kapila pushed:

  • Extend the output plugin API to allow decoding of prepared xacts. This adds six methods to the output plugin API, adding support for streaming changes of two-phase transactions at prepare time. * begin_prepare * filter_prepare * prepare * commit_prepared * rollback_prepared * stream_prepare Most of this is a simple extension of the existing methods, with the semantic difference that the transaction is not yet committed and maybe aborted later. Until now two-phase transactions were translated into regular transactions on the subscriber, and the GID was not forwarded to it. None of the two-phase commands were communicated to the subscriber. This patch provides the infrastructure for logical decoding plugins to be informed of two-phase commands Like PREPARE TRANSACTION, COMMIT PREPARED and ROLLBACK PREPARED commands with the corresponding GID. This also extends the 'test_decoding' plugin, implementing these new methods. This commit simply adds these new APIs and the upcoming patch to "allow the decoding at prepare time in ReorderBuffer" will use these APIs. Author: Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich Reviewed-by: Amit Kapila, Peter Smith, Sawada Masahiko, and Dilip Kumar Discussion:

Peter Geoghegan pushed:

  • Fix index deletion latestRemovedXid bug. The logic for determining the latest removed XID for the purposes of generating recovery conflicts in REDO routines was subtly broken. It failed to follow links from HOT chains, and so failed to consider all relevant heap tuple headers in some cases. To fix, expand the loop that deals with LP_REDIRECT line pointers to also deal with HOT chains. The new version of the loop is loosely based on a similar loop from heap_prune_chain(). The impact of this bug is probably quite limited, since the horizon code necessarily deals with heap tuples that are pointed to by LP_DEAD-set index tuples. The process of setting LP_DEAD index tuples (e.g. within the kill_prior_tuple mechanism) is highly correlated with opportunistic pruning of pointed-to heap tuples. Plus the question of generating a recovery conflict usually comes up some time after index tuple LP_DEAD bits were initially set, unlike heap pruning, where a latestRemovedXid is generated at the point of the pruning operation (heap pruning has no deferred "would-be page split" style processing that produces conflicts lazily). Only backpatch to Postgres 12, the first version where this logic runs during original execution (following commit 558a9165e08). The index latestRemovedXid mechanism has had the same bug since it first appeared over 10 years ago (in commit a760893d), but backpatching to all supported versions now seems like a bad idea on balance. Running the new improved code during recovery seems risky, especially given the lack of complaints from the field. Author: Peter Geoghegan Discussion: Backpatch: 12-

  • Get heap page max offset with buffer lock held. On further reflection it seems better to call PageGetMaxOffsetNumber() after acquiring a buffer lock on the page. This shouldn't really matter, but doing it this way is cleaner. Follow-up to commit 42288174. Backpatch: 12-, just like commit 42288174

Pending Patches

Noah Misch sent in a patch to merge similar algorithms into roles_is_member_of().

Vigneshwaran C sent in a patch atop the existing patch to parallelize parts of COPY which delegates the task of finding line boundaries to workers.

Bharath Rupireddy sent in a patch to implement EXPLAIN [ANALYZE] for REFRESH MATERIALIZED VIEW.

Masahiko Sawada sent in another revision of a patch to add new FDW APIs to support 2PC, introduce a global transaction manager, and implement those FDW APIs in the PostgreSQL FDW.

Peter Geoghegan sent in another revision of a patch to make the btvacuumstrategy() bottom-up index deletion changes.

Bharath Rupireddy sent in two more revisions of a patch to make it possible for CTAS to use parallel inserts.

Luc Vlaming sent in two revisions of a patch to generate JIT IR code lazily, which issue came up in the case where a lot of JIT IR code was being generated unecessarily for the case of partitioned tables where the IR for many of the partitions was never executed because those partitions got pruned.

Thomas Munro sent in another revision of a patch to make it possible to use parallel hashing to execute Full and Right JOINs.

Andrey Borodin sent in another revision of a patch to add LZ4 as a possible compression scheme for WALL FPIs.

David Rowley sent in two more revisions of a patch to reduce the number of special cases to build contrib modules on Windows.

Noah Misch sent in a patch to dump public schema ownership and security labels.

Paul Martinez sent in a patch to simplify the permission checking logic in user.c.

Bharath Rupireddy sent in another revision of a patch to allow parallel mode in the planning of REFRESH MATERIALIZED VIEW.

Thomas Munro sent in another revision of a patch to track relation sizes in shared memory, this capability being controlled by a new GUC, smgr_shared_relation, which limits the number of the newly introduced pool of SMgrSharedRelation objects, and provide a lock-free fast path for smgrnblocks().

Peter Smith sent in another revision of a patch to allow the table-sync worker to use multiple transactions.

Andrey V. Lepikhov sent in another revision of a patch to speed up COPY for the case where a table has foreign partitions by adding three new routines to the FDW API: BeginForeignCopy, EndForeignCopy, and ExecForeignCopy, and adding same to the PostgreSQL FDW.

Andrey Borodin sent in another revision of a patch to reorganize pglz compression code to make it more efficient by converting macro functions to regular functions for readability, Using a more compact hash table with uint16 indexes instead of pointers, avoiding prev pointer in hash tables, and using 4-byte comparisons in a search instead of 1-byte comparisons.

Justin Pryzby sent in another revision of a patch to implement CREATE TABLE (LIKE .. INCLUDING ACCESS METHOD).

Luc Vlaming sent in another revision of a patch to allow partial UNION ALL, improving parallel subquery costing thereby.

Rui Zhao sent in another revision of a patch to refactor the way is called RelationClose after RelationIdGetRelation.

Peter Eisentraut sent in another revision of a patch to implement dynamic result sets from procedures.

David Fetter sent in two revisions of a patch to surface popcount to SQL.

Joe Wildish sent in another revision of a patch to allow queries in WHEN expression of FOR EACH STATEMENT triggers.

David Fetter sent in another revision of a patch to make it possible to set pg_hba.conf parameters from initdb.

Greg Sabino Mullane sent in another revision of a patch to get psql's \df to help choose functions by input type.

David Fetter and Krasiyan Andreev traded patches to implement NULL treatement for window functions.

Dmitry Dolgov sent in three more revisions of a patch to use the generic type subscripting infrastructure for JSONB.

David Fetter sent in a WIP patch to document the hooks system.

Michael Banck sent in a patch to add a new PGC_ADMINSET guc context and pg_change_role_settings default role, creating a space between 'superuser' and 'user' for GUC contexts.

Álvaro Herrera sent in another revision of a patch to implement MERGE.

Peter Geoghegan sent in another revision of a patch to pass down a "logically unchanged index" hint, and use same to implement bottom-up index deletion.

Soumyadeep Chakraborty sent in another revision of a patch to add a routine to the table AM API that accepts column projection lists.

Bruce Momjian and Michaël Paquier traded patches to move the other hex functions to a common place in the source tree.

Etsuro Fujita sent in two more revisions of a patch to implement asynchronous append on postgres_fdw nodes.

David Fetter sent in another revision of a patch to implement range scans for TIDs.

Pavel Stěhule sent in another revision of a patch to reduce the execution overhead of CALL statements in no atomic mode from PL/pgSQL.

Pavel Stěhule sent in another revision of a patch to implement schema variables.

Josef Šimánek sent in another revision of a patch to add a pg_stat_progress_copy view with COPY progress report.

Pavel Stěhule sent in a patch to make it possible to write window functions in PLs, as they are currently limited to C.

Luc Vlaming sent in a patch to make bulk loading more efficient for heap tables by allocating to each backend a standalone set of blocks that are local to that backend, and reduces the time spent on locking the partition buffers by shifting around the logic to make each set of 128 blocks use the same buffer partition, and then adding a custom function to get buffer blocks specifically for extension, whilst keeping a previous partition lock, thereby reducing the amount of time we spent on futexes.

Michael Banck sent in a patch to move --data-checksums to the common options in initdb's --help output.

Thomas Munro sent in a patch to pgbench to add pthread barrier emulation for platforms that lack it.