PostgreSQL Weekly News - January 10, 2021

Posted on 2021-01-11 by PWN

PostgreSQL Product News

pg_back 1.10, a backup script for PostgreSQL, released.

pgagroal 1.1.0, a high-performance protocol-native connection pool for PostgreSQL, released.

Veil2 0.9.2 beta, a database security add-on for Postgres that provides a framework for implementing Virtual Private Databases with row level security, released.

PostgreSQL Jobs for January

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

Amit Kapila pushed:

Michaël Paquier pushed:

Tom Lane pushed:

  • Add the ability for the core grammar to have more than one parse target. This patch essentially allows gram.y to implement a family of related syntax trees, rather than necessarily always parsing a list of SQL statements. raw_parser() gains a new argument, enum RawParseMode, to say what to do. As proof of concept, add a mode that just parses a TypeName without any other decoration, and use that to greatly simplify typeStringToTypeName(). In addition, invent a new SPI entry point SPI_prepare_extended() to allow SPI users (particularly plpgsql) to get at this new functionality. In hopes of making this the last variant of SPI_prepare(), set up its additional arguments as a struct rather than direct arguments, and promise that future additions to the struct can default to zero. SPI_prepare_cursor() and SPI_prepare_params() can perhaps go away at some point. Discussion:

  • Re-implement pl/pgsql's expression and assignment parsing. Invent new RawParseModes that allow the core grammar to handle pl/pgsql expressions and assignments directly, and thereby get rid of a lot of hackery in pl/pgsql's parser. This moves a good deal of knowledge about pl/pgsql into the core code: notably, we have to invent a CoercionContext that matches pl/pgsql's (rather dubious) historical behavior for assignment coercions. That's getting away from the original idea of pl/pgsql as an arm's-length extension of the core, but really we crossed that bridge a long time ago. The main advantage of doing this is that we can now use the core parser to generate FieldStore and/or SubscriptingRef nodes to handle assignments to pl/pgsql variables that are records or arrays. That fixes a number of cases that had never been implemented in pl/pgsql assignment, such as nested records and array slicing, and it allows pl/pgsql assignment to support the datatype-specific subscripting behaviors introduced in commit c7aba7c14. There are cosmetic benefits too: when a syntax error occurs in a pl/pgsql expression, the error report no longer includes the confusing "SELECT" keyword that used to get prefixed to the expression text. Also, there seem to be some small speed gains. Discussion:

  • Remove PLPGSQL_DTYPE_ARRAYELEM datum type within pl/pgsql. In the wake of the previous commit, we don't really need this anymore, since array assignment is primarily handled by the core code. The only way that that code could still be reached is that a GET DIAGNOSTICS target variable could be an array element. But that doesn't seem like a particularly essential feature. I'd added it in commit 55caaaeba, but just because it was easy not because anyone had actually asked for it. Hence, revert that patch and then remove the now-unreachable stuff. (If we really had to, we could probably reimplement GET DIAGNOSTICS using the new assignment machinery; but the cost/benefit ratio looks very poor, and it'd likely be a bit slower.) Note that PLPGSQL_DTYPE_RECFIELD remains. It's possible that we could get rid of that too, but maintaining the existing behaviors for RECORD-type variables seems like it might be difficult. Since there's not any functional limitation in those code paths as there was in the ARRAYELEM code, I've not pursued the idea. Discussion:

  • Rethink the "read/write parameter" mechanism in pl/pgsql. Performance issues with the preceding patch to re-implement array element assignment within pl/pgsql led me to realize that the read/write parameter mechanism is misdesigned. Instead of requiring the assignment source expression to be such that all its references to the target variable could be passed as R/W, we really want to identify one reference to the target variable to be passed as R/W, allowing any other ones to be passed read/only as they would be by default. As long as the R/W reference is a direct argument to the top-level (hence last to be executed) function in the expression, there is no harm in R/O references being passed to other lower parts of the expression. Nor is there any use-case for more than one argument of the top-level function being R/W. Hence, rewrite that logic to identify one single Param that references the target variable, and make only that Param pass a read/write reference, not any other Params referencing the target variable. Discussion:

  • Fix integer-overflow corner cases in substring() functions. If the substring start index and length overflow when added together, substring() misbehaved, either throwing a bogus "negative substring length" error on a case that should succeed, or failing to complain that a negative length is negative (and instead returning the whole string, in most cases). Unsurprisingly, the text, bytea, and bit variants of the function all had this issue. Rearrange the logic to ensure that negative lengths are always rejected, and add an overflow check to handle the other case. Also install similar guards into detoast_attr_slice() (nee heap_tuple_untoast_attr_slice()), since it's far from clear that no other code paths leading to that function could pass it values that would overflow. Patch by myself and Pavel Stehule, per bug #16804 from Rafi Shamim. Back-patch to v11. While these bugs are old, the common/int.h infrastructure for overflow-detecting arithmetic didn't exist before commit 4d6ad3125, and it doesn't seem like these misbehaviors are bad enough to justify developing a standalone fix for the older branches. Discussion:

  • Introduce a new GUC_REPORT setting "in_hot_standby". Aside from being queriable via SHOW, this value is sent to the client immediately at session startup, and again later on if the server gets promoted to primary during the session. The immediate report will be used in an upcoming patch to avoid an extra round trip when trying to connect to a primary server. Haribabu Kommi, Greg Nancarrow, Tom Lane; reviewed at various times by Laurenz Albe, Takayuki Tsunakawa, Peter Smith. Discussion:

  • Allow psql's \dt and \di to show TOAST tables and their indexes. Formerly, TOAST objects were unconditionally suppressed, but since \d is able to print them it's not very clear why these variants should not. Instead, use the same rules as for system catalogs: they can be seen if you write the 'S' modifier or a table name pattern. (In practice, since hardly anybody would keep pg_toast in their search_path, it's really down to whether you use a pattern that can match pg_toast.*.) No docs change seems necessary because the docs already say that this happens for "system objects"; we're just classifying TOAST tables as being that. Justin Pryzby, reviewed by Laurenz Albe Discussion:

  • Revert unstable test cases from commit 7d80441d2. I momentarily forgot that the "owner" column wouldn't be stable in the buildfarm. Oh well, these tests weren't very valuable anyway. Discussion:

  • Add a test module for the regular expression package. This module provides a function test_regex() that is functionally rather like regexp_matches(), but with additional debugging-oriented options and additional output. The debug options are somewhat obscure; they are chosen to match the API of the test harness that Henry Spencer wrote way-back-when for use in Tcl. With this, we can import all the test cases that Spencer wrote originally, even for regex functionality that we don't currently expose in Postgres. This seems necessary because we can no longer rely on Tcl to act as upstream and verify any fixes or improvements that we make. In addition to Spencer's tests, I added a few for lookbehind constraints (which we added in 2015, and Tcl still hasn't absorbed) that are modeled on his tests for lookahead constraints. After looking at code coverage reports, I also threw in a couple of tests to more fully exercise our "high colormap" logic. According to my testing, this brings the check-world coverage for src/backend/regex/ from 71.1% to 86.7% of lines. ( shows a slightly different number, which I think is because it measures a non-assert build.) Discussion:

  • Add idle_session_timeout. This GUC variable works much like idle_in_transaction_session_timeout, in that it kills sessions that have waited too long for a new client query. But it applies when we're not in a transaction, rather than when we are. Li Japin, reviewed by David Johnston and Hayato Kuroda, some fixes by me Discussion:

  • Improve timeout.c's handling of repeated timeout set/cancel. A very common usage pattern is that we set a timeout that we don't expect to reach, cancel it after a little bit, and later repeat. With the original implementation of timeout.c, this results in one setitimer() call per timeout set or cancel. We can do a lot better by being lazy about changing the timeout interrupt request, namely: (1) never cancel the outstanding interrupt, even when we have no active timeout events; (2) if we need to set an interrupt, but there already is one pending at or before the required time, leave it alone. When the interrupt happens, the signal handler will reschedule it at whatever time is then needed. For example, with a one-second setting for statement_timeout, this method results in having to interact with the kernel only a little more than once a second, no matter how many statements we execute in between. The mainline code might never call setitimer() at all after the first time, while each time the signal handler fires, it sees that the then-pending request is most of a second away, and that's when it sets the next interrupt request for. Each mainline timeout-set request after that will observe that the time it wants is past the pending interrupt request time, and do nothing. This also works pretty well for cases where a few different timeout lengths are in use, as long as none of them are very short. But that describes our usage well. Idea and original patch by Thomas Munro; I fixed a race condition and improved the comments. Discussion:

  • Improve commentary in timeout.c. On re-reading I realized that I'd missed one race condition in the new timeout code. It's safe, but add a comment explaining it. Discussion:

  • Fix bogus link in test comments. I apparently copied-and-pasted the wrong link in commit ca8217c10. Point it where it was meant to go.

  • Further second thoughts about idle_session_timeout patch. On reflection, the order of operations in PostgresMain() is wrong. These timeouts ought to be shut down before, not after, we do the post-command-read CHECK_FOR_INTERRUPTS, to guarantee that any timeout error will be detected there rather than at some ill-defined later point (possibly after having wasted a lot of work). This is really an error in the original idle_in_transaction_timeout patch, so back-patch to 9.6 where that was introduced.

  • Adjust createdb TAP tests to work on recent OpenBSD. We found last February that the error-case tests added by commit 008cf0409 failed on OpenBSD, because that platform doesn't really check locale names. At the time it seemed that that was only an issue for LC_CTYPE, but testing on a more recent version of OpenBSD shows that it's now equally lax about LC_COLLATE. Rather than dropping the LC_COLLATE test too, put back LC_CTYPE (reverting c4b0edb07), and adjust these tests to accept the different error message that we get if setlocale() doesn't reject a bogus locale name. The point of these tests is not really what the backend does with the locale name, but to show that createdb quotes funny locale names safely; so we're not losing test reliability this way. Back-patch as appropriate. Discussion:

  • Fix ancient bug in parsing of BRE-mode regular expressions. brenext(), when parsing a '*' quantifier, forgot to return any "value" for the token; per the equivalent case in next(), it should return value 1 to indicate that greedy rather than non-greedy behavior is wanted. The result is that the compiled regexp could behave like 'x*?' rather than the intended 'x*', if we were unlucky enough to have a zero in v->nextvalue at this point. That seems to happen with some reliability if we have '.*' at the beginning of a BRE-mode regexp, although that depends on the initial contents of a stack-allocated struct, so it's not guaranteed to fail. Found by Alexander Lakhin using valgrind testing. This bug seems to be aboriginal in Spencer's code, so back-patch all the way. Discussion:

  • Fix plpgsql tests for debug_invalidate_system_caches_always. Commit c9d529848 resulted in having a couple more places where the error context stack for a failure varies depending on debug_invalidate_system_caches_always (nee CLOBBER_CACHE_ALWAYS). This is not very surprising, since we have to re-parse cached plans if the plan cache is clobbered. Stabilize the expected test output by hiding the context stack in these places, as we've done elsewhere in this test script. (Another idea worth considering, now that we have debug_invalidate_system_caches_always, is to force it to zero for these test cases. That seems like it'd risk reducing the coverage of cache-clobber testing, which might or might not be worth being able to verify that we get the expected error output in normal cases. For the moment I just stuck with the existing technique.) In passing, update comments that referred to CLOBBER_CACHE_ALWAYS. Per buildfarm member hyrax.
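
To illustrate the substring() overflow fix listed above: the trouble arises when start + length no longer fits in a 32-bit signed integer. The following Python sketch emulates that arithmetic, since Python integers do not overflow on their own. The helper names add_s32_overflow and substring are hypothetical, only loosely modeled on the overflow-detecting helpers in common/int.h, and this is an illustration under those assumptions rather than the actual PostgreSQL code.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def add_s32_overflow(a, b):
    """Emulate a checked 32-bit signed add: return (overflowed, result)."""
    total = a + b
    if total < INT32_MIN or total > INT32_MAX:
        return True, None
    return False, total


def substring(s, start, length):
    """1-based substring with SQL-like semantics (hypothetical helper)."""
    # Reject negative lengths first, before doing any arithmetic.
    if length < 0:
        raise ValueError("negative substring length not allowed")
    first = max(start, 1)  # positions before 1 do not exist
    overflowed, end = add_s32_overflow(start, length)
    if overflowed:
        # start + length is past the int32 range, so the requested
        # span necessarily runs to the end of the string.
        return s[first - 1:]
    if end < 1:
        # The whole requested span lies before the string.
        return ""
    return s[first - 1:end - 1]


print(substring("string", 2, 3))          # tri
print(substring("string", 2, 2**31 - 2))  # tring
```

With the overflow check in place, a huge length simply means "run to the end of the string", and a negative length is always rejected before the addition is attempted.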
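
The lazy rescheduling idea in the timeout.c commit listed above can be sketched as a small simulation. This is not the timeout.c code: the class and method names are hypothetical, time is counted in integer milliseconds, and the kernel_calls counter stands in for setitimer() calls. The sketch assumes one statement per millisecond with a one-second timeout, each statement finishing well before its timeout.

```python
class EagerTimer:
    """Original behavior: one kernel call per timeout set or cancel."""

    def __init__(self):
        self.kernel_calls = 0

    def set_timeout(self, fire_at):
        self.kernel_calls += 1

    def deliver_due_interrupt(self, now):
        pass  # nothing ever fires in this simulation

    def cancel(self):
        self.kernel_calls += 1


class LazyTimer:
    """Lazy behavior: never cancel, and re-arm only when needed."""

    def __init__(self):
        self.kernel_calls = 0
        self.pending = None  # when the kernel interrupt will fire
        self.active = None   # when the current timeout wants to fire

    def _arm(self, fire_at):
        self.kernel_calls += 1
        self.pending = fire_at

    def set_timeout(self, fire_at):
        self.active = fire_at
        # Rule 2: leave an interrupt alone if it fires early enough.
        if self.pending is None or self.pending > fire_at:
            self._arm(fire_at)

    def cancel(self):
        # Rule 1: forget the timeout, but do not touch the interrupt.
        self.active = None

    def deliver_due_interrupt(self, now):
        if self.pending is not None and self.pending <= now:
            self.pending = None
            # The handler reschedules for whatever is needed now.
            if self.active is not None and self.active > now:
                self._arm(self.active)


def run(timer_cls, statements=10_000, timeout_ms=1000):
    timer = timer_cls()
    for now in range(statements):  # one statement per millisecond
        timer.set_timeout(now + timeout_ms)
        timer.deliver_due_interrupt(now)
        timer.cancel()
    return timer.kernel_calls


print(run(EagerTimer), run(LazyTimer))
```

Under these assumptions the eager variant makes two kernel calls per statement, while the lazy variant makes roughly one per second of simulated time, regardless of how many statements run in between.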
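
The user-visible effect of the BRE '*' bug listed above is the difference between a greedy and a non-greedy quantifier. The sketch below shows that difference using Python's re module, which is a different engine with different syntax from Spencer's BRE parser; it is meant only to illustrate why 'x*' silently behaving like 'x*?' changes results, not to reproduce the PostgreSQL bug.

```python
import re

text = "abab"

# Greedy: '.*' grabs as much as it can, so the match runs to the LAST 'b'.
greedy = re.match(r"(.*)b", text).group(1)

# Non-greedy: '.*?' grabs as little as it can, so the match stops at
# the FIRST 'b'. This is how the mis-compiled BRE could behave.
lazy = re.match(r"(.*?)b", text).group(1)

print(repr(greedy))  # 'aba'
print(repr(lazy))    # 'a'
```

Both patterns match the input, which is why a bug of this kind can go unnoticed for a long time: only the extent of the match differs.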

Thomas Munro pushed:

Peter Geoghegan pushed:

Peter Eisentraut pushed:

Dean Rasheed pushed:

Bruce Momjian pushed:

Fujii Masao pushed:

  • doc: Fix description about default behavior of recovery_target_timeline. The default value of recovery_target_timeline was changed in v12, but the description of its default behavior was not updated. Back-patch to v12, where the default behavior of recovery_target_timeline was changed. Author: Benoit Lobréau Reviewed-by: Fujii Masao Discussion:

  • Detect the deadlocks between backends and the startup process. Deadlocks involving a recovery conflict on a lock can happen between hot-standby backends and the startup process. If a backend takes an access exclusive lock on a table and that finally triggers the deadlock, the deadlock can be detected as expected. Previously, however, if the startup process took an access exclusive lock and that finally triggered the deadlock, the deadlock could not be detected and could remain even after deadlock_timeout passed. This is a bug. The cause was that the code for handling the recovery conflict on a lock didn't handle the deadlock case at all. It assumed that deadlocks involving the startup process and backends could be detected by the deadlock detector invoked within backends. But this assumption was incorrect: the startup process should also have invoked the deadlock detector if necessary. To fix this bug, this commit makes the startup process invoke the deadlock detector if deadlock_timeout is reached while handling the recovery conflict on a lock. Specifically, in that case, the startup process requests all the backends holding the conflicting locks to check themselves for deadlocks. Back-patch to v9.6. v9.5 also has this bug, but per discussion we decided not to back-patch the fix to v9.5, because v9.5 lacks some infrastructure code (e.g., 37c54863cf) that this bug fix depends on. We could apply that code as part of the back-patch, but since the next minor version release is the final one for v9.5, doing so is risky: if the back-patch unexpectedly introduced a new bug into v9.5, there would be no chance to fix it. We determined that back-patching to v9.5 would carry more risk than gain. Author: Fujii Masao Reviewed-by: Bertrand Drouvot, Masahiko Sawada, Kyotaro Horiguchi Discussion:

  • Add GUC to log long wait times on recovery conflicts. This commit adds the GUC log_recovery_conflict_waits, which controls whether a log message is produced when the startup process waits longer than deadlock_timeout for recovery conflicts. This is useful in determining whether recovery conflicts prevent the recovery from applying WAL. Note that currently a log message is produced only when the recovery conflict is still unresolved after deadlock_timeout passes, i.e., only when the startup process is still waiting for the recovery conflict at that point. Author: Bertrand Drouvot, Masahiko Sawada Reviewed-by: Alvaro Herrera, Kyotaro Horiguchi, Fujii Masao Discussion:
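
The kind of deadlock addressed by the startup-process commit above can be pictured as a cycle in a wait-for graph. The following Python sketch is a generic illustration only: the process names and the find_cycle helper are hypothetical, and PostgreSQL's real deadlock detector works on its lock tables and is considerably more involved.

```python
def find_cycle(waits_for, start):
    """Return a wait-for cycle through `start` as a list, or None.

    `waits_for` maps each process to the processes holding locks
    that it is waiting on.
    """
    def visit(node, path):
        for blocker in waits_for.get(node, ()):
            if blocker == start:
                return path + [start]  # came back around: deadlock
            if blocker not in path:    # avoid looping on other cycles
                found = visit(blocker, path + [blocker])
                if found:
                    return found
        return None

    return visit(start, [start])


# The startup process waits on a lock held by a backend, which in turn
# waits on a lock held by the startup process.
deadlocked = {"startup": ["backend_1"], "backend_1": ["startup"]}

# Here the backend is not waiting on anything, so there is only an
# ordinary recovery conflict wait, not a deadlock.
waiting = {"startup": ["backend_1"], "backend_1": []}

print(find_cycle(deadlocked, "startup"))
print(find_cycle(waiting, "startup"))
```

The point of the commit is who runs such a check: before the fix, only backends did, so a cycle that formed when the startup process started waiting was not noticed.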

Tomáš Vondra pushed:

Pending Patches

Atsushi Torikoshi sent in another revision of a patch to implement pg_get_target_backend_memory_contexts() and make it possible to collect memory contexts of the specified process.

Atsushi Torikoshi sent in another revision of a patch to add a wait_start column to the pg_locks view.

Mark Zhao sent in a patch intended to fix a bug that manifested as logical replication on partitioned tables being very slow and consuming a lot of CPU by adding a missing RelationClose after RelationIdGetRelation in pgoutput.c.

Önder Kalacı sent in another revision of a patch to implement row filtering for logical replication.

Justin Pryzby sent in a patch to allow errors in parameter values to be reported during the BIND phase.

Pavel Stěhule sent in another revision of a patch to make it possible to write window functions in PLs, along with an implementation of same in PL/pgsql.

Bharath Rupireddy sent in three more revisions of a patch to make it possible to use parallel inserts in CTAS.

Kyotaro HORIGUCHI sent in four more revisions of a patch intended to fix a bug that manifested as failure of a standby to follow a timeline switch by ensuring that the Walsender tracks timeline switches while sending a historic timeline.

Peter Smith sent in four more revisions of a patch to make it possible to use multiple tablesync workers.

Dilip Kumar sent in another revision of a patch to add options for custom table compression methods.

Dmitry Dolgov sent in three more revisions of a patch to use the generic subscripting infrastructure for JSONB operations.

Justin Pryzby sent in another revision of a patch to support multiple compression methods and options for same in pg_dump.

Masahiko Sawada sent in a patch to introduce an IndexAM API for choosing index vacuum strategy, use same to choose index vacuum strategy, and skip btree bulkdelete if the index doesn't grow.

Thomas Munro sent in another revision of a patch to reduce the WaitEventSet syscall churn.

Pavel Stěhule sent in a patch to add an option to use a shorthand for argument and local variable references in PL/pgsql.

Dmitry Dolgov sent in another revision of a patch to prevent jumbling of every element in ArrayExpr in order to keep pg_stat_statements from producing different entries for what are essentially similar queries.

Tom Lane sent in a PoC patch to deal with how macOS's SIP infrastructure works for dynamic libraries.

Amit Kapila sent in a patch to track replication origin progress for rollbacks for some cases the patch for tracking 2PC in logical replication missed.

Paul Martinez sent in a patch to add partial foreign key updates in referential integrity triggers.

Bruce Momjian sent in two more revisions of a patch to consolidate more of the hex functions in /common.

Shinya Kato, Masahiko Sawada, and Fujii Masao traded patches to fill out the implementation of CLOSE, FETCH, and MOVE tab completion in psql.

Daniel Gustafsson sent in two more revisions of a patch to support enabling and disabling checksums on running clusters.

Tsutomu Yamada and Tomáš Vondra traded patches to add a family of commands starting with \dX to psql, which deal with extended statistics.

Bharath Rupireddy sent in two more revisions of a patch to add a postgres_fdw function to discard cached connections, add a postgres_fdw.keep_connections GUC to control whether connections are cached, and add a similar server-level keep_connection GUC.

Ryo Matsumura sent in a patch atop the libpq tracing patch to fix some oversights in same.

Kyotaro HORIGUCHI sent in another revision of a patch intended to fix a bug that manifested as corruption during WAL replay by delaying checkpoint completion until after truncation succeeds.

Greg Sabino Mullane sent in another revision of a patch to enable psql's \df to choose functions by input type.

Movead Li sent in another revision of a patch to fix the waldump size for WAL switch.

Kirk Jamison sent in another revision of a patch to make dropping relation buffers more efficient with dlist.

Michaël Paquier sent in another revision of a patch to add SHA1 to the cryptohash infrastructure.

Julien Rouhaud sent in another revision of a patch to move pg_stat_statements query jumbling to core, expose queryid in pg_stat_activity and log_line_prefix, and expose query identifier in verbose explain.

Laurenz Albe sent in two more revisions of a patch to add session statistics to pg_stat_database.

Zeng Wenjing sent in a PoC patch to implement global indexes.

Bharath Rupireddy sent in three more revisions of a patch to implement EXPLAIN [ANALYZE] for REFRESH MATERIALIZED VIEW.

Masahiko Sawada sent in a patch intended to fix a bug that manifested as logical replication worker accesses catalogs in error context callback by storing both the local and the remote type names in SlotErrCallbackArg so that it's possible just to set the names in the error callback without a system cache lookup.

Vigneshwaran C sent in a patch to add schema level support for PUBLICATIONs.

Mark Dilger sent in two more revisions of a patch to add a new pg_amcheck contrib module, which is a command line interface for running amcheck's verifications against tables and indexes.

Thomas Munro sent in a patch to add FreeBSD to the list of platforms that have fdatasync.

Kyotaro HORIGUCHI sent in another revision of a patch to make the stats collector more efficient by replacing the files it used for temporary storage with shared memory.

Michaël Paquier sent in another revision of a patch to refactor HMAC implementations to reduce duplication.

Pavel Stěhule sent in another revision of a patch to reduce the overhead of execution of the CALL statement in non-atomic mode from PL/pgSQL.

Kyotaro HORIGUCHI sent in another revision of a patch to make ALTER TABLE SET [UN]LOGGED avoid a heap rewrite, change SET LOGGED when wal_level > minimal so it emits WAL using XLOG_FPI instead of a massive number of HEAP_INSERTs, and allow for the cleanup of files left behind in the crash of the transaction that created them.

Pavel Stěhule sent in a patch to add a way to return the text value of variable content to the PL/pgsql debugging API.

Pavel Stěhule sent in a patch to make it possible to use a special pager for psql's \watch command.

Tomáš Vondra sent in another revision of a patch to make it possible to create extended statistics on expressions.

Simon Riggs sent in four more revisions of a patch to implement system-versioned tables.

Peter Eisentraut sent in another revision of a patch to pageinspect which changes the type of block number arguments to bigint in order to avert overflow.

Bruce Momjian sent in four more revisions of a patch to add tests for key management.

Álvaro Herrera and Tomáš Vondra traded patches to implement MERGE.

Pavel Stěhule and Erik Rijkers traded patches to implement schema variables.

Álvaro Herrera and Justin Pryzby traded patches to implement ALTER TABLE ... DETACH PARTITION CONCURRENTLY.

Noah Misch sent in a patch to fix pg_dump for GRANT OPTION among initial privileges.

Krasiyan Andreev sent in another revision of a patch to implement NULL treatment for window functions.

Michael Banck sent in a patch to fix an issue where psql's \watch is not working correctly in the case where the query in question doesn't return rows.

Thomas Munro sent in a patch to use pg_pwrite() in pg_test_fsync to maintain consistency with what PostgreSQL now does.

Justin Pryzby sent in another revision of a patch to fix some documentation and comments in the patch that implements pluggable compression in libpq.

Noah Misch sent in another revision of a patch intended to fix a bug that manifested as spurious "apparent wraparound" via SimpleLruTruncate() rounding.

Shenhao Wang sent in a patch intended to fix a bug that manifested as invalid data in the backup_label file on Windows by setting text mode when reading backup_label and tablespace_map.

Tatsuo Ishii sent in a patch to fix a missing acronym label in the documentation.

Tomáš Vondra sent in another revision of a patch to set PD_ALL_VISIBLE and visibility map bits in COPY FREEZE, making good the lack of page-level flag updates.

Tom Lane sent in a patch intended to fix a bug that manifested as multiple hosts in connection string failed to failover in non-hot standby mode.