PostgreSQL Weekly News - April 18, 2021

Posted on 2021-04-19 by PWN

PostgreSQL Weekly News - April 18, 2021

Person of the week:

PostgreSQL Product News

pgmetrics 1.11, a command-line tool for PostgreSQL metrics, released.

hypopg 1.2.0, an extension which implements hypothetical indexes, released.

PostgreSQL Jobs for April

PostgreSQL in the News

Planet PostgreSQL:

PostgreSQL Weekly News is brought to you this week by David Fetter

Submit news and announcements by Sunday at 3:00pm PST8PDT to

Applied Patches

Tom Lane pushed:

  • Silence some Coverity warnings and improve code consistency. Coverity complained about possible overflow in expressions like intresult = tm->tm_sec * 1000000 + fsec; on the grounds that the multiplication would happen in 32-bit arithmetic before widening to the int64 result. I think these are all false positives because of the limited possible range of tm_sec; but nonetheless it seems silly to spell it like that when nearby lines have the identical computation written with a 64-bit constant. ... or more accurately, with an LL constant, which is not project style. Make all of these use INT64CONST(), as we do elsewhere. This is all new code from a2da77cdb, so no need for back-patch.

  • Fix old bug with coercing the result of a COLLATE expression. There are hacks in parse_coerce.c to push down a requested coercion to below any CollateExpr that may appear. However, we did that even if the requested data type is non-collatable, leading to an invalid expression tree in which CollateExpr is applied to a non-collatable type. The fix is just to drop the CollateExpr altogether, reasoning that it's useless. This bug is ten years old, dating to the original addition of COLLATE support. The lack of field complaints suggests that there aren't a lot of user-visible consequences. We noticed the problem because it would trigger an assertion in DefineVirtualRelation if the invalid structure appears as an output column of a view; however, in a non-assert build, you don't see a crash just a (subtly incorrect) complaint about applying collation to a non-collatable type. I found that by putting the incorrect structure further down in a view, I could make a view definition that would fail dump/reload, per the added regression test case. But CollateExpr doesn't do anything at run-time, so this likely doesn't lead to any really exciting consequences. Per report from Yulin Pei. Back-patch to all supported branches. Discussion:

  • Remove no-longer-relevant test case. was exercising the recording of a collation dependency for an enum comparison expression, but such an expression should never have had any collation dependency in the first place. After I fixed that in commit c402b02b9, the test started failing. We don't need to test that scenario anymore, so just remove the now-useless test steps. (This test case is new in HEAD, so no need to back-patch.) Discussion: Discussion:

  • Avoid improbable PANIC during heap_update. heap_update needs to clear any existing "all visible" flag on the old tuple's page (and on the new page too, if different). Per coding rules, to do this it must acquire pin on the appropriate visibility-map page while not holding exclusive buffer lock; which creates a race condition since someone else could set the flag whenever we're not holding the buffer lock. The code is supposed to handle that by re-checking the flag after acquiring buffer lock and retrying if it became set. However, one code path through heap_update itself, as well as one in its subroutine RelationGetBufferForTuple, failed to do this. The end result, in the unlikely event that a concurrent VACUUM did set the flag while we're transiently not holding lock, is a non-recurring "PANIC: wrong buffer passed to visibilitymap_clear" failure. This has been seen a few times in the buildfarm since recent VACUUM changes that added code paths that could set the all-visible flag while holding only exclusive buffer lock. Previously, the flag was (usually?) set only after doing LockBufferForCleanup, which would insist on buffer pin count zero, thus preventing the flag from becoming set partway through heap_update. However, it's clear that it's heap_update not VACUUM that's at fault here. What's less clear is whether there is any hazard from these bugs in released branches. heap_update is certainly violating API expectations, but if there is no code path that can set all-visible without a cleanup lock then it's only a latent bug. That's not 100% certain though, besides which we should worry about extensions or future back-patch fixes that could introduce such code paths. I chose to back-patch to v12. Fixing RelationGetBufferForTuple before that would require also back-patching portions of older fixes (notably 0d1fe9f74), which is more code churn than seems prudent to fix a hypothetical issue. Discussion:

  • Redesign the caching done by get_cached_rowtype(). Previously, get_cached_rowtype() cached a pointer to a reference-counted tuple descriptor from the typcache, relying on the ExprContextCallback mechanism to release the tupdesc refcount when the expression tree using the tupdesc was destroyed. This worked fine when it was designed, but the introduction of within-DO-block COMMITs broke it. The refcount is logged in a transaction-lifespan resource owner, but plpgsql won't destroy simple expressions made within the DO block (before its first commit) until the DO block is exited. That results in a warning about a leaked tupdesc refcount when the COMMIT destroys the original resource owner, and then an error about the active resource owner not holding a matching refcount when the expression is destroyed. To fix, get rid of the need to have a shutdown callback at all, by instead caching a pointer to the relevant typcache entry. Those survive for the life of the backend, so we needn't worry about the pointer becoming stale. (For registered RECORD types, we can still cache a pointer to the tupdesc, knowing that it won't change for the life of the backend.) This mechanism has been in use in plpgsql and expandedrecord.c since commit 4b93f5799, and seems to work well. This change requires modifying the ExprEvalStep structs used by the relevant expression step types, which is slightly worrisome for back-patching. However, there seems no good reason for extensions to be familiar with the details of these particular sub-structs. Per report from Rohit Bhogate. Back-patch to v11 where within-DO-block COMMITs became a thing. Discussion:

  • Fix some inappropriately-disallowed uses of ALTER ROLE/DATABASE SET. Most GUC check hooks that inspect database state have special checks that prevent them from throwing hard errors for state-dependent issues when source == PGC_S_TEST. This allows, for example, "ALTER DATABASE d SET default_text_search_config = foo" when the "foo" configuration hasn't been created yet. Without this, we have problems during dump/reload or pg_upgrade, because pg_dump has no idea about possible dependencies of GUC values and can't ensure a safe restore ordering. However, check_role() and check_session_authorization() hadn't gotten the memo about that, and would throw hard errors anyway. It's not entirely clear what is the use-case for "ALTER ROLE x SET role = y", but we've now heard two independent complaints about that bollixing an upgrade, so apparently some people are doing it. Hence, fix these two functions to act more like other check hooks with similar needs. (But I did not change their insistence on being inside a transaction, as it's still not apparent that setting either GUC from the configuration file would be wise.) Also fix check_temp_buffers, which had a different form of the disease of making state-dependent checks without any exception for PGC_S_TEST. A cursory survey of other GUC check hooks did not find any more issues of this ilk. (There are a lot of interdependencies among PGC_POSTMASTER and PGC_SIGHUP GUCs, which may be a bad idea, but they're not relevant to the immediate concern because they can't be set via ALTER ROLE/DATABASE.) Per reports from Charlie Hornsby and Nathan Bossart. Back-patch to all supported branches. Discussion: Discussion:

  • Allow table-qualified variable names in ON CONFLICT ... WHERE. Previously you could only use unqualified variable names here. While that's not a functional deficiency, since only the target table can be referenced, it's a surprising inconsistency with the rules for partial-index predicates, on which this syntax is supposedly modeled. The fix for that is no harder than passing addToRelNameSpace = true to addNSItemToQuery. However, it's really pretty bogus for transformOnConflictArbiter and transformOnConflictClause to be messing with the namespace item for the target table at all. It's not theirs to manage, it results in duplicative creations of namespace items, and transformOnConflictClause wasn't even doing it quite correctly (that coding resulted in two nsitems for the target table, since it hadn't cleaned out the existing one). Hence, make transformInsertStmt responsible for setting up the target nsitem once for both these clauses and RETURNING. Also, arrange for ON CONFLICT ... UPDATE's "excluded" pseudo-relation to be added to the rangetable before we run transformOnConflictArbiter. This produces a more helpful HINT if someone writes "excluded.col" in the arbiter expression. Per bug #16958 from Lukas Eder. Although I agree this is a bug, the consequences are hardly severe, so no back-patch. Discussion:

  • Fix obsolete comments referencing JoinPathExtraData.extra_lateral_rels. That field went away in commit edca44b15, but it seems that commit 45be99f8c re-introduced some comments mentioning it. Noted by James Coleman, though this isn't exactly his proposed new wording. Also thanks to Justin Pryzby for software archaeology. Discussion:

  • Stabilize recently-added information_schema test queries. These queries could show unexpected entries if the core system, or concurrently-running test scripts, created any functions that would appear in the information_schema views. Restrict them to showing functions belonging to this test's schema, as the far-older nearby test case does. Per experimentation with conversion of some built-in functions to SQL-function-body style.

  • Undo decision to allow pg_proc.prosrc to be NULL. Commit e717a9a18 changed the longstanding rule that prosrc is NOT NULL because when a SQL-language function is written in SQL-standard style, we don't currently have anything useful to put there. This seems a poor decision though, as it could easily have negative impacts on external PLs (opening them to crashes they didn't use to have, for instance). SQL-function-related code can just as easily test "is prosqlbody not null" as "is prosrc null", so there's no real gain there either. Hence, revert the NOT NULL marking removal and adjust related logic. For now, we just put an empty string into prosrc for SQL-standard functions. Maybe we'll have a better idea later, although the history of things like pg_attrdef.adsrc suggests that it's not easy to maintain a string equivalent of a node tree. This also adds an assertion that queryDesc->sourceText != NULL to standard_ExecutorStart. We'd been silently relying on that for awhile, so let's make it less silent. Also fix some overlooked documentation and test cases. Discussion:

  • Revert "Cope with NULL query string in ExecInitParallelPlan().". This reverts commit b3ee4c503872f3d0a5d6a7cbde48815f555af15b. We don't need it in the wake of the preceding commit, which added an upstream check that the querystring isn't null. Discussion:

  • Provide query source text when parsing a SQL-standard function body. Without this, we lose error cursor positions, as shown in the modified regression test result. Discussion:

  • Fix wrong units in two ExplainPropertyFloat calls. This is only a latent bug, since these calls are only reached for non-text output formats, and currently none of those will print the units. Still, we should get it right in case that ever changes. Justin Pryzby Discussion:

  • Fix bogus collation-version-recording logic. recordMultipleDependencies had the wrong scope for its "version" variable, allowing a version label to leak from the collation entry it was meant for to subsequent non-collation entries. This is relatively hard to trigger because of the OID-descending order that the inputs will normally arrive in: subsequent non-collation items will tend to be pinned. But it can be exhibited easily with a custom collation. Also, don't special-case the default collation, but instead ignore pinned-ness of a collation when we've found a version for it. This avoids creating useless pg_depend entries, and removes a not-very- future-proof assumption that C, POSIX, and DEFAULT are the only pinned collations. A small problem is that, because the default collation may or may not have a version, the regression tests can't assume anything about whether dependency entries will be made for it. This seems OK though since it's now handled just the same as other collations, and we have test cases for both versioned and unversioned collations. Fixes oversights in commit 257836a75. Thanks to Julien Rouhaud for review. Discussion:

  • Split function definitions out of system_views.sql into a new file. Invent system_functions.sql to carry the function definitions that were formerly in system_views.sql. The function definitions were already a quarter of the file and are about to be more, so it seems appropriate to give them their own home. In passing, fix an oversight in dfb75e478: it neglected to call check_input() for system_constraints.sql. Discussion:

  • Convert built-in SQL-language functions to SQL-standard-body style. Adopt the new pre-parsed representation for all built-in and information_schema SQL-language functions, except for a small number that can't presently be converted because they have polymorphic arguments. This eliminates residual hazards around search-path safety of these functions, and might provide some small performance benefits by reducing parsing costs. It seems useful also to provide more test coverage for the SQL-standard-body feature. Discussion:

  • Update dummy prosrc values. Ooops, forgot to s/system_views.sql/system_functions.sql/g in this part of 767982e36. No need for an additional catversion bump, I think, since these strings are gone by the time initdb finishes. Discussion:

  • Rethink extraction of collation dependencies. As it stands, find_expr_references_walker() pays attention to leaf-node collation fields while ignoring the input collations of actual function and operator nodes. That seems exactly backwards from a semantic standpoint, and it leads to reporting dependencies on collations that really have nothing to do with the expression's behavior. Hence, rewrite to look at function input collations instead. This isn't completely perfect either; it fails to account for the behavior of record_eq and its siblings. (The previous coding at least gave an approximation of that, though I think it could be fooled pretty easily into considering the columns of irrelevant composite types.) We may be able to improve on this later, but for now this should satisfy the buildfarm members that didn't like ef387bed8. In passing fix some oversights in GetTypeCollations(), and get rid of its duplicative de-duplications. (I'm worried that it's still potentially O(N^2) or worse, but this makes it a little better.) Discussion:

Michaël Paquier pushed:

  • Fix out-of-bound memory access for interval -> char conversion. Using Roman numbers (via "RM" or "rm") for a conversion to calculate a number of months has never considered the case of negative numbers, where a conversion could easily cause out-of-bound memory accesses. The conversions in themselves were not completely consistent either, as specifying 12 would result in NULL, but it should mean XII. This commit reworks the conversion calculation to have a more consistent behavior: - If the number of months and years is 0, return NULL. - If the number of months is positive, return the exact month number. - If the number of months is negative, do a backward calculation, with -1 meaning December, -2 November, etc. Reported-by: Theodor Arsenij Larionov-Trichkin Author: Julien Rouhaud Discussion: backpatch-through: 9.6

  • Move log_autovacuum_min_duration into its correct sections. This GUC has already been classified as LOGGING_WHAT, but its location in postgresql.conf.sample and the documentation did not reflect that, so fix those inconsistencies. Author: Justin Pryzby Discussion:

  • Remove duplicated --no-sync switches in new tests of test_pg_dump. These got introduced in 6568cef. Reported-by: Noah Misch Discussion:

  • Simplify tests of postgres_fdw terminating connections. The tests introduced in 32a9c0b for connections broken and re-established rely on pg_terminate_backend() for their logic. When these were introduced, this function simply sent a signal to a backend without waiting for the operation to complete, and the tests repeatedly looked at pg_stat_activity to check if the operation was completed or not. Since aaf0432, it is possible to define a timeout to make pg_terminate_backend() wait for a certain duration, so make use of it, with a timeout reasonably large enough (3min) to give enough room for the tests to pass even on slow machines. Some measurements show that the tests of postgres_fdw are much faster with this change. For example, on my laptop, they now take 4s instead of 6s. Author: Bharath Rupireddy Discussion:

  • doc: Move force_parallel_mode to section for developer options. This GUC has always been classified as a planner option since its introduction in 7c944bd, and was listed in postgresql.conf.sample. As this parameter exists for testing purposes, move it to the section dedicated to developer parameters and hence remove it from postgresql.conf.sample. This will avoid any temptation to play with it on production servers for users that should never really have to touch this parameter. The general description used for developer options is reworded a bit, to take into account the inclusion of force_parallel_mode, per a suggestion from Tom Lane. Per discussion between Tom Lane, Bruce Momjian, Justin Pryzby, Bharath Rupireddy and me. Author: Justin Pryzby, Tom Lane Discussion:

  • Tweak behavior of pg_dump --extension with configuration tables. 6568cef, that introduced the option, had an inconsistent behavior when it comes to configuration tables set up by pg_extension_config_dump, as the data of all configuration tables would included in a dump even for extensions not listed by a set of --extension switches. The contents dumped changed depending on the schema where an extension was installed when an extension was not listed. For example, an extension installed under the public schema would have its configuration data not dumped even when not listed with --extension, which was inconsistent with the case of an extension installed on a non-public schema, where the configuration would be dumped. Per discussion with Noah, we have settled down to the simple rule of dumping configuration data of an extension if it is listed in --extension (default is unchanged and backward-compatible, to dump everything on sight if there are no extensions directly listed). This avoids some weird cases where the dumps depended on a --schema for one. More tests are added to cover the gap, where we cross-check more behaviors depending on --schema when an extension is not listed. Reported-by: Noah Misch Reviewed-by: Noah Misch Discussion:

  • doc: Simplify example of HISTFILE for psql. e4c7619 has added a space to the example used for HISTFILE in the docs of psql before the variable DBNAME, as a workaround because variables were not parsed the same way back then. This behavior has changed in 9.2, causing the example in the psql docs to result in the same history file created with or without a space added before the DBNAME variable. Let's just remove this space in the example, to reduce any confusion, as the point of it is to prove that a per-database history file is easy to set up, and that's easier to read this way. Per discussion with Tom Lane. Reported-by: Ludovic Kuty Discussion:

  • doc: Fix typo in example query of SQL/JSON. Author: Erik Rijkers Discussion: Backpatch-through: 12

Amit Kapila pushed:

Fujii Masao pushed:

Peter Eisentraut pushed:

Thomas Munro pushed:

Noah Misch pushed:

Robert Haas pushed:

  • docs: Update TOAST storage docs for configurable compression. Mention that there are multiple TOAST compression methods and that the compression method used is stored in a TOAST pointer along with the other information that was stored there previously. Add a reference to the documentation for default_toast_compression, where the supported methods are listed, instead of duplicating that here. I haven't tried to preserve the text claiming that pglz is "fairly simple and very fast." I have no view on the veracity of the former claim, but LZ4 seems to be faster (and to compress better) so it seems better not to muddy the waters by talking about compression speed as a strong point of PGLZ. Patch by me, reviewed by Justin Pryzby. Discussion:

  • amcheck: Reword some messages and fix an alignment problem. We don't need to mention the attribute number in these messages, because there's a dedicated column for that, but we should mention the toast value ID, because that's really useful for any follow-up troubleshooting the user wants to do. This also rewords some of the messages to hopefully read a little better. Also, use VARATT_EXTERNAL_GET_POINTER in case we're accessing a TOAST pointer that isn't aligned on a platform that's fussy about alignment, so that we don't crash while corruption-checking the user's data. Mark Dilger, reviewed by me. Discussion:

Peter Geoghegan pushed:

Tomáš Vondra pushed:

Andrew Dunstan pushed:

  • Allow TestLib::slurp_file to skip contents, and use as needed. In order to avoid getting old logfile contents certain functions in PostgresNode were doing one of two things. On Windows it rotated the logfile and restarted the server, while elsewhere it truncated the log file. Both of these are unnecessary. We borrow from the buildfarm which does this instead: note the size of the logfile before we start, and then when fetching the logfile skip to that position before accumulating contents. This is spelled differently on Windows but the effect is the same. This is largely centralized in TestLib's slurp_file function, which has a new optional parameter, the offset to skip to before starting to reading the file. Code in the client becomes much neater. Backpatch to all live branches. Michael Paquier, slightly modified by me. Discussion:

Pending Patches

Andy Fan sent in a patch to document the enhancement for RelOptInfo.partexprs and nullable_partexprs, and split gen_prune_steps_from_exprs into some smaller chunks.

Luc Vlaming sent in another revision of a patch to add explicit partial UNION ALL path and improve parallel subquery rowcounts and costing.

Luc Vlaming sent in another revision of a patch to improve jitting performance by not emitting the LLVMPassManagerBuilderUseInlinerWithThreshold pass, and generate the IR code lazily on first call.

Amul Sul sent in another revision of a patch to implement ALTER SYSTEM ... READ {ONLY | WRITE}.

James Coleman sent in a patch to add HashedScalarArrayOp support for NOT IN.

Amit Langote sent in a patch to fix a thinko in PartitionDesc includes_detached.

Masahiko Sawada sent in a patch to fix a performance degradation of REFRESH MATERIALIZED VIEW.

Peter Smith sent in another revision of a patch to add support for prepared transactions to built-in logical replication, and add prepare API support for streaming transactions.

Vigneshwaran C and traded patches to identify missing publications from publisher in CREATE/ALTER SUBSCRIPTION.

Vigneshwaran C sent in a patch to fix a monitoring stats docs inconsistency.

Bruce Momjian sent in two revisions of a patch to fix some naming in the latest addition of query IDs to pg_stat_activity.

Mark Dilger sent in two more revisions of a patch to the pg_amcheck application to add TOAST pointer corruption checks.

Melanie Plageman sent in a patch to add a system view, pg_stat_buffers_written, which does what it says on the label.

Fujii Masao, Kyotaro HORIGUCHI, and Justin Pryzby traded patches to fix some infelicities in the new TRUNCATE on foreign table capability.

Justin Pryzby sent in two more revisions of a patch to polish up the documentation for 14.

Amit Langote sent in a patch to initialize WITH CHECK OPTIONS and RETURNING expressions on demand.

Ekaterina Sokolova sent in another revision of a patch to add extra statistics to explain for Nested Loop.

Haiying Tang sent in another revision of a patch to support tab completion with a query result for upper character inputs in psql.

Dave Page sent in a patch to fix a logging issue in sepgsql.

Andrei Zubkov sent in another revision of a patch to track statement entry timestamp in pg_stat_statements.

David Christensen sent in a patch to expand the units that pg_size_pretty(numeric) knows about to include units up to yottabytes.

Bharath Rupireddy sent in two more revisions of a patch to avoid accessing system catalogues inside slot_store_error_callback and conversion_error_callback.

Masahiko Sawada and Peter Geoghegan traded patches to make sure that the code comments for VACUUM accounting explain LP_DEAD accounting.

Ajin Cherian sent in two more revisions of a patch to skip empty transactions for logical replication.

Julien Rouhaud sent in a patch to handle some corner cases around logging with respect to query IDs. The issue here is that some queries can be logged when syntactically valid, missing checks for existence of the objects to which they refer in the catalog. The fix adds a %Q option to log_line_prefix which checks whether the object(s) are actually found.

Zeng Wenjing and Shawn Wang traded patches to implement global temporary tables.

Tom Lane sent in a patch to replace pg_depend PIN entries with a fixed range check.

Amul Sul sent in a patch to remove a edundant variable from transformCreateStmt.

Pavel Stěhule sent in two more revisions of a patch to implement schema variables.

Matthias van de Meent sent in two revisions of a patch to document the locking behavior of ATTACH PARTITION for default partitions.

Matthias van de Meent sent in a patch to implement page-level dynamic prefix truncation for _bt_binsrch*, then use same to implement and use index tuple attribute iteration.

James Coleman, Tomáš Vondra, and Tom Lane traded patches intended to fix a bug that manifested as "could not find pathkey item to sort" for TPC-DS queries 94-96.

Masahiro Ikeda sent in another revision of a patch to improve the performance of reporting WAL stats.

Tomáš Vondra sent in a patch to generate fractional cheapest paths in generate_orderedappend_path.

Takamichi Osumi and Li Japin traded patches to fix an infelicity between TRUNCATE and synchronous logical replication.

Michaël Paquier sent in another revision of a patch to fix a relcache leak in the logical replication worker.

Mark Dilger sent in two revisions of a patch to add an --install-missing option to pg_amcheck.

Sven Klemm sent in a patch to fix pg_event_trigger_ddl_commands, which could be looking at an invalid cache in the case of an object dropped in the same transaction.

Andy Fan sent in a patch to ensure that set_append_rel_size considers the initial partition prune.

Tom Lane sent in a patch to save some cycles in repeated subquery pullup.